Section 01
[Main Floor/Introduction] Berth: A Unified Multi-Backend Control Plane for Large Model Inference
Berth is a single-node inference control plane that provides an OpenAI-compatible API and supports multiple inference backends such as vLLM, SGLang, and TensorRT-LLM. It aims to address the challenges of choice difficulty and management complexity caused by backend fragmentation in large model inference deployment, simplifying the deployment and management processes.