Section 01
RBG: An LLM Inference Service Orchestration Framework for Kubernetes (Introduction)
RBG (RoleBasedGroup) is a Kubernetes API specifically designed for orchestrating distributed, stateful AI inference workloads. It supports multi-role collaboration and built-in service discovery, making it particularly suitable for production deployment of decoupled architectures such as Prefill/Decode separation. Through role-based organizational abstraction, it addresses the limitations of traditional Kubernetes primitives in multi-role topology management, hardware topology sensitivity, and lack of atomic operations, providing a unified orchestration view and efficient collaboration capabilities for LLM inference services.