Section 01
TENT: Declarative Data Flow Engine for Decoupled LLM Services (Introduction)
Modern GPU clusters combine heterogeneous interconnects, and traditional static path selection over them causes head-of-line blocking and wasted bandwidth. TENT decouples transfer intent from physical execution, pools the heterogeneous interconnects into a single dynamic resource, and, by combining fine-grained slice spraying with telemetry-driven scheduling, self-heals from faults within 50 ms. On H800 clusters, TENT achieves 1.36x higher throughput and 26% lower latency than existing solutions, making it a high-performance data transfer layer for decoupled LLM services.
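To make the idea of slice spraying with telemetry-driven scheduling concrete, the following is a minimal sketch and not TENT's actual implementation. It assumes a hypothetical `Link` record carrying a measured bandwidth, a queue depth, and a health flag, and a hypothetical `spray` function that splits one transfer into fixed-size slices and greedily places each slice on the healthy link expected to finish it soonest. All names, the greedy policy, and the bandwidth figures are illustrative assumptions.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class Link:
    """Hypothetical view of one interconnect as reported by telemetry."""
    name: str
    bandwidth_gbps: float   # currently measured bandwidth (assumed metric)
    queued_bytes: int = 0   # bytes already scheduled on this link
    healthy: bool = True    # False once telemetry flags a fault

    def eta_seconds(self, extra_bytes: int) -> float:
        """Estimated time to drain the queue plus one more slice."""
        bytes_per_second = self.bandwidth_gbps * 1e9 / 8
        return (self.queued_bytes + extra_bytes) / bytes_per_second


def spray(total_bytes: int, slice_bytes: int, links: list[Link]) -> dict[str, int]:
    """Split a transfer into slices and place each on the soonest-finishing healthy link."""
    assigned = {link.name: 0 for link in links}
    remaining = total_bytes
    while remaining > 0:
        size = min(slice_bytes, remaining)
        # Re-read health on every slice so a failed link stops receiving work.
        candidates = [link for link in links if link.healthy]
        if not candidates:
            raise RuntimeError("no healthy link available")
        best = min(candidates, key=lambda link: link.eta_seconds(size))
        best.queued_bytes += size
        assigned[best.name] += size
        remaining -= size
    return assigned


if __name__ == "__main__":
    # Illustrative numbers only; they are not taken from the paper.
    pool = [Link("fast", 400.0), Link("slow0", 200.0), Link("slow1", 200.0, healthy=False)]
    print(spray(64 * 1024 * 1024, 1024 * 1024, pool))
```

Because the placement decision is made per slice rather than per transfer, a link that slows down or fails only affects the slices not yet scheduled, which is the intuition behind avoiding head-of-line blocking and recovering quickly from faults.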