Section 01
VideoRouter Core Guide: Dual-Route Framework Solves Long Video Token Crisis with 67.9% Token Reduction
Long video understanding hits a scalability bottleneck because the number of visual tokens grows rapidly with video length. VideoRouter addresses this with a dual-route mechanism (semantic routing and image routing) that allocates the visual token budget adaptively, conditioned on the query. It preserves high-resolution detail in key evidence frames and aggressively compresses irrelevant ones, achieving up to 67.9% token reduction on benchmarks such as VideoMME while maintaining or even improving understanding accuracy.
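To make the idea of query-dependent budget allocation concrete, here is a minimal Python sketch. It is not VideoRouter's actual routing algorithm: the per-frame relevance scores, the hard threshold rule, and the token counts (196 for a full-resolution frame, 16 for a compressed one) are all illustrative assumptions, and the function names are hypothetical. The toy numbers are not meant to reproduce the reported 67.9% figure.

```python
from typing import List


def allocate_token_budgets(
    relevance: List[float],
    full_tokens: int = 196,       # assumed token count for a high-resolution frame
    compressed_tokens: int = 16,  # assumed token count for a heavily compressed frame
    threshold: float = 0.5,       # assumed cutoff; a real router would learn this decision
) -> List[int]:
    """Give query-relevant frames the full budget and compress all other frames."""
    return [
        full_tokens if score >= threshold else compressed_tokens
        for score in relevance
    ]


def token_reduction(budgets: List[int], full_tokens: int = 196) -> float:
    """Fraction of tokens saved relative to encoding every frame at full resolution."""
    baseline = full_tokens * len(budgets)
    return 1.0 - sum(budgets) / baseline


# Hypothetical query-to-frame relevance scores for an 8-frame clip.
relevance_scores = [0.9, 0.1, 0.2, 0.8, 0.05, 0.3, 0.1, 0.7]
budgets = allocate_token_budgets(relevance_scores)
reduction = token_reduction(budgets)
print(budgets)
print(f"token reduction: {reduction:.1%}")
```

In this toy setup, only the three frames scored above the threshold keep their full token budget, so the savings depend directly on how few frames the query actually needs. A hard threshold is the simplest possible policy; it stands in here for whatever learned routing decision the framework uses.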