Section 01
[Original Post/Introduction] StreamDyCoke: A Key Breakthrough Enabling Real-Time Streaming Inference for Video Large Language Models
StreamDyCoke is a streaming extension of DyCoke (CVPR 2025). Its core techniques, including causal sliding-window temporal token merging and a bounded dynamic pruning cache, remove the requirement that existing Video LLMs process an entire video offline, so inference can run in real time on a live stream. This makes it suitable for real-time applications such as AR glasses, robot perception, and assistive vision, and opens a path toward practical deployment of video large language models.
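
To make the idea of causal sliding-window temporal token merging more concrete, here is a minimal sketch. It is not the StreamDyCoke implementation: the class name `CausalWindowMerger`, the `window` and `threshold` parameters, and the per-position cosine-similarity rule are all assumptions chosen for illustration. It only shows the general pattern of comparing each new frame against a bounded buffer of past frames, never future ones, and dropping tokens that repeat what was already seen.

```python
from collections import deque
from math import sqrt


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    na = sqrt(sum(x * x for x in a))
    nb = sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


class CausalWindowMerger:
    """Hypothetical helper (not from the paper): drops tokens that repeat
    the same spatial position in a recent past frame."""

    def __init__(self, window=4, threshold=0.9):
        self.threshold = threshold
        # Bounded history: once full, the oldest frame is evicted automatically,
        # so memory stays constant however long the stream runs.
        self.history = deque(maxlen=window)

    def push(self, frame):
        """frame: list of token vectors, one per spatial position.
        Assumes every frame has the same number of positions.
        Returns the (position, token) pairs kept for this frame."""
        kept = []
        for pos, tok in enumerate(frame):
            # Causal: only frames already in the window are consulted.
            redundant = any(
                cosine(tok, past[pos]) >= self.threshold for past in self.history
            )
            if not redundant:
                kept.append((pos, tok))
        self.history.append(frame)
        return kept
```

Calling `push` once per incoming frame returns only the tokens that differ from the recent past, which is the property that lets the token count stay bounded on a live stream. A real system would likely operate on batched tensors and merge rather than simply drop tokens; the pure-Python version above trades speed for readability.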