Section 01
AOT: An Introduction to an Efficient Token Compression Scheme for Video Large Language Models
AOT is a CVPR 2026 paper from Adobe Research. Its core idea is to jointly optimize local and global visual context so that a video large language model processes substantially fewer visual tokens while preserving its understanding capability, which in turn improves inference efficiency. This article examines AOT in terms of its background, method, implementation, experiments, and applications.