Section 01
FALCON: Solving Core Issues in High-Resolution Multimodal Large Models Using Visual Registers
FALCON is a joint work by HIT Shenzhen and Huawei Noah's Ark Lab, accepted to ICCV 2025. It addresses two core issues in high-resolution multimodal large language models, namely visual redundancy and fragmentation, through a Visual Register technique that balances elastic efficiency with robust perception. The complete code and pre-trained models have been open-sourced.