Section 01
VisionWeaver: Addressing Hallucination in Multimodal Large Models from the Visual Encoder Perspective (Opening Post: Introduction)
VisionWeaver, a paper accepted to the Findings of EMNLP 2025, proposes to alleviate object hallucination in large vision-language models by dynamically aggregating features from multiple specialized visual encoders. It is released together with VHBench-10, a fine-grained evaluation benchmark. The core idea is to address hallucination at its source, visual feature extraction, by combining a multi-expert architecture with a dynamic routing mechanism.
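To make "dynamic aggregation over multiple visual encoders" concrete, here is a minimal sketch of one common way to do it: a softmax-gated weighted sum of per-expert features. This is an illustration under assumptions, not the paper's implementation. The function names `softmax` and `route_and_aggregate`, the use of plain Python lists as feature vectors, and the idea that a router emits one unnormalized score per encoder are all hypothetical; the post does not describe how VisionWeaver actually computes its routing signal.

```python
import math
from typing import List, Sequence


def softmax(scores: Sequence[float]) -> List[float]:
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def route_and_aggregate(
    expert_features: Sequence[Sequence[float]],
    routing_scores: Sequence[float],
) -> List[float]:
    """Fuse per-expert feature vectors with softmax routing weights.

    expert_features: one feature vector per visual encoder, all the same length.
    routing_scores: one unnormalized score per encoder (hypothetical router output).
    """
    if len(expert_features) != len(routing_scores):
        raise ValueError("need exactly one routing score per expert")
    dim = len(expert_features[0])
    if any(len(feat) != dim for feat in expert_features):
        raise ValueError("all expert feature vectors must share one dimension")

    # Turn scores into weights that are positive and sum to 1.
    weights = softmax(routing_scores)

    # Weighted sum across experts, one output value per feature dimension.
    fused = [0.0] * dim
    for weight, feat in zip(weights, expert_features):
        for i, value in enumerate(feat):
            fused[i] += weight * value
    return fused


if __name__ == "__main__":
    # Two toy "experts" with 2-dimensional features; equal scores give equal weights.
    print(route_and_aggregate([[1.0, 0.0], [0.0, 1.0]], [0.0, 0.0]))
```

With equal routing scores the fused vector is simply the average of the expert features; as one score grows, the output moves toward that expert's features. In a real model the scores would depend on the input image, which is what makes the routing dynamic.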