Section 01
【Introduction】DailyClue: A New Benchmark for Visual Reasoning of Multimodal Large Models in Daily Scenarios
The Chinese University of Hong Kong, Shanghai AI Lab, and other institutions have jointly proposed DailyClue, a benchmark designed to evaluate the visual clue-driven reasoning of multimodal large language models (MLLMs) in daily scenarios. The benchmark covers four major daily domains and 16 sub-tasks. Rather than testing simple object recognition, it requires models to actively identify the key visual clues in an image and reason from them, addressing a gap in existing evaluations, which place too little emphasis on reasoning capabilities.