Section 01
【Introduction】A Panoramic Survey of Evaluation Benchmarks for Multimodal Large Language Models: Systematic Review of Over 200 Benchmarks and Future Outlook
Title: A Panoramic Survey of Evaluation Benchmarks for Multimodal Large Language Models: Systematic Review of Over 200 Benchmarks and Future Outlook
Source: Tencent, in collaboration with teams from Peking University, National University of Singapore, Southeast University, and Nanjing University (original author/maintainer: swordlidev). Published on GitHub (link: https://github.com/swordlidev/Evaluation-Multimodal-LLMs-Survey). Release date: 2026-05-26.
Core Viewpoint: This paper systematically reviews over 200 evaluation benchmarks for Multimodal Large Language Models (MLLMs), covering five major dimensions: perceptual understanding, cognitive reasoning, domain-specific applications, key capabilities, and multimodal extensions. It provides a comprehensive research framework and directional guidance for the systematic evaluation of MLLMs.