Section 01
[Introduction] CC-OCR V2 Reveals the Capability Gap of Large Multimodal Models in Real-World Document Processing
This article introduces the CC-OCR V2 benchmark, which targets real-world enterprise document processing scenarios. An evaluation of 14 advanced Large Multimodal Models (LMMs) finds that current models perform far worse in practical applications than their scores on existing benchmarks suggest, revealing a significant gap between academic research and industrial use.