Section 01
MuDABench: A New Large-Scale Document Analysis QA Benchmark Revealing RAG System Bottlenecks
MuDABench is a new analytical QA benchmark for large-scale, semi-structured document collections, comprising 80,000 pages of documents and 332 analytical QA instances. It aims to fill a gap in how existing multi-document QA benchmarks address cross-document reasoning requirements. Using this benchmark, the study exposes the bottlenecks of standard RAG systems and proposes optimization directions such as multi-agent workflows, offering guidance for the design of next-generation RAG systems.