Section 01
[Introduction] ClinHallu: A Phased Benchmark for Hallucination Diagnosis in Medical Multimodal Large Models
ClinHallu is a phased (stage-by-stage) hallucination diagnosis benchmark for medical multimodal large language models (MLLMs). Using 7,031 validation instances and structured tracing of the model's reasoning, it attributes each hallucination to the stage at which it occurs: visual recognition, knowledge recall, or reasoning integration. This gives a fine-grained tool for evaluating the trustworthiness and safety of medical AI systems. The benchmark has been open-sourced.
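To make the idea of stage-level attribution concrete, here is a minimal sketch of how per-stage judgments might be turned into a failure profile. It is not ClinHallu's actual code or data format: the `StageTrace` record, the function names, the stage ordering, and the "first failed stage" rule are all assumptions made for illustration, and the toy records are invented.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Dict, Iterable, Optional

# Stage names come from the description above; treating them as an
# ordered pipeline is an assumption of this sketch.
STAGES = ("visual_recognition", "knowledge_recall", "reasoning_integration")


@dataclass
class StageTrace:
    """Hypothetical record: was each stage's output judged correct?"""
    instance_id: str
    stage_correct: Dict[str, bool]


def first_failed_stage(trace: StageTrace) -> Optional[str]:
    """Return the earliest stage judged incorrect, or None if all passed.

    A stage with no judgment recorded is treated as failed.
    """
    for stage in STAGES:
        if not trace.stage_correct.get(stage, False):
            return stage
    return None


def stage_failure_counts(traces: Iterable[StageTrace]) -> Counter:
    """Count instances by the stage where the first error appears."""
    counts: Counter = Counter()
    for trace in traces:
        stage = first_failed_stage(trace)
        counts[stage if stage is not None else "no_hallucination"] += 1
    return counts


# Invented toy records, for illustration only.
traces = [
    StageTrace("toy-1", {s: True for s in STAGES}),
    StageTrace("toy-2", {"visual_recognition": False,
                         "knowledge_recall": True,
                         "reasoning_integration": True}),
    StageTrace("toy-3", {"visual_recognition": True,
                         "knowledge_recall": True,
                         "reasoning_integration": False}),
]
counts = stage_failure_counts(traces)
```

Attributing an error to the earliest failing stage is only one possible convention; it reflects the intuition that a misread image can contaminate everything downstream, but a benchmark could equally report every failed stage per instance.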