Section 01
[Introduction] In-Depth Analysis of Performance Degradation in Image Classification by Medical Multimodal Large Language Models
This article systematically analyzes 14 open-source medical multimodal large language models (MLLMs) using feature probing and identifies four major failure modes behind their performance degradation on medical image classification tasks, a cautionary finding for the clinical deployment of medical AI. Despite the high expectations placed on medical MLLMs, the study found that their image classification performance lags behind that of traditional models. The degradation arises at multiple levels, including visual representation, cross-modal connection, language reasoning, and semantic mapping.