Section 01
Introduction: ClinDEF, a Dynamic Evaluation Framework for LLMs in Clinical Reasoning
ClinDEF is a dynamic evaluation framework designed to assess how well large language models (LLMs) perform clinical reasoning. Traditional benchmarks tend to overlook the complexity of clinical reasoning. ClinDEF addresses this gap in three ways: it simulates real clinical scenarios, evaluates models through an interactive process, and scores them with multi-dimensional metrics. The aim is a comprehensive test of a model's medical reasoning capabilities.