Section 01
[Introduction] Med_Benchmarks_LLMs: An Automated Benchmark Framework for Medical LLM Evaluation
Med_Benchmarks_LLMs is an automated benchmarking framework for evaluating medical large language models (LLMs), designed to address the fragmentation of medical AI evaluation. It systematically collects medical benchmark data, covering both text and multimodal categories, from Hugging Face and GitHub, and processes it into a structured form. This provides a reliable basis for model selection in clinical scenarios and lowers the barrier for researchers to access and use benchmark resources.
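The collect-then-structure step described above can be sketched in miniature. The record fields (`name`, `source`, `modality`) and the example benchmark/source pairings below are illustrative assumptions, not the framework's actual schema:

```python
from dataclasses import dataclass
from collections import defaultdict

@dataclass(frozen=True)
class BenchmarkEntry:
    """One collected medical benchmark (field names are illustrative)."""
    name: str
    source: str    # e.g. "huggingface" or "github"
    modality: str  # "text" or "multimodal"

def group_by_modality(entries):
    """Bucket collected benchmarks by modality for downstream evaluation."""
    groups = defaultdict(list)
    for entry in entries:
        groups[entry.modality].append(entry.name)
    return dict(groups)

# Hypothetical collected entries; real sources may differ.
entries = [
    BenchmarkEntry("MedQA", "huggingface", "text"),
    BenchmarkEntry("VQA-RAD", "huggingface", "multimodal"),
    BenchmarkEntry("PubMedQA", "github", "text"),
]
print(group_by_modality(entries))
# → {'text': ['MedQA', 'PubMedQA'], 'multimodal': ['VQA-RAD']}
```

Structuring collected benchmarks into uniform records like this is what makes a single evaluation pipeline able to span both text and multimodal categories.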