Section 01
[Introduction] ELTLM-Bench: A New Benchmark for Evaluating Temporal Multimodal Large Models in Healthcare
ELTLM-Bench is the first comprehensive benchmark for evaluating the temporal perception and reasoning capabilities of multimodal large language models in longitudinal healthcare scenarios. It has been accepted to ACL 2026 Findings. Traditional static evaluations ignore the temporal dimension of healthcare data; ELTLM-Bench addresses this gap with high-quality temporal datasets and a hierarchical evaluation system. Its results reveal key limitations of current state-of-the-art (SOTA) models in temporal understanding, making it a useful evaluation tool for the development of healthcare AI.