Section 01
[Introduction] New Protocol for Evaluating Multimodal Inverse Problems: Addressing the Misleading Nature of Pointwise Metrics
This article examines how traditional pointwise metrics such as mean squared error (MSE) can mislead the evaluation of multimodal inverse problems, and proposes a more reliable evaluation protocol. Using neutrino reconstruction in di-lepton top quark events as a benchmark task, the study compares a point-estimation model (a regression transformer) with generative models (discrete and continuous normalizing flows). Key findings: pointwise metrics tend to favor point-estimation models, even though generative models better capture the true multimodal distribution of solutions. This has direct consequences for machine learning model selection in particle physics.
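The bias of pointwise metrics can be illustrated with a toy problem that is not taken from the article: a target that is -1 or +1 with equal probability. The sketch below assumes an idealised point estimator that predicts the conditional mean (0) and an idealised generative model that samples from the true distribution. The 1D Wasserstein distance is used only as an example of a distribution-level metric and is an assumption here, not necessarily the protocol the article proposes.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000

# Toy bimodal target (hypothetical, for illustration): -1 or +1, equally likely.
truth = rng.choice([-1.0, 1.0], size=n)

# Idealised point estimator: the MSE-optimal prediction is the mean, 0,
# which is never an actual solution.
point_pred = np.zeros(n)

# Idealised generative model: an independent draw from the true distribution.
gen_pred = rng.choice([-1.0, 1.0], size=n)


def mse(pred, target):
    """Pointwise mean squared error between predictions and targets."""
    return float(np.mean((pred - target) ** 2))


def wasserstein_1d(a, b):
    """W1 distance between two equal-size 1D samples via sorted matching."""
    return float(np.mean(np.abs(np.sort(a) - np.sort(b))))


mse_point = mse(point_pred, truth)            # exactly 1
mse_gen = mse(gen_pred, truth)                # about 2 in expectation
w1_point = wasserstein_1d(point_pred, truth)  # exactly 1
w1_gen = wasserstein_1d(gen_pred, truth)      # close to 0

print(f"MSE: point={mse_point:.3f}  generative={mse_gen:.3f}")
print(f"W1:  point={w1_point:.3f}  generative={w1_gen:.3f}")
```

In this toy setting MSE ranks the point estimator first even though it never predicts a valid solution, while the distribution-level comparison ranks the sampler first.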