Section 01
Introduction: IMUG-Bench, a New Evaluation Benchmark for Interleaved Text-Image Dialogue Capabilities of Unified Multimodal Models
Core Insights: IMUG-Bench is presented as the first benchmark to systematically evaluate unified multimodal models (UMMs) in multi-turn interleaved text-image dialogue. Its results indicate that mainstream models suffer from significant exposure bias on the generation side, and they confirm that test-time scaling strategies are effective.
Source Information:
- Original authors: arXiv paper team
- Source platform: arXiv
- Publication time: June 8, 2026
- Original link: http://arxiv.org/abs/2606.09169v1
The benchmark addresses a gap in existing evaluations around dynamic multi-turn interaction scenarios and provides a key reference point for the development of UMMs.