Section 01
The "Pseudo-Unification" Dilemma of Unified Multimodal Models: Entropy Probing Reveals the Split in Information Flow Between Vision and Language (Main Thread Introduction)
This paper examines the "pseudo-unification" phenomenon in Unified Multimodal Models (UMMs). By analyzing ten representative models with an information-theoretic probing framework, it traces the phenomenon to two root causes: modality-asymmetric encoding and mode-split responses. It argues that true multimodal synergy requires consistent information flow across modalities, not merely shared parameters.