Section 01
Introduction: MissRAG, a Framework for Handling Missing Modalities in Multimodal Large Language Models
The MissRAG framework, accepted at ICCV 2025, is the first to apply retrieval-augmented generation (RAG) to the missing-modality problem in multimodal large language models (MLLMs). It covers three modalities (audio, visual, and text) and supports retrieval and generation for any combination of available and missing ones. By combining retrieval with prompt engineering, it makes existing models more robust without modifying their architecture or retraining them.
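The retrieve-then-prompt idea described above can be illustrated with a minimal sketch. This is not the MissRAG implementation: the toy pool, the embedding vectors, the cosine-similarity scoring, the `fill_missing` helper, and the wording of the prompt note are all assumptions made for illustration. It only shows the general pattern of using the available modalities to retrieve a stand-in for the missing ones and telling the model about the substitution.

```python
import math

MODALITIES = ("audio", "visual", "text")

# Hypothetical toy pool: each entry holds one embedding per modality.
POOL = [
    {"id": "a", "text": [1.0, 0.0], "audio": [0.9, 0.1], "visual": [0.8, 0.2]},
    {"id": "b", "text": [0.0, 1.0], "audio": [0.1, 0.9], "visual": [0.2, 0.8]},
]


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(u, v))
    norm = math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(y * y for y in v))
    return dot / norm if norm else 0.0


def fill_missing(sample, pool=POOL):
    """Fill missing modalities from the most similar pool entry.

    Returns the completed sample and a prompt note describing the
    substitution (empty if nothing was missing).
    """
    available = [m for m in MODALITIES if sample.get(m) is not None]
    missing = [m for m in MODALITIES if sample.get(m) is None]
    if not available:
        raise ValueError("at least one modality must be available")
    if not missing:
        return dict(sample), ""

    # Score each pool entry by its mean similarity over the available modalities.
    def score(entry):
        return sum(cosine(sample[m], entry[m]) for m in available) / len(available)

    best = max(pool, key=score)
    filled = dict(sample)
    for m in missing:
        filled[m] = best[m]

    # Assumed prompt wording; the real prompts may differ.
    note = (
        "Note: the " + ", ".join(missing)
        + " input was missing and has been replaced by a retrieved surrogate."
    )
    return filled, note


if __name__ == "__main__":
    query = {"text": [0.9, 0.1], "audio": None, "visual": None}
    filled, note = fill_missing(query)
    print(filled)
    print(note)
```

Because the base model only sees a completed input plus an extra prompt sentence, a scheme of this shape leaves the model's weights and architecture untouched, which matches the no-retraining property claimed above.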