Section 01
【Introduction】ROSE Framework: A New Paradigm for Image Segmentation Enabling Real-Time Knowledge Retrieval in Multimodal Large Models
To address the issue that multimodal large language models (MLLMs) cannot recognize emerging entities in image segmentation tasks, researchers propose the ROSE (Retrieval-Oriented Segmentation Enhancement) framework, which injects real-time web knowledge using retrieval-augmented generation technology. This framework achieves a 19.2 gIoU performance improvement on the Novel Emerging Segmentation Task (NEST) benchmark, providing a new paradigm for solving the limitations of static knowledge bases and enabling dynamic knowledge acquisition.