Section 01
DR-MMSearchAgent: A New Approach to Solving Premature Interaction Collapse in Multimodal Search Agents
DR-MMSearchAgent addresses premature interaction collapse in multimodal search agents with two mechanisms: trajectory-level advantage estimation based on structural proximity, and dynamic calibration of differential Gaussian rewards. Together, these incentivize agents to explore information fully. The method outperforms the MMSearch-R1 baseline by 8.4% on FVQA-test and improves the reasoning capabilities of multimodal search agents.