Section 01
Metis & HDPO Framework: A Breakthrough in Multimodal Agent Tool Efficiency
A research team at The Chinese University of Hong Kong proposes the HDPO (Hierarchical Decoupled Policy Optimization) framework to address tool overuse in multimodal agents. Metis, the model trained with HDPO, maintains high accuracy while reducing tool calls by several orders of magnitude, opening a new path for improving the efficiency of multimodal agents. The work aims to teach agents to "think twice before acting" and to develop metacognitive abilities.