Section 01
[Introduction] DeepThinkVLA: An Innovative Framework for Endowing VLA Models with Explicit Reasoning Capabilities
Developed by the OpenBMB team, DeepThinkVLA addresses the lack of explicit reasoning in existing Vision-Language-Action (VLA) models through a hybrid-attention decoder and an explicit Chain-of-Thought (CoT) mechanism, significantly improving decision quality and task success rates. The framework achieves an average success rate of 97% on the LIBERO benchmark, offering an interpretable and highly robust solution for embodied intelligence.
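To make the "hybrid attention" idea concrete, below is a minimal sketch of one common way such a decoder can be masked: CoT tokens are generated causally after the vision-language prefix, while action tokens attend to each other bidirectionally so they can be decoded jointly once the reasoning is complete. The function name, token layout, and this specific masking scheme are illustrative assumptions, not the published DeepThinkVLA implementation.

```python
# Hypothetical hybrid attention mask: causal over prefix + CoT tokens,
# bidirectional within the action-token block (assumed layout, for illustration).
import torch


def build_hybrid_attention_mask(n_prefix: int, n_cot: int, n_action: int) -> torch.Tensor:
    """Return a boolean mask of shape [T, T]; True means "may attend"."""
    total = n_prefix + n_cot + n_action
    # Standard causal (lower-triangular) mask for the whole sequence.
    mask = torch.tril(torch.ones(total, total, dtype=torch.bool))
    # Action tokens attend to one another bidirectionally, so the action
    # chunk can be decoded as a block conditioned on the finished CoT.
    a0 = n_prefix + n_cot
    mask[a0:, a0:] = True
    return mask


if __name__ == "__main__":
    # Tiny example: 4 prefix tokens, 3 CoT tokens, 2 action tokens.
    print(build_hybrid_attention_mask(4, 3, 2).int())
```

Under this assumed scheme, the reasoning trace stays autoregressive (and thus readable as an explanation), while the action block is decoded in parallel, which is one plausible route to the decision-quality and latency balance the introduction describes.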