Section 01
Introduction to the AVR Framework: Adaptive Path Learning to Alleviate Overthinking in Visual Reasoning
AVR (Adaptive Reasoning Path Learning Framework for Efficient Visual Reasoning) decomposes visual reasoning into three cognitive functions: perception, logical reasoning, and answer application. This decomposition lets the model choose, for each input, the simplest response format that still yields a correct answer, rather than always producing a full reasoning chain. The framework is reported to cut token usage by 50-90% while maintaining accuracy, which directly targets the overthinking problem in visual reasoning models.
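The idea of picking the simplest sufficient format can be illustrated with a minimal sketch. This is not AVR's implementation: in AVR the choice is learned by the model, whereas here it is driven by two explicit flags. The format names, the `select_format` helper, and the `token_reduction` helper are all hypothetical and exist only for illustration.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class ResponseFormat:
    name: str
    stages: tuple  # cognitive functions written out in the response

# Illustrative formats, ordered shortest to longest.
# These names are assumptions, not taken from the source.
FORMATS = (
    ResponseFormat("answer_only", ("answer",)),
    ResponseFormat("perception_then_answer", ("perception", "answer")),
    ResponseFormat("full", ("perception", "reasoning", "answer")),
)

def select_format(needs_perception: bool, needs_reasoning: bool) -> ResponseFormat:
    """Return the shortest format that covers every stage the input needs."""
    required = {"answer"}
    if needs_perception:
        required.add("perception")
    if needs_reasoning:
        required.add("reasoning")
    for fmt in FORMATS:  # scanned shortest first
        if required <= set(fmt.stages):
            return fmt
    return FORMATS[-1]  # fall back to the full format

def token_reduction(baseline_tokens: int, adaptive_tokens: int) -> float:
    """Fraction of tokens saved relative to a baseline response."""
    return 1.0 - adaptive_tokens / baseline_tokens

if __name__ == "__main__":
    print(select_format(needs_perception=True, needs_reasoning=False).name)
    print(token_reduction(baseline_tokens=1000, adaptive_tokens=250))
```

With these made-up numbers, shrinking a 1000-token response to 250 tokens is a 75% reduction, which falls inside the 50-90% range quoted above.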