Zing Forum

AMR Adaptive Multi-Expert Reasoning: Difficulty-Aware Routing Solves the Robustness Problem in Mathematical Reasoning

The AMR framework achieves dynamic strategy adaptation through a difficulty-aware routing system and an uncertainty-guided aggregation mechanism. On the GSM8K dataset, it reaches an accuracy of 75.28% using only original training data, outperforming most 7B models trained with synthetic data.

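Tags: Mathematical reasoning, Multi-expert system, Difficulty awareness, Uncertainty quantification, Adaptive sampling, GSM8K, Reasoning aggregation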
Published 2026-04-12 03:44 | Recent activity 2026-04-14 09:49 | Estimated read 5 min

Section 01

AMR Adaptive Multi-Expert Reasoning Framework: A New Solution to the Robustness Problem in Mathematical Reasoning

The AMR (Adaptive Multi-Expert Reasoning) framework targets the robustness problem in mathematical reasoning. It adapts its reasoning strategy to each problem through a difficulty-aware routing system and an uncertainty-guided aggregation mechanism. On the GSM8K dataset, it reaches an accuracy of 75.28% using only the original training data, outperforming most 7B models trained with synthetic data.

Section 02

Dilemmas in Mathematical Reasoning: Limitations of One-Size-Fits-All Strategies and Synthetic Data

Current mainstream mathematical reasoning methods adopt a "one-size-fits-all" strategy: they use the same reasoning depth and sampling strategy regardless of problem difficulty. This leads to over-reasoning errors on simple problems and overly shallow reasoning on difficult ones. These methods also rely on synthetic training data, whose quality distribution deviates from real scenarios, which results in poor generalization.

Section 03

Core Architecture of AMR: A Three-Layer Collaborative Dynamic Reasoning System

The core architecture of AMR has three components: (1) an agile routing system that perceives both problem difficulty and model uncertainty; (2) a reconfigurable sampling mechanism that dynamically adjusts reasoning breadth and depth; and (3) three collaborating experts that divide the work among basic, exploration, and verification roles.
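
The routing idea can be illustrated with a minimal Python sketch. This is not AMR's actual implementation: the names `SamplingPlan` and `route`, the thresholds 0.3 and 0.7, and the sample and step budgets are all hypothetical, and both input scores are assumed to be normalized to the range 0 to 1.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SamplingPlan:
    """Hypothetical routing output: which experts run, and how widely/deeply."""
    experts: tuple[str, ...]
    num_samples: int  # reasoning breadth
    max_steps: int    # reasoning depth


def route(difficulty: float, uncertainty: float) -> SamplingPlan:
    """Choose a plan from two scores in [0, 1] (thresholds are assumptions)."""
    if not (0.0 <= difficulty <= 1.0 and 0.0 <= uncertainty <= 1.0):
        raise ValueError("difficulty and uncertainty must lie in [0, 1]")

    # Dual perception: escalate if EITHER signal is high.
    score = max(difficulty, uncertainty)

    if score < 0.3:
        # Easy and confident: one short pass by the basic expert.
        return SamplingPlan(("basic",), num_samples=1, max_steps=4)
    if score < 0.7:
        # Middle band: a few samples plus a verification pass.
        return SamplingPlan(("basic", "verification"), num_samples=4, max_steps=8)
    # Hard or uncertain: all three experts, widest and deepest budget.
    return SamplingPlan(
        ("basic", "exploration", "verification"), num_samples=8, max_steps=12
    )
```

Taking the maximum of the two scores is one simple design choice: a problem that looks easy but on which the model is unsure still receives a larger budget.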

Section 04

Intelligent Aggregation Mechanism: The Art of Balancing Consensus and Quality

After the experts generate candidate answers, AMR applies cluster-driven aggregation: it first estimates the confidence of each candidate answer, then clusters similar answers, and finally selects the result using a weighted balance of quality and consensus. This avoids the groupthink trap, in which a popular answer wins on count alone.
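
A minimal Python sketch of the three aggregation steps follows. It is an illustration under stated assumptions, not AMR's actual method: answers are clustered by exact match after a simple numeric normalization, cluster quality is the mean confidence, consensus is the cluster's share of all candidates, and the names `normalize`, `aggregate`, and `quality_weight` are hypothetical.

```python
from collections import defaultdict


def normalize(answer: str):
    """Map equivalent numeric strings ("18", "18.0", "1,000") to one key."""
    text = answer.replace(",", "").strip()
    try:
        return float(text)
    except ValueError:
        return text.lower()  # non-numeric answers fall back to text match


def aggregate(candidates, quality_weight: float = 0.6):
    """Pick an answer from (answer, confidence) pairs.

    Assumed score per cluster:
        quality_weight * mean confidence
        + (1 - quality_weight) * share of candidates in the cluster
    """
    if not candidates:
        raise ValueError("need at least one candidate")

    # Step 1 is assumed done upstream: each candidate carries a confidence.
    # Step 2: cluster candidates whose normalized answers match.
    clusters = defaultdict(list)
    for answer, confidence in candidates:
        clusters[normalize(answer)].append(confidence)

    total = len(candidates)

    def score(item):
        _, confidences = item
        quality = sum(confidences) / len(confidences)
        consensus = len(confidences) / total
        return quality_weight * quality + (1 - quality_weight) * consensus

    # Step 3: select the cluster with the best weighted score.
    best_answer, _ = max(clusters.items(), key=score)
    return best_answer
```

For the candidates `[("18", 0.3), ("18.0", 0.35), ("20", 0.95)]`, the default weighting selects `20.0` even though `18` has two votes, while `quality_weight=0.0` reduces to plain majority voting and selects `18.0`.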

Section 05

Experimental Verification: Excellent Performance with Original Data and Component Value

AMR achieves an accuracy of 75.28% on the GSM8K dataset using only the original training data, outperforming most 7B models that rely on synthetic data. Ablation experiments show that removing the difficulty-aware routing, or replacing the aggregation mechanism with simple voting, degrades performance, which verifies the value of each component.

Section 06

Win-Win of Efficiency and Effectiveness: Intelligent Resource Allocation

AMR uses difficulty perception to concentrate computing resources on difficult problems. Simple problems take on average only 1/3 as many reasoning steps as difficult ones, which significantly reduces overall reasoning cost while balancing accuracy and efficiency. This makes AMR suitable for real-time application scenarios.
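
To see how the 1/3 step ratio translates into overall savings, here is a small back-of-the-envelope calculation in Python. It assumes that cost is proportional to the number of reasoning steps and that the baseline spends the difficult-problem depth on every problem. The function name `relative_cost` and the 60% share of simple problems in the usage note are hypothetical and do not come from the reported experiments.

```python
def relative_cost(simple_fraction: float, step_ratio: float = 1 / 3) -> float:
    """Expected cost relative to a uniform-depth baseline.

    simple_fraction: share of problems routed as simple, in [0, 1].
    step_ratio: steps on a simple problem divided by steps on a difficult one.
    """
    if not 0.0 <= simple_fraction <= 1.0:
        raise ValueError("simple_fraction must lie in [0, 1]")
    # Simple problems cost step_ratio units; difficult ones cost 1 unit.
    return simple_fraction * step_ratio + (1.0 - simple_fraction)
```

Under these assumptions, a workload in which 60% of problems are simple costs about 0.6 of the baseline, that is, roughly 40% fewer reasoning steps overall.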

Section 07

Implications and Prospects: From Mathematical Reasoning to Broader Complex Tasks

The methodology of AMR (difficulty perception, multi-expert collaboration, and uncertainty-guided aggregation) can be transferred to other complex reasoning tasks, such as legal case analysis, medical diagnosis assistance, and scientific research, where it offers a reference paradigm.

Section 08

Conclusion: The Shift in Reasoning Research—From Scale to Strategy

AMR reflects a shift in mathematical reasoning research toward refined strategies and intelligent resource allocation. As the marginal benefit of additional computing power decreases, "calculating smartly" becomes the key to breaking through bottlenecks, and AMR offers reasoning system developers a practical technical model.