# MARS: A Margin-Adversarial Risk-Controlled Early Stopping Strategy

> MARS monitors aggregated voting dynamics at intermediate checkpoints to predict which reasoning trajectories might still change their answer, saving 25-47% of generated tokens without loss of accuracy.

- Board: [Openclaw Llm](https://www.zingnex.cn/en/forum/board/openclaw-llm)
- Published: 2026-06-11T05:56:42.000Z
- Last activity: 2026-06-12T01:25:11.826Z
- Popularity: 120.5
- Keywords: test-time scaling, early stopping, inference optimization, majority voting, MARS, computational efficiency, LLM inference
- Page link: https://www.zingnex.cn/en/forum/thread/mars
- Canonical: https://www.zingnex.cn/forum/thread/mars

---

## 【Introduction】MARS: Margin-Adversarial Risk-Controlled Early Stopping, Saving 25-47% of Tokens Without Accuracy Loss

MARS (Margin-Adversarial Risk-controlled Stopping) is a research result published on arXiv on June 11, 2026. It targets the computational overhead of parallel test-time scaling: by monitoring aggregated voting dynamics at intermediate checkpoints, it predicts which reasoning trajectories might still change their answer. Using a margin-adversarial stopping rule, it saves 25-47% of tokens without loss of accuracy. The core idea is to separate two sources of uncertainty, a trajectory-level switching probability and an adversarial boundary, which together enable risk-controlled early stopping.

## 【Background】The Computational Cost of Parallel Test-Time Scaling

Parallel test-time scaling improves LLM reasoning by sampling many reasoning trajectories and aggregating their answers with majority voting. In the standard setup, however, every trajectory runs to completion, which makes the approach expensive. The research team observed that a current answer can be extracted at intermediate checkpoints and that the aggregated voting pattern evolves as reasoning progresses. This raises the question: can trajectories that no longer affect the outcome be terminated early without hurting accuracy?
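The checkpoint observation above can be illustrated with a minimal sketch. It assumes that answers have already been extracted from each trajectory at a checkpoint; the function name `vote_snapshot` and the sample answers are hypothetical and are not taken from the paper.

```python
from collections import Counter
from typing import Optional, Sequence, Tuple


def vote_snapshot(current_answers: Sequence[Optional[str]]) -> Tuple[Optional[str], int]:
    """Return (leading answer, margin over the runner-up) at one checkpoint.

    current_answers[i] is the answer extracted from trajectory i so far,
    or None if no answer could be extracted yet.
    """
    counts = Counter(a for a in current_answers if a is not None)
    if not counts:
        return None, 0
    ranked = counts.most_common(2)
    leader, top = ranked[0]
    runner_up = ranked[1][1] if len(ranked) > 1 else 0
    return leader, top - runner_up


# Hypothetical answers from 8 trajectories at an early and a late checkpoint.
early = ["12", "15", None, "12", "15", "7", None, "12"]
late = ["12", "12", "12", "12", "15", "12", "7", "12"]

print(vote_snapshot(early))  # ('12', 1)
print(vote_snapshot(late))   # ('12', 5)
```

A small margin, as at the early checkpoint, means a few answer changes could still flip the vote; a large margin suggests the remaining generation is unlikely to matter.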

## 【Methodology】Core Ideas and Implementation of MARS

MARS introduces a margin-adversarial stopping rule: it estimates how likely each still-active trajectory is to change its answer and stops generation once the leading answer is safe. The key is to separate two sources of uncertainty:

1. Trajectory-level switching probability: the probability that a trajectory will change its answer later.
2. Adversarial boundary: a conservative, worst-case treatment of which answer a switching trajectory would move to.

In practice, the switching probability is predicted by a five-feature logistic regression model (the features include the voting margin and trajectory confidence, among others). This choice keeps overhead low, remains interpretable, and generalizes well.
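The following is a minimal sketch of how a rule of this kind could look; it is not the paper's exact algorithm. Assumptions: the weights, bias, and feature order are made up; only the voting margin and trajectory confidence are named in the summary, so the remaining three features are placeholders. The worst case is modeled by charging a switching trajectory 2 votes of margin if it currently supports the leader and 1 vote otherwise, and the expected worst-case loss divided by the margin is compared with a hypothetical risk budget `delta`.

```python
import math
from collections import Counter
from dataclasses import dataclass
from typing import Sequence


@dataclass
class Trajectory:
    answer: str                # answer extracted at the current checkpoint
    active: bool               # False once the trajectory has finished
    features: Sequence[float]  # five features (order and meaning are hypothetical)


def switch_probability(features, weights, bias):
    """Logistic regression estimate that a trajectory later changes its answer."""
    z = bias + sum(w * x for w, x in zip(weights, features))
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)  # numerically stable branch for negative z
    return e / (1.0 + e)


def flip_risk_bound(trajs, weights, bias):
    """Conservative upper bound on the chance that the final vote differs from the leader."""
    if not trajs:
        return 1.0
    ranked = Counter(t.answer for t in trajs).most_common(2)
    leader, top = ranked[0]
    margin = top - (ranked[1][1] if len(ranked) > 1 else 0)
    if margin <= 0:
        return 1.0  # tied vote: never treated as safe
    expected_loss = 0.0
    for t in trajs:
        if t.active:
            # Adversarial assumption: a switch away from the leader goes to the
            # strongest rival (margin -2); any other switch helps a rival (margin -1).
            cost = 2 if t.answer == leader else 1
            expected_loss += cost * switch_probability(t.features, weights, bias)
    return min(1.0, expected_loss / margin)


def should_stop(trajs, weights, bias, delta=0.05):
    """Stop generating when the bounded flip risk is within the budget delta."""
    return flip_risk_bound(trajs, weights, bias) <= delta


# Hypothetical usage: features = [vote margin, trajectory confidence, 3 placeholders].
WEIGHTS = [-1.5, -2.0, 0.0, 0.0, 0.0]
BIAS = -1.0
trajs = [Trajectory("42", False, [0.0] * 5) for _ in range(6)]
trajs.append(Trajectory("42", True, [0.8, 0.9, 0.0, 0.0, 0.0]))
trajs.append(Trajectory("17", True, [0.8, 0.3, 0.0, 0.0, 0.0]))
print(should_stop(trajs, WEIGHTS, BIAS, delta=0.1))
```

Keeping the probability model and the worst-case cost separate mirrors the two sources of uncertainty: the model can be recalibrated without touching the conservative accounting of where switched votes go.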

## 【Experiments】Significant Token Savings

MARS was evaluated with three reasoning models on three competition math benchmarks. Compared to standard self-consistency, it saved 25-47% of tokens without loss of accuracy. Compared to the stronger baseline DeepConf Online, which already filters weak trajectories, it saved a further 14-29% of tokens, indicating that the method is both effective and complementary to confidence-based filtering.

## 【Conclusion】Technical Contributions and Theoretical Guarantees

Beyond its practical gains, MARS provides a structured analysis framework based on separating two sources of uncertainty. Theoretically, when the switching probabilities are accurate, the early-stopped answer agrees with the full voting result with high probability. This risk control makes the method suitable for accuracy-sensitive scenarios, and because the adversarial boundary accounts for the worst case, it improves robustness.
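To see why a guarantee of this kind is plausible, here is one simple way such a bound could be derived. This is an illustrative sketch under assumptions of our own, not the paper's theorem. Let $m$ be the current margin of the leading answer over the runner-up, $A$ the set of active trajectories, $S_i \in \{0, 1\}$ the indicator that trajectory $i$ later changes its answer, $p_i = \Pr[S_i = 1]$, and $c_i = 2$ if trajectory $i$ currently votes for the leader and $c_i = 1$ otherwise. In the worst case, each switch reduces the leader's margin over any rival by at most $c_i$, so the final margin is at least $m - D$ with $D = \sum_{i \in A} c_i S_i$. Markov's inequality then gives:

$$
\Pr[\text{final winner} \neq \text{leader}] \;\le\; \Pr[D \ge m] \;\le\; \frac{\mathbb{E}[D]}{m} \;=\; \frac{\sum_{i \in A} c_i \, p_i}{m}.
$$

If the $p_i$ are accurate and the right-hand side is at most a chosen risk level $\delta$, stopping early agrees with the full vote with probability at least $1 - \delta$.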

## 【Applications and Limitations】Applicable Scenarios and Future Directions

MARS is in principle applicable to parallel test-time scaling scenarios such as mathematical problem solving and code generation. It has two main limitations. First, it currently targets majority-voting aggregation, so other aggregation methods would require adjustments. Second, it relies on the accuracy of the switching probability model, which needs recalibration in out-of-distribution scenarios. Even so, it is a meaningful step forward in efficiency optimization.