Zing Forum

ANTS: Adaptive Nucleus Truncation Sampling Method for Long-Text Reasoning

This article introduces ANTS (Adaptive Nucleus Truncation Sampling), a new method that transforms fixed decoding rules into an adaptive generation control mechanism. It dynamically adjusts the truncation width via an entropy condition controller, significantly improving performance in long-text reasoning tasks.

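Tags: sampling strategy, long-text reasoning, adaptive truncation, nucleus sampling, entropy control, decoding optimization, reasoning stability, ANTS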
Published 2026-06-12 08:02 | Recent activity 2026-06-15 11:53 | Estimated read 6 min

Section 01

ANTS: Adaptive Nucleus Truncation Sampling for Long-Form Reasoning (Main Thread)

Core Overview

ANTS (Adaptive Nucleus Truncation Sampling) replaces fixed decoding rules with an adaptive generation control mechanism. An entropy condition controller dynamically adjusts the truncation width, and the reported gains on long-text reasoning tasks grow as the generation length increases.

Section 02

Background: Sampling Challenges & Limitations of Fixed Threshold Methods

Key Role of Sampling in Long-Text Reasoning

Unlike short-text generation, long-text reasoning involves thousands of decoding steps. Minor changes in candidate token sets accumulate over time, leading to distinct reasoning trajectories and stability differences.
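To make the accumulation argument concrete, here is a toy calculation rather than anything taken from the ANTS work itself. It assumes, purely for illustration, that each decoding step independently has a small fixed chance of admitting a token that pushes the trajectory off course; the function name and the 1e-4 rate are hypothetical.

```python
def divergence_probability(per_step_rate: float, steps: int) -> float:
    """Chance that at least one of `steps` independent steps goes off course.

    Toy model: real decoding steps are not independent, so this only
    illustrates how a small per-step effect compounds with length.
    """
    return 1.0 - (1.0 - per_step_rate) ** steps


if __name__ == "__main__":
    # Same hypothetical per-step rate, three different generation lengths.
    for steps in (1_000, 8_000, 32_000):
        print(steps, round(divergence_probability(1e-4, steps), 3))
```

Under this simplification, a rate that is negligible over a thousand steps becomes the dominant factor at tens of thousands of steps, which is why small differences in the candidate set matter far more for long generations.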

Limitations of Existing Methods

Mainstream methods (top-p, min-p, fixed top-nσ) rely on fixed thresholds, which fail to adapt to:

  1. Entropy changes in model output distribution
  2. Task difficulty variations
  3. Training stage evolution
  4. Generation budget constraints

This rigidity limits performance improvement.
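For reference, the fixed-threshold rules named above can be sketched in a few lines. This is a minimal pure-Python illustration of the commonly described top-p and min-p rules, not code from the ANTS work; the function names and the default thresholds 0.9 and 0.1 are assumptions chosen for the example.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def top_p_keep(probs, p=0.9):
    """Indices of the smallest high-probability set whose mass reaches p."""
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, mass = [], 0.0
    for i in order:
        kept.append(i)
        mass += probs[i]
        if mass >= p:
            break
    return sorted(kept)


def min_p_keep(probs, min_p=0.1):
    """Indices whose probability is at least min_p times the top probability."""
    cutoff = min_p * max(probs)
    return [i for i, q in enumerate(probs) if q >= cutoff]


if __name__ == "__main__":
    confident = softmax([8.0, 1.0, 0.5, 0.0, -1.0])  # one dominant token
    uncertain = softmax([1.0, 0.9, 0.8, 0.7, 0.6])   # nearly flat
    print(top_p_keep(confident), top_p_keep(uncertain))
    print(min_p_keep(confident), min_p_keep(uncertain))
```

In both rules the constant (p or min_p) is chosen once and applied unchanged at every step, whatever the task, training stage, or remaining budget. That constant is the part ANTS makes adaptive.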

Section 03

ANTS Core Design: Adaptive Truncation Mechanisms

Standardized Neighborhood Selection

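  1. Identify the maximum logit among the model's raw (pre-softmax) output scores
  2. Build a standardized candidate token set around this logit, keeping the tokens whose logits fall within a standardized distance of the maximum
  3. Perform truncation before temperature scaling, so the candidate set reflects the model's original, unscaled distribution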
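The three steps above can be sketched as follows. This is a minimal illustration under stated assumptions, not the ANTS implementation: it assumes "standardized" means measuring distance from the maximum logit in units of the logits' standard deviation (in the spirit of top-nσ), and the names `neighborhood_keep`, `truncated_probs`, and `width` are hypothetical.

```python
import math
import statistics


def neighborhood_keep(logits, width):
    """Indices whose logit lies within `width` standard deviations of the max.

    Assumption: the neighborhood is measured on raw logits, top-n-sigma style.
    """
    sigma = statistics.pstdev(logits)
    cutoff = max(logits) - width * sigma
    return [i for i, x in enumerate(logits) if x >= cutoff]


def truncated_probs(logits, width, temperature=1.0):
    """Truncate on raw logits first, then apply temperature and softmax."""
    kept = neighborhood_keep(logits, width)          # steps 1 and 2
    scaled = [logits[i] / temperature for i in kept]  # step 3: scale afterwards
    m = max(scaled)
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return {i: e / total for i, e in zip(kept, exps)}


if __name__ == "__main__":
    logits = [5.0, 4.5, 1.0, 0.0, -2.0]
    print(truncated_probs(logits, width=1.0, temperature=0.5))
    print(truncated_probs(logits, width=1.0, temperature=2.0))
```

Because the candidate set is chosen before the temperature is applied, changing the temperature reshapes the probabilities of the kept tokens but never changes which tokens are kept.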

Entropy Condition Controller

  • Uses entropy as an uncertainty indicator (high entropy = wider truncation, low entropy = narrower truncation)
  • Dynamically adjusts truncation width via entropy-width mapping and smooth transitions
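A controller of this kind could look roughly like the sketch below. The article does not give the actual mapping, so everything specific here is an assumption made for illustration: the linear map from normalized entropy to width, the width range of 0.5 to 2.0, and the exponential moving average used for the smooth transition.

```python
import math


def entropy(probs):
    """Shannon entropy in nats; zero-probability entries contribute nothing."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def entropy_to_width(h, vocab_size, min_width=0.5, max_width=2.0):
    """Map normalized entropy in [0, 1] linearly onto [min_width, max_width]."""
    h_norm = h / math.log(vocab_size) if vocab_size > 1 else 0.0
    h_norm = min(1.0, max(0.0, h_norm))
    return min_width + (max_width - min_width) * h_norm


class WidthController:
    """Smooths the per-step width with an exponential moving average."""

    def __init__(self, smoothing=0.8, initial_width=1.0):
        self.smoothing = smoothing
        self.width = initial_width

    def update(self, probs):
        """Move the current width part of the way toward the entropy target."""
        target = entropy_to_width(entropy(probs), len(probs))
        self.width = self.smoothing * self.width + (1.0 - self.smoothing) * target
        return self.width


if __name__ == "__main__":
    controller = WidthController()
    print(controller.update([0.25, 0.25, 0.25, 0.25]))  # high entropy: widens
    print(controller.update([1.0, 0.0, 0.0, 0.0]))      # low entropy: narrows
```

The smoothing factor trades responsiveness for stability: a value near 1 changes the width slowly across steps, while a value near 0 follows the entropy of each step almost directly.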

No-Truncation Fallback Mechanism

A no-truncation path is reserved for unstable training or abnormal output distributions. In those cases no tokens are cut, which keeps training safe.
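One way such a fallback might be wired in is shown below. The article does not say which conditions trigger it, so the checks here (non-finite logits, an unusable width, completely flat logits) are assumptions, and `safe_keep` is a hypothetical name.

```python
import math
import statistics


def safe_keep(logits, width):
    """Return kept indices, or every index when truncation looks unsafe."""
    everything = list(range(len(logits)))
    if not logits:
        return everything  # nothing to truncate
    if not all(math.isfinite(x) for x in logits):
        # Assumption: NaN or infinite logits count as an abnormal distribution.
        return everything
    if not math.isfinite(width) or width <= 0.0:
        return everything  # the controller produced an unusable width
    sigma = statistics.pstdev(logits)
    if sigma == 0.0:
        return everything  # flat logits: no meaningful neighborhood
    cutoff = max(logits) - width * sigma
    return [i for i, x in enumerate(logits) if x >= cutoff]


if __name__ == "__main__":
    print(safe_keep([5.0, 4.5, 1.0, 0.0, -2.0], 1.0))   # normal truncation
    print(safe_keep([1.0, float("nan"), 0.0], 1.0))     # fallback: keep all
```

Falling back to the full candidate set is the conservative choice: in the worst case the sampler behaves like plain untruncated sampling instead of sampling from a corrupted or empty set.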

Section 04

Experimental Results: Performance Gains Across Tasks

Overall Performance

Tests on a 33B MoE model show increasing gains with longer generation lengths:

Generation Length    Performance Gain
8K tokens            +1.9 points
16K tokens           +3.8 points
32K tokens           +5.2 points

Task-Specific Results

  • Instruction Following (IFBench): +10 points at 32K length (improves structure consistency and long-range dependencies)
  • Math Reasoning (AIME 2025): +7 points (reduces error accumulation)
  • Code Generation (Codeforces): Outperforms baseline at 16K/32K lengths (benefits complex code generation)

Section 05

Technical Contributions & New Perspectives

Paradigm Shift in Sampler Design

Samplers should be treated as intrinsic components for stabilizing long-budget reasoning, not just fixed hyperparameters.

Value of Adaptive Mechanisms

  • State-aware: Adjusts based on internal model states (e.g., entropy)
  • Context-adaptive: Optimizes for current reasoning context
  • Robust: Enhances model adaptability to diverse scenarios

Optimization Directions

  1. Fine-grained token-level control
  2. Multi-objective optimization (quality, diversity, efficiency)
  3. Learning-based sampling strategy optimization

Section 06

Practical Application Scenarios

Long Document Generation

  • Maintains coherence and structural quality
  • Reduces topic drift and repetition

Complex Reasoning Tasks

  • Stabilizes reasoning chains
  • Improves intermediate step quality and final answer accuracy

Dialogue Systems

  • Preserves context coherence in long conversations
  • Generates more natural responses

Section 07

Summary & Future Outlook

Summary

ANTS introduces an adaptive nucleus truncation mechanism, shifting sampling from fixed hyperparameters to adaptive control. In the reported tests on a 33B MoE model, the gain grows with generation length, from +1.9 points at 8K tokens to +5.2 points at 32K tokens.

Future Directions

  1. Integrate more state indicators (e.g., attention patterns, inter-layer consistency)
  2. Design task-specific adaptive strategies
  3. Incorporate sampling strategy learning into model training
  4. Extend to multi-modal generation scenarios