Zing Forum

AutoTTS: An Intelligent Framework for Automated Discovery of Test-Time Scaling Strategies

AutoTTS is an environment-driven framework that uses evolutionary algorithms to automatically discover test-time scaling strategies for large language models. It synthesizes controllers using Beta parameterization and low-cost feedback loops, significantly improving model inference efficiency.

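Tags: test-time scaling, large language models, automated strategy discovery, evolutionary algorithms, inference optimization, machine learning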
Published 2026-05-13 12:37 | Recent activity 2026-05-13 12:52 | Estimated read 6 min

Section 01

AutoTTS Framework Overview: Automated Discovery of Test-Time Scaling Strategies for Large Language Models

AutoTTS is an environment-driven framework that addresses a limitation of traditional Test-Time Scaling (TTS) strategies: their reliance on manual design. It uses evolutionary algorithms to automatically discover TTS strategies for large language models, and it synthesizes controllers through Beta parameterization and low-cost feedback loops. The discovered strategies significantly improve model inference efficiency and generalize across tasks.

Section 02

Research Background and Challenges

Test-Time Scaling (TTS) is a key direction for enhancing the inference capabilities of large language models. Traditional methods rely on manually designed heuristic rules to decide when to branch, continue, or terminate an inference path. These rules have two limitations: different tasks require different strategies, and fixed rules struggle to keep up as models evolve. AutoTTS instead adopts an environment-driven discovery mechanism that learns optimized strategies automatically by iteratively collecting inference trajectories and low-cost feedback.

Section 03

Core Technical Innovations

Controller Synthesis Mechanism

The core of the framework is an intelligent controller that supports five operations: branching, continuation, probing, pruning, and stopping. It uses a hybrid architecture of a policy network and a rule engine, balancing flexibility and interpretability.
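
The five operations can be pictured as a small action space with a rule engine choosing among them. The sketch below is illustrative only: the `PathState` fields, the threshold values, and the order of the rules are assumptions made for this example, not details taken from AutoTTS, and the policy-network half of the hybrid design is left out.

```python
from dataclasses import dataclass
from enum import Enum


class Action(Enum):
    """The five controller operations named in the article."""
    BRANCH = "branch"
    CONTINUE = "continue"
    PROBE = "probe"
    PRUNE = "prune"
    STOP = "stop"


@dataclass
class PathState:
    """Hypothetical summary of one inference path (fields are assumptions)."""
    confidence: float      # estimated answer confidence in [0, 1]
    tokens_used: int       # tokens spent on this path so far
    probed: bool = False   # whether a cheap probe has already been run


@dataclass
class RuleController:
    """Rule-engine sketch; every threshold here is an assumed value."""
    stop_conf: float = 0.9
    prune_conf: float = 0.2
    branch_conf: float = 0.5
    token_budget: int = 2048

    def decide(self, state: PathState) -> Action:
        if state.tokens_used >= self.token_budget:
            return Action.STOP        # budget exhausted
        if not state.probed:
            return Action.PROBE       # get a confidence estimate first
        if state.confidence >= self.stop_conf:
            return Action.STOP        # confident enough to answer
        if state.confidence <= self.prune_conf:
            return Action.PRUNE       # path looks unpromising
        if state.confidence < self.branch_conf:
            return Action.BRANCH      # uncertain: explore alternatives
        return Action.CONTINUE        # promising: keep generating
```

In a hybrid design, a learned policy could supply or adjust these thresholds while the rules themselves stay readable, which is one way to get both flexibility and interpretability.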

Beta Parameterization Method

Beta parameterization turns the exploration-exploitation trade-off into learnable parameters: the search explores broadly in the early stage, then concentrates on fine-tuning around high-performing strategies in the later stage, which makes strategy search efficient.
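
The article does not say exactly how the Beta parameters are defined or updated, so the following is a minimal sketch under one plausible reading: a single strategy parameter in [0, 1] (for example a stopping threshold) is drawn from a Beta(alpha, beta) distribution. Starting at Beta(1, 1) gives uniform, broad exploration; adding pseudo-counts from high-scoring samples narrows the distribution around good values. The class name `BetaParam` and the pseudo-count update rule are assumptions for illustration.

```python
import random
from dataclasses import dataclass


@dataclass
class BetaParam:
    """One strategy parameter in [0, 1] with a Beta(alpha, beta) search distribution."""
    alpha: float = 1.0   # Beta(1, 1) is uniform: maximal early exploration
    beta: float = 1.0

    def sample(self, rng: random.Random) -> float:
        # random.betavariate is part of the Python standard library
        return rng.betavariate(self.alpha, self.beta)

    def update(self, elite_values: list[float]) -> None:
        # Assumed update rule: treat each elite value x as x "successes"
        # and (1 - x) "failures", which pulls the mean toward the elites
        # and shrinks the variance as evidence accumulates.
        for x in elite_values:
            self.alpha += x
            self.beta += 1.0 - x

    @property
    def mean(self) -> float:
        return self.alpha / (self.alpha + self.beta)

    @property
    def variance(self) -> float:
        total = self.alpha + self.beta
        return self.alpha * self.beta / (total * total * (total + 1.0))
```

Because the variance shrinks as alpha + beta grows, the same two numbers encode both where to search (the mean) and how widely (the spread), which matches the early-exploration, late-refinement behavior described above.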

Low-Cost Feedback Mechanism

It uses a trajectory-based scoring mechanism to evaluate strategy quality without additional model calls, reducing evaluation costs by several orders of magnitude and supporting large-scale strategy search.
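
One way to evaluate a strategy without new model calls is to replay it over trajectories that were recorded once. The sketch below assumes each recorded step stores its token count, a confidence value, and whether the answer at that point was correct; the `Step` fields, the stop-rule form of a strategy, and the accuracy-minus-cost score are all assumptions for illustration, not the AutoTTS scoring formula.

```python
from dataclasses import dataclass
from typing import Callable, Sequence


@dataclass(frozen=True)
class Step:
    """One recorded step of an inference trajectory (fields are assumptions)."""
    tokens: int         # tokens generated in this step
    confidence: float   # confidence recorded after this step
    correct: bool       # whether the answer at this step matches the reference


Trajectory = Sequence[Step]
StopRule = Callable[[Step], bool]


def replay(trajectory: Trajectory, should_stop: StopRule) -> tuple[bool, int]:
    """Apply a stop rule to a recorded trajectory; no model is called."""
    tokens = 0
    last = None
    for step in trajectory:
        tokens += step.tokens
        last = step
        if should_stop(step):
            break
    return (last.correct if last is not None else False, tokens)


def score(
    trajectories: Sequence[Trajectory],
    should_stop: StopRule,
    cost_weight: float = 0.1,   # assumed accuracy/cost trade-off
    token_budget: int = 1000,   # assumed normalization constant
) -> float:
    """Mean of (correctness - weighted token cost) over recorded trajectories."""
    if not trajectories:
        return 0.0
    total = 0.0
    for trajectory in trajectories:
        correct, tokens = replay(trajectory, should_stop)
        total += float(correct) - cost_weight * tokens / token_budget
    return total / len(trajectories)
```

For example, given a three-step trajectory whose answer becomes correct at the second step, a rule such as `lambda s: s.confidence >= 0.7` that stops there scores higher than one that always runs to the end, because it reaches the same answer with fewer tokens.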

Section 04

System Architecture and Workflow

AutoTTS consists of four core components. The controller is described in Section 03; the other three are:

Discovery Engine

It uses evolutionary algorithms to maintain a population of strategies, generates new strategies through mutation, crossover, and selection, evaluates fitness using low-cost feedback, and retains the best-performing strategies for the next generation.
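
The loop described above can be sketched as a generic evolutionary search. Here a strategy is assumed to be a list of floats in [0, 1], and the fitness function is passed in by the caller (in practice it would be a low-cost feedback score). The Gaussian mutation, uniform crossover, truncation selection, and all default values are assumptions chosen for illustration, not the operators AutoTTS is confirmed to use.

```python
import random
from typing import Callable, List

Strategy = List[float]  # assumed encoding: parameters in [0, 1]


def mutate(s: Strategy, rng: random.Random, sigma: float = 0.1) -> Strategy:
    """Add Gaussian noise to each gene and clip back into [0, 1]."""
    return [min(1.0, max(0.0, g + rng.gauss(0.0, sigma))) for g in s]


def crossover(a: Strategy, b: Strategy, rng: random.Random) -> Strategy:
    """Uniform crossover: take each gene from either parent with equal chance."""
    return [x if rng.random() < 0.5 else y for x, y in zip(a, b)]


def evolve(
    fitness: Callable[[Strategy], float],
    dim: int = 3,
    pop_size: int = 20,
    elite: int = 5,          # must be at least 2 so two parents can be drawn
    generations: int = 30,
    seed: int = 0,
) -> Strategy:
    rng = random.Random(seed)
    population = [[rng.random() for _ in range(dim)] for _ in range(pop_size)]
    for _ in range(generations):
        # Selection: keep the top `elite` strategies unchanged
        population.sort(key=fitness, reverse=True)
        parents = population[:elite]
        # Variation: refill the population with mutated offspring
        children = []
        while len(children) < pop_size - elite:
            a, b = rng.sample(parents, 2)
            children.append(mutate(crossover(a, b, rng), rng))
        population = parents + children
    return max(population, key=fitness)
```

Because the elite strategies are carried over unchanged, the best fitness in the population never decreases from one generation to the next.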

Environment Module

It simulates inference scenarios to collect trajectory data, provides standardized interfaces to adapt to different tasks and models, and supports parallel evaluation of multiple strategy candidates.

Executor Component

It implements strategy serialization/deserialization, supports persistent storage and cross-scenario reuse, and provides efficient inference interfaces to meet production deployment requirements.
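
Strategy serialization can be as simple as a JSON round trip. The sketch below assumes a strategy is fully described by a few named numbers; the `StrategySpec` fields and the helper function names are hypothetical and do not reflect the actual interface of executor.py.

```python
import json
from dataclasses import asdict, dataclass
from pathlib import Path


@dataclass
class StrategySpec:
    """Hypothetical strategy description (fields are assumptions)."""
    name: str
    stop_conf: float
    prune_conf: float
    max_branches: int


def dumps_strategy(spec: StrategySpec) -> str:
    """Serialize a strategy to a JSON string."""
    return json.dumps(asdict(spec), indent=2, sort_keys=True)


def loads_strategy(text: str) -> StrategySpec:
    """Rebuild a strategy from its JSON string."""
    return StrategySpec(**json.loads(text))


def save_strategy(spec: StrategySpec, path: str) -> None:
    """Persist a strategy so it can be reused in another scenario."""
    Path(path).write_text(dumps_strategy(spec), encoding="utf-8")


def load_strategy(path: str) -> StrategySpec:
    """Load a previously saved strategy from disk."""
    return loads_strategy(Path(path).read_text(encoding="utf-8"))
```

A plain-text format like JSON keeps stored strategies human-readable and easy to diff, which fits the goal of reusing them across scenarios.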

Section 05

Application Value and Experimental Results

In validation across multiple inference tasks, the strategies automatically discovered by AutoTTS outperform manually designed baselines in both accuracy and efficiency. The strategies generalize well and can be transferred to related tasks, which reduces marginal development costs. Developers only need to provide domain samples; the framework then discovers suitable TTS strategies automatically, quickly turning a general model into a domain-optimized one.

Section 06

Code Implementation and Future Directions

Code Implementation

The project provides a complete Python implementation. Core modules include controller.py (controller logic), environment.py (feedback evaluation), discovery.py (evolutionary search), and executor.py (strategy application). The documentation is comprehensive, and usage examples are concise.

Future Directions

The team is exploring extension directions such as multi-modal inference strategy discovery, reinforcement learning-based online optimization, and strategy combination and reuse mechanisms to advance the development of automated TTS technology.