Section 01
Introduction: TEMPO, an EM Algorithm Innovation to Solve Test-Time Training Bottlenecks
TEMPO formalizes Test-Time Training (TTT) as an Expectation-Maximization (EM) algorithm. By alternating between policy optimization and critic recalibration, it targets a known bottleneck: existing TTT methods plateau quickly after their initial performance gains. The method reports substantial gains on mathematical reasoning benchmarks such as AIME 2024 and offers a way to keep extending model capability during the inference phase.
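The alternation between critic recalibration and policy optimization can be illustrated with a toy sketch. This is not TEMPO's implementation. The tabular softmax policy over three candidate answers, the `reward_signal` values, the function names, and the learning rates are all hypothetical, chosen only to show the shape of the loop: the critic is refreshed on samples from the current policy, then the policy takes a gradient step on the critic's expected score.

```python
import math
import random


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def recalibrate_critic(critic, probs, reward_signal, rng, n_samples=64, lr=0.5):
    """Move critic estimates toward rewards seen on samples from the current policy.

    `reward_signal` is a hypothetical stand-in for whatever feedback is available.
    """
    new_critic = list(critic)
    actions = range(len(new_critic))
    for _ in range(n_samples):
        a = rng.choices(actions, weights=probs)[0]
        new_critic[a] += lr * (reward_signal[a] - new_critic[a])
    return new_critic


def update_policy(logits, critic, lr=1.0):
    """One gradient-ascent step on the critic's expected score under the policy.

    For a softmax policy, d/d(logit_a) of sum_b p_b * v_b equals p_a * (v_a - baseline).
    """
    probs = softmax(logits)
    baseline = sum(p * v for p, v in zip(probs, critic))
    return [l + lr * p * (v - baseline) for l, p, v in zip(logits, probs, critic)]


def alternate(reward_signal, rounds=50, seed=0):
    """Alternate critic recalibration and policy updates for a fixed number of rounds."""
    rng = random.Random(seed)
    logits = [0.0] * len(reward_signal)  # start from a uniform policy
    critic = [0.0] * len(reward_signal)  # start from an uninformed critic
    for _ in range(rounds):
        critic = recalibrate_critic(critic, softmax(logits), reward_signal, rng)
        logits = update_policy(logits, critic)
    return softmax(logits), critic


if __name__ == "__main__":
    final_probs, final_critic = alternate([0.1, 0.9, 0.4])
    print("policy:", [round(p, 3) for p in final_probs])
    print("critic:", [round(v, 3) for v in final_critic])
```

In this toy setting the critic is re-estimated under the policy's current sampling distribution before every policy step. The sketch does not reproduce the plateau behavior that the text attributes to existing TTT methods, and it makes no claim about which step TEMPO treats as the E-step or the M-step.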