Section 01
[Introduction] URM-Energy-Stopping: A New Direction for Reasoning Models Using Energy Convergence to Replace ACT
This project explores replacing the Adaptive Computation Time (ACT) mechanism in the Universal Reasoning Model (URM) with an energy-based stopping criterion. The core idea is to use an energy function E(input, output) to score the quality of each prediction and to stop iterating once the energy converges, that is, once it no longer changes appreciably between steps. Compared with ACT's learned stopping probabilities, this approach offers three advantages: a principled stopping mechanism, built-in MCMC iterative optimization, and an energy score that doubles as a confidence metric.
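The stopping rule can be illustrated with a minimal Python sketch. It is not the project's implementation: the names `refine_until_converged`, `toy_energy`, and `toy_step`, along with the `tol` and `max_steps` parameters, are hypothetical, and a deterministic gradient step on a quadratic energy stands in for both the URM refinement step and any MCMC sampler.

```python
from typing import Callable, List, Tuple

Vector = List[float]


def refine_until_converged(
    x: Vector,
    y0: Vector,
    energy: Callable[[Vector, Vector], float],
    step: Callable[[Vector, Vector], Vector],
    tol: float = 1e-4,
    max_steps: int = 50,
) -> Tuple[Vector, float, int]:
    """Refine y until the energy change between steps drops below tol.

    Returns the final output, its energy, and the number of steps taken.
    max_steps is a safety cap (an assumption of this sketch).
    """
    y = y0
    prev = energy(x, y)
    for t in range(1, max_steps + 1):
        y = step(x, y)
        cur = energy(x, y)
        # Energy convergence replaces a learned halting probability.
        if abs(prev - cur) < tol:
            return y, cur, t
        prev = cur
    return y, prev, max_steps


def toy_energy(x: Vector, y: Vector) -> float:
    # Hypothetical energy: squared distance between output and input.
    return sum((b - a) ** 2 for a, b in zip(x, y))


def toy_step(x: Vector, y: Vector, lr: float = 0.25) -> Vector:
    # One gradient-descent step on toy_energy; the gradient w.r.t. y is 2 * (y - x).
    return [b - lr * 2.0 * (b - a) for a, b in zip(x, y)]


if __name__ == "__main__":
    y, e, steps = refine_until_converged([1.0, -2.0], [0.0, 0.0], toy_energy, toy_step)
    print(f"stopped after {steps} steps with energy {e:.2e}")
```

The final energy is returned alongside the output so that it can be read as a rough confidence score, with lower energy suggesting a better-fitting prediction. In a real model the threshold and step cap would need tuning.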