Section 01
Introduction: The Entropy-Cut MH Algorithm, an Efficient Reasoning Method Based on Decision-Point Sampling
This article introduces the Entropy-Cut Metropolis-Hastings (Entropy-Cut MH) algorithm, a reasoning method based on decision-point sampling. Its core innovation is to use next-token entropy to identify the key decision points in a generation, which makes sampling from the power distribution more efficient. According to the paper, the algorithm outperforms baseline methods and RL-trained models on multiple reasoning benchmarks. This challenges the traditional notion that "reasoning must be acquired through RL training" and suggests that pre-trained models already contain strong reasoning capabilities, offering a new paradigm for optimizing reasoning efficiency.
Source: arXiv paper "Reasoning with Sampling: Cutting at Decision Points" (2026-05-28, link: http://arxiv.org/abs/2605.30327v1)
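To make the idea of entropy-based decision points concrete, here is a minimal sketch in Python. It is an illustration under stated assumptions, not the paper's implementation: the function names, the toy probabilities, and the fixed threshold of 1.0 nats are all hypothetical, and the paper's actual selection rule may differ. The sketch computes the Shannon entropy of each next-token distribution and flags the positions where that entropy is high, meaning the model is genuinely choosing between alternatives.

```python
import math
from typing import List, Sequence


def next_token_entropy(probs: Sequence[float]) -> float:
    """Shannon entropy (in nats) of a single next-token distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def decision_points(step_probs: Sequence[Sequence[float]], threshold: float) -> List[int]:
    """Return the positions whose next-token entropy exceeds `threshold`.

    `threshold` is a hypothetical tuning knob used only for this illustration.
    """
    return [
        i for i, probs in enumerate(step_probs)
        if next_token_entropy(probs) > threshold
    ]


# Toy next-token distributions over a 4-token vocabulary (made-up numbers).
steps = [
    [0.97, 0.01, 0.01, 0.01],  # near-deterministic step, low entropy
    [0.25, 0.25, 0.25, 0.25],  # uniform step, maximum entropy ln(4)
    [0.90, 0.05, 0.03, 0.02],  # confident step
    [0.40, 0.35, 0.15, 0.10],  # several plausible continuations
]

cuts = decision_points(steps, threshold=1.0)
print(cuts)  # [1, 3]
```

In this toy run only positions 1 and 3 are flagged. A sampler built on this idea could restrict where it cuts and resamples to such positions instead of considering every token, which is the efficiency argument summarized above.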