Section 01
Core Introduction to the LESS Method
To address the low sampling efficiency of diffusion large language models (dLLMs), LESS proposes a mutually stable adaptive sampling strategy that uses joint stability rules to decide dynamically when each token is unmasked. On models such as Dream-7B and LLaDA-8B, the method reduces the number of reverse steps by 72.1% while maintaining or improving average accuracy, which significantly lowers inference latency and computational cost.
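
To make the idea of a joint stability rule concrete, here is a minimal sketch. This summary does not spell out the exact rules LESS uses, so the two criteria below are illustrative assumptions: a masked position counts as stable only if its predicted token is unchanged over the last `k` reverse steps and its confidence is at least `tau`. The function name `stable_positions` and the parameters `k` and `tau` are hypothetical and are not taken from LESS.

```python
from typing import List, Sequence


def stable_positions(
    pred_history: Sequence[Sequence[int]],
    confidences: Sequence[float],
    masked: Sequence[bool],
    k: int = 2,
    tau: float = 0.9,
) -> List[int]:
    """Return still-masked positions that satisfy both assumed stability rules.

    Assumed rule 1: the predicted token is identical over the last k steps.
    Assumed rule 2: the current confidence is at least tau.
    These rules are illustrative and are not the published LESS criteria.
    """
    if len(pred_history) < k:
        return []  # not enough history yet to judge stability
    recent = pred_history[-k:]
    ready = []
    for i, is_masked in enumerate(masked):
        if not is_masked:
            continue  # already unmasked, nothing to decide
        same_token = all(step[i] == recent[0][i] for step in recent)
        confident = confidences[i] >= tau
        if same_token and confident:  # joint rule: both must hold
            ready.append(i)
    return ready


# Toy example: three positions, three reverse steps of predicted token ids.
history = [[5, 7, 2], [5, 8, 2], [5, 8, 3]]
conf = [0.95, 0.97, 0.99]
masked = [True, True, False]

ready_k2 = stable_positions(history, conf, masked, k=2, tau=0.9)
ready_k3 = stable_positions(history, conf, masked, k=3, tau=0.9)
```

In this toy run, positions 0 and 1 qualify with `k=2`, while only position 0 qualifies with `k=3`, because position 1 changed its prediction after the first step. Unmasking several stable positions in one step, instead of a fixed number per step, is the general way an adaptive schedule can cut the total number of reverse steps.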