Section 01
[Introduction] Predict-Then-Diffuse Framework: Optimizing Inference Computational Budget for Diffusion Language Models
A research team at the University of Bergamo in Italy has proposed the Predict-Then-Diffuse framework, which addresses a core limitation of diffusion language models (Diffusion LLMs): the response length must be fixed before generation begins. A fixed-length strategy wastes computation when the chosen length is too long and truncates the output when it is too short. Following a "predict first, diffuse later" approach, the framework predicts the response length up front and sizes the inference budget accordingly, significantly reducing computational cost while maintaining output quality.
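To make the trade-off concrete, here is a minimal sketch of the bookkeeping behind "predict first, diffuse later". It is not the authors' implementation. The function names, the 10% safety margin, the 32-token block size, and the 1024-token fixed canvas are all illustrative assumptions, and the length predictor itself is left out because this summary does not describe it.

```python
def canvas_length(predicted_len: int, margin_pct: int = 10, block: int = 32) -> int:
    """Turn a predicted response length into a generation canvas size.

    Hypothetical helper: adds a safety margin (assumed 10%) and rounds up
    to a multiple of `block` (assumed 32 tokens), using integer arithmetic.
    """
    padded = (predicted_len * (100 + margin_pct) + 99) // 100  # ceil of the padded length
    return max(block, -(-padded // block) * block)             # round up to a block multiple


def budget_report(true_len: int, canvas: int) -> dict:
    """Count positions denoised needlessly and tokens lost to truncation."""
    return {
        "canvas": canvas,
        "wasted": max(0, canvas - true_len),
        "truncated": max(0, true_len - canvas),
    }


FIXED_CANVAS = 1024  # assumed fixed-length baseline

# A short answer (120 tokens) under a fixed canvas vs. a predicted one.
fixed = budget_report(true_len=120, canvas=FIXED_CANVAS)
predicted = budget_report(true_len=120, canvas=canvas_length(predicted_len=110))

# A long answer (1500 tokens) does not fit the fixed canvas at all.
too_long = budget_report(true_len=1500, canvas=FIXED_CANVAS)

print(fixed)
print(predicted)
print(too_long)
```

In this toy setting the fixed canvas spends most of its positions on padding for the short answer and cuts off the long one, while the predicted canvas stays close to the true length. How well this holds in practice depends on the accuracy of the length predictor, which the sketch does not model.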