Zing Forum

Dropout-GRPO: Introducing Variational Randomness for Continuous Latent Reasoning

Dropout-GRPO uses structured Dropout to introduce the randomness that GRPO needs into latent reasoning models. This makes GRPO applicable to continuous latent-state models such as Coconut and improves pass@1 on GSM8K from 27.29% to 29.01%.

GRPO | Latent Reasoning | Reinforcement Learning | Dropout | Coconut | Variational Inference | Reasoning Models
Published 2026-06-09 05:21 | Recent activity 2026-06-10 09:21 | Estimated read 1 min

Section 01

Introduction / Main Post: Dropout-GRPO: Introducing Variational Randomness for Continuous Latent Reasoning

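GRPO scores each rollout relative to a group of rollouts sampled for the same prompt, so it depends on stochastic sampling. Continuous latent reasoning models such as Coconut feed hidden states forward deterministically, which leaves nothing to sample in the latent steps. Dropout-GRPO injects the necessary randomness through structured Dropout, making GRPO applicable to these models and improving pass@1 on GSM8K from 27.29% to 29.01%.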
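To make the need for randomness concrete, here is a minimal sketch in plain Python. It is not the post's implementation: `group_relative_advantages` and `dropout` are hypothetical helper names, the advantage uses the population standard deviation as an assumption, and the Dropout shown is ordinary element-wise inverted Dropout rather than the structured variant the post refers to, whose details are not given here.

```python
import random
import statistics


def group_relative_advantages(rewards, eps=1e-8):
    """Normalize rewards within one group of rollouts (GRPO-style).

    Assumption: population standard deviation; implementations differ.
    """
    mean = statistics.fmean(rewards)
    std = statistics.pstdev(rewards)
    return [(r - mean) / (std + eps) for r in rewards]


def dropout(latent, p, rng):
    """Element-wise inverted Dropout on a latent vector.

    Each unit is zeroed with probability p; survivors are scaled by 1/(1-p).
    This is a stand-in, not the structured Dropout of the post.
    """
    keep = 1.0 - p
    return [x / keep if rng.random() < keep else 0.0 for x in latent]


# Deterministic latent steps: every rollout in the group is identical,
# so rewards are identical and every advantage is zero (no learning signal).
identical = group_relative_advantages([1.0, 1.0, 1.0, 1.0])

# Once rollouts differ, correct ones get positive advantage, wrong ones negative.
mixed = group_relative_advantages([1.0, 0.0, 0.0, 1.0])

# Dropout masks give each rollout its own perturbed copy of the latent state.
rng = random.Random(0)
latent = [0.5, -1.2, 0.3, 2.0]
samples = [dropout(latent, p=0.25, rng=rng) for _ in range(4)]
```

When all rewards in a group are equal, the normalized advantages vanish, which is why a fully deterministic latent rollout gives GRPO nothing to optimize. Perturbing the latent state per rollout is one way to restore diversity within the group.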