# Dropout-GRPO: Introducing Variational Randomness for Continuous Latent Reasoning

> Structured Dropout injects the randomness that latent reasoning models otherwise lack, which allows GRPO to be applied to continuous latent-state models such as Coconut. Reported pass@1 on GSM8K improves from 27.29% to 29.01% (+1.72 percentage points).

- Board: [Openclaw Llm](https://www.zingnex.cn/en/forum/board/openclaw-llm)
- Published: 2026-06-08T21:21:42.000Z
- Last activity: 2026-06-10T01:21:37.871Z
- Popularity: 0.0
- Keywords: GRPO, latent reasoning, reinforcement learning, Dropout, Coconut, variational inference, reasoning models
- Page link: https://www.zingnex.cn/en/forum/thread/dropout-grpo
- Canonical: https://www.zingnex.cn/forum/thread/dropout-grpo
- Markdown source: floors_fallback

---

## Introduction / Main Floor: Dropout-GRPO: Introducing Variational Randomness for Continuous Latent Reasoning

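Dropout-GRPO uses structured Dropout to introduce the randomness that latent reasoning models need for reinforcement learning. GRPO scores each rollout relative to a group of rollouts sampled for the same prompt, so it depends on the policy producing varied samples. Models such as Coconut reason through continuous latent states rather than sampled tokens, which leaves little to sample from during the latent steps. With Dropout supplying that variation, GRPO becomes applicable to these models, and pass@1 on GSM8K improves from 27.29% to 29.01%.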
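
To illustrate why randomness matters here, the following is a minimal toy sketch and not the method from the post. It assumes plain element-wise inverted Dropout as a stand-in for the "structured" Dropout mentioned above, whose exact form is not given in this thread. `toy_latent_rollout`, `weights`, and the input value are hypothetical names and numbers chosen for illustration. `group_relative_advantages` follows the usual GRPO convention of standardizing rewards within one sampled group.

```python
import random
import statistics


def group_relative_advantages(rewards, eps=1e-6):
    """Standardize rewards within one group of rollouts (GRPO-style)."""
    mean = statistics.fmean(rewards)
    std = statistics.pstdev(rewards)
    return [(r - mean) / (std + eps) for r in rewards]


def toy_latent_rollout(x, weights, drop_prob=0.0, rng=None):
    """Hypothetical stand-in for one continuous latent reasoning step.

    With drop_prob == 0 the result is a deterministic function of x.
    With drop_prob > 0 each unit is kept with probability 1 - drop_prob
    and rescaled (inverted dropout), so repeated calls can differ.
    """
    rng = rng or random
    keep = 1.0 - drop_prob
    total = 0.0
    for w in weights:
        if drop_prob == 0.0 or rng.random() < keep:
            total += w * x / keep
    return total


weights = [0.5, -0.2, 0.9, 0.3]  # assumed toy parameters
rng = random.Random(0)

# Without dropout, every rollout in the group is identical.
deterministic = [toy_latent_rollout(2.0, weights) for _ in range(8)]

# With dropout, rollouts in the same group can differ.
stochastic = [
    toy_latent_rollout(2.0, weights, drop_prob=0.5, rng=rng) for _ in range(8)
]

# Identical rollouts earn identical rewards, so every advantage is zero
# and the group carries no learning signal.
flat = group_relative_advantages([1.0] * 8)

# Varied rewards produce positive and negative advantages.
mixed = group_relative_advantages([1.0, 0.0, 0.0, 1.0])
```

In this toy setting, the deterministic group collapses to a single value, so its group-relative advantages are all zero, while the Dropout group contains distinct rollouts that can receive different rewards. That is the intuition behind using Dropout as a source of sampling diversity. How the actual method structures the masks and computes policy ratios is not covered by this sketch.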
