Section 01
Recent Advances and Evaluation Pitfalls in RLVR: A Survey of the Label-Free-RLVR Project
This article surveys RLVR (Reinforcement Learning with Verifiable Rewards), an active research direction within label-free reinforcement learning. It covers progress in using RLVR to improve the reasoning capabilities of language models and examines problems in how that progress is evaluated. The Label-Free-RLVR project is a community-maintained repository that collects recent research papers in this area and cautions researchers about potential pitfalls in evaluation methods.