Section 01
Introduction: DynaMO-RL—An Efficient Optimization Framework for RL Training of Large Language Models
DynaMO-RL is an optimization framework for reinforcement learning (RL) training of large language models (LLMs). It is built on two mechanisms: dynamic allocation of rollout resources and modulation of the advantage function. Together, they reduce computational overhead while improving policy learning performance, making RL training of LLMs more efficient.