Section 01
[Introduction] Core Findings from Evaluating the Small-Molecule Drug Design Capabilities of Large Language Models
This paper constructs ChemRL, a benchmark of drug design tasks grounded in chemical principles, and formalizes it as a reinforcement learning (RL) environment for evaluating the small-molecule drug design capabilities of frontier large language models (LLMs). The study reports two main findings. First, frontier models are becoming increasingly proficient at chemical tasks, but there is still room for improvement in low-data scenarios. Second, and crucially, RL-based post-training can significantly improve performance, enabling smaller models to reach the level of frontier models.