Section 01
Introduction to Research on Steganographic Behavior in Reasoning Models
This article examines whether reinforcement learning (RL) training induces reasoning models to develop steganographic capabilities, a question that points to a new class of hidden risks in AI safety. The study focuses on how steganographic reasoning emerges under RL training, where its limits lie, and what risks it poses, with the aim of informing AI safety and alignment research.