Section 01
[Introduction] Hidden Risks in Reasoning Chains and Adaptive Intervention Solutions
This article shows that the reasoning chains of large reasoning models carry hidden safety risks: the intermediate reasoning may contain harmful content even when the final answer is safe. It proposes an adaptive multi-principle guidance method that, on DeepSeek-R1-Qwen-7B, achieves a 40.8% reduction in unsafe content while maintaining 97.7% accuracy. The study emphasizes that safety assessment must be end-to-end, covering both the reasoning process and the final output.
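To make the idea of assessing both parts of a response concrete, here is a minimal sketch in Python. It is not the article's evaluation pipeline. It assumes the reasoning is wrapped in `<think>...</think>` tags, and the names `split_response`, `assess`, `SafetyReport`, and the keyword-based `toy_is_unsafe` classifier are all hypothetical, introduced only for illustration.

```python
from typing import Callable, NamedTuple, Tuple


class SafetyReport(NamedTuple):
    reasoning_unsafe: bool
    answer_unsafe: bool

    @property
    def hidden_risk(self) -> bool:
        # The case highlighted above: harmful reasoning behind a safe answer.
        return self.reasoning_unsafe and not self.answer_unsafe


def split_response(text: str, close_tag: str = "</think>") -> Tuple[str, str]:
    """Split a response into (reasoning, final_answer).

    Assumption: the reasoning ends at the first `close_tag`. If the tag is
    absent, the whole text is treated as the final answer.
    """
    reasoning, sep, answer = text.partition(close_tag)
    if not sep:
        return "", text.strip()
    return reasoning.replace("<think>", "").strip(), answer.strip()


def assess(text: str, is_unsafe: Callable[[str], bool]) -> SafetyReport:
    """Score the reasoning and the final answer separately."""
    reasoning, answer = split_response(text)
    return SafetyReport(is_unsafe(reasoning), is_unsafe(answer))


def toy_is_unsafe(segment: str) -> bool:
    # Hypothetical stand-in for a real safety classifier: a keyword check.
    return "bypass the lock" in segment.lower()


if __name__ == "__main__":
    response = (
        "<think>One could bypass the lock by ...</think>"
        "I can't help with that request."
    )
    report = assess(response, toy_is_unsafe)
    print(report, "hidden risk:", report.hidden_risk)
```

Scoring the two segments separately is what exposes the hidden-risk case: a check that only looks at the final answer would label this response safe.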