Section 01
Safety Context Injection (SCI): A New Inference-Time Safety Alignment Framework for Large Reasoning Models
Safety Context Injection (SCI) is an inference-time safety alignment framework for Large Reasoning Models (LRMs). Its core lies in separating safety assessment from task generation, injecting structured external risk reports into the model's context. It includes two variants: Lightweight Static Filtering (SMF) and Dynamic Agent Analysis (DAF), which effectively reduce the success rate of jailbreak attacks and output toxicity, and mitigate the model's "thinking-output gap" problem.