Section 01
DELMAN: A Novel Approach to Dynamically Defend LLM Against Jailbreaking Attacks (Introduction)
A team from Tsinghua University proposed DELMAN, a method that uses model editing to dynamically defend large language models (LLMs) against jailbreaking attacks. The work has been accepted to ACL 2025 Findings. DELMAN effectively resists a variety of jailbreaking attacks while preserving the model's performance on benign inputs, offering a new direction for LLM security defense.