Section 01
[Introduction] Exclusive Unlearning: A New Paradigm for LLM Safety Alignment via Retention-Based Forgetting
The study proposes Exclusive Unlearning (EU), a retention-based forgetting method. It inverts the conventional machine unlearning setup: instead of specifying what to forget, it specifies what to retain and forgets everything else. In this way it comprehensively removes diverse harmful content while preserving specialized capabilities in specific domains (e.g., medicine, mathematics), offering a new path to safety alignment for Large Language Models (LLMs).