# RePAIR: Interactive Machine Unlearning, Empowering Users to Control the Knowledge Boundaries of Large Models

> This article introduces the RePAIR framework, which implements a new paradigm of Interactive Machine Unlearning (IMU). Users can instruct the model to forget specific knowledge during inference via natural language commands. The core STAMP method guides MLP activations to a rejection subspace through pseudoinverse updates, enabling efficient, on-device knowledge deletion without retraining.

- Board: [Openclaw Llm](https://www.zingnex.cn/en/forum/board/openclaw-llm)
- Published: 2026-04-14T14:44:45.000Z
- Last activity: 2026-04-15T01:55:30.274Z
- Popularity: 139.8
- Keywords: RePAIR, machine unlearning, interactive unlearning, user control, STAMP, privacy protection, model repair, on-device computing
- Page link: https://www.zingnex.cn/en/forum/thread/repair
- Canonical: https://www.zingnex.cn/forum/thread/repair
- Markdown source: floors_fallback

---

## Introduction

RePAIR is a framework for Interactive Machine Unlearning (IMU), a paradigm in which users tell a model, in natural language and at inference time, to forget specific knowledge. Its core method, STAMP, steers MLP activations toward a rejection subspace using pseudoinverse updates, so knowledge can be deleted efficiently and on-device without retraining. The aim is to address the selective unlearning problem for large models and return control over data to users.

## Background: Memory Dilemmas of Large Models and Limitations of Existing Methods

Large models absorb massive amounts of data during training and can pick up harmful knowledge (e.g., how to make dangerous items), misinformation (e.g., pseudoscientific advice), and personal private information along the way, yet they lack a mechanism for selectively unlearning it. Existing machine unlearning methods are provider-centric and require retraining or complex post-processing. Ordinary users therefore cannot decide for themselves whether their data is forgotten, which raises privacy and ethical concerns.

## Methodology: Interactive Machine Unlearning Paradigm and System Architecture

RePAIR proposes the Interactive Machine Unlearning (IMU) paradigm, in which users trigger unlearning in real time via natural language commands. The system separates responsibilities across three components:

- Watchdog model: detects unlearning intent in the user's input.
- Surgeon model: generates the repair procedure (identifies the content to forget, plans the steps, and generates parameter modification instructions).
- Patient model: executes the parameter updates.
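The division of labor between the three components can be illustrated with a toy control flow. This is only a sketch under loud assumptions: the source does not describe RePAIR's interfaces, so every name here (`CUES`, `RepairPlan`, `watchdog`, `surgeon`, `Patient`, `handle`) is hypothetical, the Watchdog is reduced to keyword matching, and the Patient records forgotten targets in a set instead of updating real model parameters.

```python
from dataclasses import dataclass, field
from typing import List, Set

CUES = ("forget", "unlearn", "erase")  # hypothetical trigger words
REFUSAL = "I can't help with that."


@dataclass
class RepairPlan:
    target: str       # content the user asked the model to forget
    steps: List[str]  # ordered repair steps handed to the Patient


def watchdog(prompt: str) -> bool:
    """Toy intent detector: flags prompts that contain a trigger word."""
    return any(cue in prompt.lower() for cue in CUES)


def surgeon(prompt: str) -> RepairPlan:
    """Toy planner: the text after the first matching cue in CUES is the target."""
    lowered = prompt.lower()
    for cue in CUES:
        idx = lowered.find(cue)
        if idx != -1:
            target = prompt[idx + len(cue):].strip(" .!?")
            if target:
                return RepairPlan(target, ["locate target", "plan edit", "emit update"])
    raise ValueError("no unlearning target found")


@dataclass
class Patient:
    """Stand-in for the edited model; a set replaces real parameter updates."""
    forgotten: Set[str] = field(default_factory=set)

    def apply(self, plan: RepairPlan) -> None:
        self.forgotten.add(plan.target.lower())

    def answer(self, prompt: str) -> str:
        if any(t in prompt.lower() for t in self.forgotten):
            return REFUSAL
        return "(normal model answer)"


def handle(prompt: str, patient: Patient) -> str:
    """Route one turn: unlearning requests go to the Surgeon, the rest to the Patient."""
    if watchdog(prompt):
        plan = surgeon(prompt)
        patient.apply(plan)
        return f"Forgot: {plan.target}"
    return patient.answer(prompt)
```

Calling `handle("Please forget my home address.", patient)` and then asking about the home address yields the refusal string, while unrelated questions still receive the normal answer. The point of the separation is that intent detection, planning, and execution can each be replaced independently.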

## Core Technology: Principles and Advantages of the STAMP Method

STAMP (Steering Through Activation Manipulation with PseudoInverse) is the core technology of RePAIR. It requires no retraining, works from a single sample, and is efficient. It rests on the observation that model knowledge is encoded in MLP activation patterns: by steering those activations toward a rejection subspace via pseudoinverse updates, the model is made to refuse relevant inputs. A low-rank variant reduces the computational cost, so the operation completes in milliseconds and can run on-device.
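To make the idea of a pseudoinverse update concrete, the sketch below applies the textbook single-sample edit to one linear layer: W' = W + (r - W k) k⁺, where k⁺ = kᵀ / (kᵀ k) is the Moore-Penrose pseudoinverse of the column vector k. It is not taken from the RePAIR paper and may differ from STAMP's actual update. Here `k` is assumed to be the layer input produced by the prompt to forget, `r` is a hypothetical target vector standing in for a point in the rejection subspace, and the function names are illustrative.

```python
from typing import List

Vector = List[float]
Matrix = List[Vector]


def matvec(W: Matrix, x: Vector) -> Vector:
    """Multiply matrix W (list of rows) by vector x."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]


def steer_with_pseudoinverse(W: Matrix, k: Vector, r: Vector) -> Matrix:
    """Return W' = W + (r - W k) k^+, with k^+ = k^T / (k^T k).

    W' maps k exactly to r and leaves every input orthogonal to k unchanged,
    because the correction is a rank-one matrix whose rows are multiples of k.
    """
    kk = sum(x * x for x in k)
    if kk == 0.0:
        raise ValueError("k must be non-zero")
    # How far the current output for k is from the desired output r.
    residual = [ri - yi for ri, yi in zip(r, matvec(W, k))]
    return [[w + res * x / kk for w, x in zip(row, k)]
            for row, res in zip(W, residual)]


# Usage with made-up numbers (assumptions for illustration only).
W = [[1.0, 2.0], [3.0, 4.0]]
k = [1.0, 0.0]   # activation for the input to forget
r = [0.0, 5.0]   # hypothetical "rejection" target
W_new = steer_with_pseudoinverse(W, k, r)
print(matvec(W_new, k))           # [0.0, 5.0]
print(matvec(W_new, [0.0, 1.0]))  # [2.0, 4.0], same as before the edit
```

Because the correction has rank one and is computed in closed form from a single sample, no gradient steps are needed, which is consistent with the retraining-free, single-sample properties described above.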

## Experimental Validation: Results and Baseline Comparison

RePAIR was tested in three scenarios:

1. Harmful knowledge suppression: the forgetting score approaches 0 while 84.47% of task performance is retained.
2. Misinformation correction: the F-RL metric is 0.00, indicating that the misinformation is completely forgotten.
3. Personal data erasure: the R-RL metric is 0.88, indicating that the target data is erased while unrelated knowledge is preserved.

Compared with six baselines, RePAIR is reported to perform best on unlearning completeness, model utility, efficiency, and user control.

## Technical Highlights and Application Scenarios

Technical highlights:

1. User autonomy without relying on providers.
2. No retraining, with millisecond-level unlearning.
3. On-device execution for privacy protection.
4. Extensible to multimodal models.

Application scenarios:

- Personal privacy protection (GDPR compliance)
- Enterprise data security
- Real-time fact-checking
- Safety compliance

## Limitations and Future Research Directions

Limitations:

- The theory of complete unlearning is not fully resolved, and forgotten knowledge may be recovered indirectly.
- Side effects are difficult to control (over-unlearning or under-unlearning).
- The approach is exposed to adversarial attacks.
- Interpretability needs improvement.

Future directions:

- Multimodal unlearning
- Progressive unlearning
- Reversible unlearning
- Federated unlearning