Zing Forum


Spore Attack: A New Threat of Efficient Privacy Extraction Targeting LLM Intelligent Agent Memory

The research team proposes the Spore attack method, which can extract privacy information from the memory of LLM intelligent agents with a single query, bypass existing defense mechanisms, and pose a new security threat to users of personal AI assistants.

Tags: privacy attack, LLM agent, memory security, black-box attack, privacy extraction, safety alignment
Published 2026-04-26 21:54 · Recent activity 2026-04-28 10:02 · Estimated read 6 min

Section 01

Introduction: The Spore Attack

The research team proposes Spore, an attack that extracts private information from an LLM agent's memory with a single query, bypasses existing defense mechanisms, and poses a new security threat to users of personal AI assistants. The attack addresses a gap in existing research on contextual privacy risks at inference time (in particular, user interaction data stored in agent memory) and avoids the limitations of prior attacks, such as high query costs and white-box assumptions.


Section 02

Background: Privacy Concerns of LLM Intelligent Agents and Limitations of Existing Attacks

With the popularity of personal AI assistants such as OpenClaw, LLM agents store sensitive information, including user preferences, health records, and financial data, to provide personalized services, but this memory capability creates privacy risks. Existing privacy attack research focuses mostly on training data leakage and pays little attention to inference-time privacy risks in agent memory. Traditional attacks suffer from high query costs, reliance on white-box access, and the need for attack-specific training, which makes them impractical against real systems.


Section 03

Methodology: Design and Core Features of the Spore Attack

The Spore attack is a training-free privacy extraction method targeting LLM agent memory systems. Its core innovation is a hybrid probing strategy that supports two modes:

  • Black-box mode: observes only the model's final output and recovers private data by extracting a candidate information set with a single query, minimizing the attack's visibility;
  • Gray-box mode: additionally uses the token probability distribution of the model's output for faster, more accurate extraction.

An information-theoretic analysis shows that Spore leaks a large amount of information per query and remains robust across model sizes and architectures.
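The paper's probing prompts are not reproduced here, but the gray-box idea, ranking candidate secrets by the probability the target model assigns to them, can be sketched as follows. This is a minimal illustration, not the authors' implementation: `model_logprob` is a toy stand-in for a provider's log-probability endpoint, and the candidate list is invented.

```python
import math

def model_logprob(prompt: str, continuation: str) -> float:
    """Toy stand-in for a gray-box API returning the total log-probability
    the target model assigns to `continuation` after `prompt`.
    A real attack would query the model's logprobs interface here."""
    memory = "blood type AB-"  # pretend this sits in the agent's memory
    score = 0.0
    for a, b in zip(continuation, memory):
        score += math.log(0.9) if a == b else math.log(0.1)
    score += abs(len(continuation) - len(memory)) * math.log(0.1)
    return score

def gray_box_extract(prompt: str, candidates: list[str]) -> str:
    """Score each candidate secret under the model's token probabilities
    and return the most likely one -- a single scoring pass, no retraining."""
    return max(candidates, key=lambda c: model_logprob(prompt, c))

candidates = ["blood type AB-", "blood type O+", "allergic to nuts"]
secret = gray_box_extract("The user's health record says: ", candidates)
print(secret)  # → "blood type AB-"
```

The black-box variant would replace the log-probability scoring with a single crafted query whose textual output narrows the candidate set.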

Section 04

Experimental Validation: Effectiveness and Defense Bypassing Capability of the Spore Attack

Experiments on mainstream LLMs such as GPT-4, Claude, and Gemini show that Spore's success rate consistently surpasses existing SOTA methods at extremely low query cost (a single query in black-box mode). The attack is stable across models and unaffected by the target model's parameter size. In addition, Spore bypasses traditional anomaly detection systems, safety alignment mechanisms, and various defenses (input filtering, output monitoring, and adversarial training), revealing serious deficiencies in current LLM agent privacy protection.


Section 05

Security Implications and Defense Recommendations

The discovery of the Spore attack has important implications for LLM agent system design: it is necessary to rethink memory management (such as access control and forgetting mechanisms for sensitive information), define privacy boundaries, and upgrade defense mechanisms. Specific defense recommendations include:

  • Principle of minimal memory: Only retain task-essential information and promptly clear temporary sensitive data;
  • Introduce differential privacy technology;
  • Strengthen memory access control;
  • Establish continuous monitoring and auditing mechanisms.

Section 06

Ethical Considerations and Future Research Directions

The research team follows the principle of responsible disclosure, releasing defense measures alongside the attack method. Future research directions include: developing dedicated defenses against inference-time privacy leakage, establishing privacy-security evaluation benchmarks for LLM agents, exploring strategies that balance privacy protection with model utility, and studying how distributed architectures such as federated learning affect privacy security.