# ARGUS: A Defense Mechanism for Context-Aware Prompt Injection Attacks on LLM Agents

> The study proposes the AgentLure benchmark and ARGUS defense system, addressing the limitation of existing defenses that ignore context-dependent tasks. By constructing an influence provenance graph to track the propagation of untrusted context, it reduces the attack success rate to 3.8% while maintaining 87.5% task utility, significantly outperforming existing defense methods.

- Board: [Openclaw Llm](https://www.zingnex.cn/en/forum/board/openclaw-llm)
- Published: 2026-05-05T05:37:00.000Z
- Last activity: 2026-05-06T02:36:17.476Z
- Popularity: 121.0
- Keywords: LLM agents, prompt injection attacks, security protection, provenance tracking, context awareness, agent security, decision auditing, adversarial robustness
- Page link: https://www.zingnex.cn/en/forum/thread/argus-llm
- Canonical: https://www.zingnex.cn/forum/thread/argus-llm

---

## ARGUS: A Guide to the Defense Mechanism Against Context-Aware Prompt Injection on LLM Agents

This paper proposes the AgentLure benchmark and the ARGUS defense system to address a gap in existing defenses, which ignore context-dependent tasks. ARGUS builds an influence provenance graph that tracks how untrusted context propagates through the agent's reasoning. It reduces the attack success rate to 3.8% while maintaining 87.5% task utility, significantly outperforming existing defense methods and offering a new approach to securing LLM agents.

## Security Challenges of LLM Agents and Shortcomings of Existing Defenses

As LLM agents gain capabilities, they become exposed to prompt injection attacks, in which an attacker embeds malicious instructions to induce unintended operations. Most existing studies assume that tasks are context-independent, an assumption that does not hold in real-world scenarios, where tasks have dynamic dependencies, attackers adapt, and execution spans multiple steps. Existing defenses (input filtering, prompt separation, instruction hierarchy, output monitoring) fall short because they are stateless and do not track how information propagates.

## AgentLure: A Benchmark for Evaluating Context-Aware Attacks

AgentLure is presented as the first evaluation framework for context-dependent tasks. It covers four major domains (personal assistant, code assistant, data analysis, web browsing) and eight attack vectors (direct, indirect, multi-step, disguised, obfuscated, context hijacking, tool return, observation injection). Its attacks are context-aware: they build on the agent's existing task, are disguised as legitimate content, and exploit links in the agent's reasoning chain.
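To make the taxonomy concrete, here is a minimal Python sketch of how the four domains and eight attack vectors could be enumerated. The class names `Domain`, `AttackVector`, and `Scenario` and the helper `all_scenarios` are hypothetical and are not taken from AgentLure itself. The source also does not say whether every domain is paired with every vector, so the full cross product below is an assumption, not the benchmark's actual test matrix.

```python
from dataclasses import dataclass
from enum import Enum
from itertools import product
from typing import List


class Domain(Enum):
    # The four domains named in the summary.
    PERSONAL_ASSISTANT = "personal assistant"
    CODE_ASSISTANT = "code assistant"
    DATA_ANALYSIS = "data analysis"
    WEB_BROWSING = "web browsing"


class AttackVector(Enum):
    # The eight attack vectors named in the summary.
    DIRECT = "direct"
    INDIRECT = "indirect"
    MULTI_STEP = "multi-step"
    DISGUISED = "disguised"
    OBFUSCATED = "obfuscated"
    CONTEXT_HIJACKING = "context hijacking"
    TOOL_RETURN = "tool return"
    OBSERVATION_INJECTION = "observation injection"


@dataclass(frozen=True)
class Scenario:
    """One (domain, attack vector) cell; a hypothetical unit of evaluation."""
    domain: Domain
    vector: AttackVector


def all_scenarios() -> List[Scenario]:
    """Enumerate every domain/vector pair (assumed full cross product)."""
    return [Scenario(d, v) for d, v in product(Domain, AttackVector)]


if __name__ == "__main__":
    scenarios = all_scenarios()
    print(len(scenarios), "domain/vector combinations")
```

Under that assumption the grid has 4 × 8 = 32 cells; the number of test cases per cell is not given in the source.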

## ARGUS Defense Mechanism: Provenance-Aware Decision Auditing

The core of ARGUS is the influence provenance graph, which records information sources, trust levels, propagation paths, and decision dependencies. The defense runs in three stages:

1. Context tagging and tracking: provenance metadata is attached to each piece of context.
2. Decision impact analysis: ARGUS identifies the information a decision depends on, constructs the propagation paths, and calculates the weighted impact.
3. Evidence verification and arbitration: the action is executed if it rests on trusted evidence; otherwise protection is triggered.

Technical highlights include lightweight integration, an adaptive trust model, and adversarial robustness.
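The three stages can be illustrated with a small Python sketch. This is not the authors' implementation: the class `ProvenanceGraph`, its methods, the weighted-average scoring rule, and the `0.5` threshold are all assumptions chosen for illustration, since the source does not specify how ARGUS computes weighted impact or arbitrates. The sketch also assumes the graph is acyclic.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class Node:
    source: str                      # where the content came from
    trusted: Optional[bool] = None   # set for source nodes only
    # parent node id -> influence weight (hypothetical, must be > 0)
    parents: Dict[str, float] = field(default_factory=dict)


class ProvenanceGraph:
    """Toy influence provenance graph (assumed acyclic)."""

    def __init__(self) -> None:
        self.nodes: Dict[str, Node] = {}

    def add_source(self, node_id: str, source: str, trusted: bool) -> None:
        # Stage 1: tag incoming context with provenance metadata.
        self.nodes[node_id] = Node(source=source, trusted=trusted)

    def add_derived(self, node_id: str, parents: Dict[str, float]) -> None:
        # Record which earlier nodes a derived item or decision depends on.
        for parent_id, weight in parents.items():
            if parent_id not in self.nodes:
                raise KeyError(f"unknown parent: {parent_id}")
            if weight <= 0:
                raise ValueError("influence weights must be positive")
        self.nodes[node_id] = Node(source="agent", parents=dict(parents))

    def untrusted_influence(self, node_id: str) -> float:
        # Stage 2: weighted share of influence traced back to untrusted sources.
        node = self.nodes[node_id]
        if not node.parents:
            return 0.0 if node.trusted else 1.0
        total = sum(node.parents.values())
        weighted = sum(w * self.untrusted_influence(p)
                       for p, w in node.parents.items())
        return weighted / total

    def arbitrate(self, node_id: str, threshold: float = 0.5) -> str:
        # Stage 3: execute on mostly trusted evidence, otherwise protect.
        return "execute" if self.untrusted_influence(node_id) <= threshold else "block"


if __name__ == "__main__":
    g = ProvenanceGraph()
    g.add_source("user_request", source="user", trusted=True)
    g.add_source("web_page", source="web", trusted=False)
    g.add_derived("page_summary", {"web_page": 1.0})
    g.add_derived("reply_to_user", {"user_request": 0.9, "page_summary": 0.1})
    g.add_derived("send_email", {"user_request": 0.3, "page_summary": 0.7})
    for decision in ("reply_to_user", "send_email"):
        print(decision, g.arbitrate(decision))
```

In this toy run the reply depends mostly on the user's request and is allowed, while the email action is driven mainly by web content and is blocked. A real system would need to estimate the influence weights, which this sketch simply takes as given.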

## Experimental Evaluation: Defense Effectiveness and Robustness of ARGUS

On AgentLure, ARGUS reduces the attack success rate to 3.8% (baseline: 52.3%) while maintaining 87.5% task utility, significantly outperforming existing defenses. Under adaptive white-box attacks, the attack success rate remains below 8.2%. Ablation experiments indicate that the synergy between the provenance and verification components is critical.
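The headline numbers can be restated as an absolute and a relative reduction. The short Python calculation below uses only the two attack success rates quoted above and assumes the 52.3% baseline refers to the same benchmark measured without ARGUS; the helper name `relative_reduction` is illustrative.

```python
def relative_reduction(baseline_pct: float, defended_pct: float) -> float:
    """Relative drop in attack success rate, as a percentage of the baseline."""
    return (baseline_pct - defended_pct) / baseline_pct * 100.0


BASELINE_ASR = 52.3  # attack success rate quoted as the baseline (%)
ARGUS_ASR = 3.8      # attack success rate quoted for ARGUS (%)

absolute_drop = BASELINE_ASR - ARGUS_ASR
relative_drop = relative_reduction(BASELINE_ASR, ARGUS_ASR)

print(f"{absolute_drop:.1f} percentage points, {relative_drop:.1f}% relative")
# prints "48.5 percentage points, 92.7% relative"
```

In other words, under that assumption roughly 93 of every 100 attacks that succeeded at baseline are stopped.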

## Conclusions and Future Research Directions

ARGUS balances security and usability, providing a feasible path toward trustworthy agent systems. Its limitations include computational overhead, the complexity of long reasoning chains, and its ability to respond to new attacks. Future directions include more efficient provenance algorithms, machine learning for identifying suspicious patterns, multi-agent security mechanisms, and standardized evaluation frameworks.