Zing Forum

AgentGuard: Building a Programmable Security Firewall for AI Agent Wallets

An in-depth analysis of the AgentGuard project, exploring how to establish an on-chain security layer for autonomous AI wallets via Somnia Agents' consensus verification mechanism, enabling pre-action review and risk control of fund operations.

Tags: AI agents, blockchain security, smart contracts, Somnia, LLM, consensus verification, crypto wallets, DeFi, risk control, Web3
Published 2026-06-10 19:43 | Recent activity 2026-06-10 19:51 | Estimated read 7 min

Section 01

AgentGuard: A Programmable Security Firewall for AI Agent Wallets

The AgentGuard project aims to build an on-chain programmable security layer for autonomous AI wallets. It achieves pre-action review and risk control of fund operations through Somnia Agents' consensus verification mechanism, addressing the fund security challenges posed by AI agent autonomy. Its core design philosophy is 'Agent Proposes, Somnia Agents Review, Vault Enforces', combining the 'Human-in-the-Loop' and 'Code is Law' paradigms to balance AI autonomy and safety guardrails.

Section 02

Background: New Challenges in AI Agent Fund Security

As LLM capabilities advance, AI agents are increasingly granted autonomous permission to manage crypto wallets and execute transactions. This autonomy brings security risks: prompt injection attacks, model hallucinations, tool call errors, or private key leaks can all lead to loss of funds. Security models that rely on the AI's own judgment are no longer sufficient, so AgentGuard inserts a review checkpoint between AI intent and fund transfer.

Section 03

Core Architecture: Three-Layer Protection Security Model

AgentGuard adopts a three-layer protection design:

  1. AI Agent Layer: Agents do not directly execute fund operations; they only submit action proposals (including target address, amount, reason, etc.) via proposeAction, separating decision-making from execution.
  2. Somnia Agents Review Layer: A core innovation of the design. Consensus verification agents (using LLMs such as Qwen3-30B) analyze each proposal: the Parse-Website Agent first extracts security signals from the evidence URLs, and the LLM Inference Agent then issues an APPROVE, REVIEW, or BLOCK decision, which takes effect only when a strict majority agrees.
  3. Vault Contract Layer: Enforces hard security policies (maximum spend limit, proportional limit, whitelist, review time lock). Even an action approved by the LLMs is rejected if it violates a policy, which makes the mechanism fail-closed.
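
The three layers above can be sketched in a few lines of Python. This is a minimal off-chain model, not the project's contract code: the names `Proposal`, `VaultPolicy`, `consensus_decision`, and `review_and_enforce`, the basis-point form of the proportional limit, and the returned outcome strings are all assumptions made for illustration. Only the APPROVE/REVIEW/BLOCK decisions and the strict-majority rule come from the description above.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Proposal:
    target: str   # destination address
    amount: int   # amount in the smallest token unit
    reason: str   # free-text justification supplied by the agent


@dataclass(frozen=True)
class VaultPolicy:
    max_spend: int          # hard cap per action (hypothetical field)
    max_fraction_bps: int   # cap as basis points of the vault balance (hypothetical)
    whitelist: frozenset    # allowed target addresses


def consensus_decision(votes):
    """Return the decision backed by a strict majority of votes, else None."""
    if not votes:
        return None
    decision, count = Counter(votes).most_common(1)[0]
    return decision if count * 2 > len(votes) else None


def vault_allows(p, policy, vault_balance):
    """Hard policy checks that apply regardless of what the reviewers said."""
    return (
        0 < p.amount <= policy.max_spend
        and p.amount * 10_000 <= vault_balance * policy.max_fraction_bps
        and p.target in policy.whitelist
    )


def review_and_enforce(p, votes, policy, vault_balance):
    """Fail closed: funds move only on APPROVE consensus AND a policy pass."""
    decision = consensus_decision(votes)
    if decision is None:
        return "NO_CONSENSUS"
    if decision == "BLOCK":
        return "BLOCKED"
    if decision == "REVIEW":
        return "HELD_FOR_REVIEW"
    if decision == "APPROVE" and vault_allows(p, policy, vault_balance):
        return "EXECUTED"
    return "REJECTED_BY_POLICY"
```

In this model an approved proposal to a non-whitelisted address still ends in `REJECTED_BY_POLICY`, which is the fail-closed behavior the vault layer is meant to guarantee.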

Section 04

Technical Implementation: Key Details of Security Design

AgentGuard uses the Foundry framework for contract development and Next.js for frontend construction:

  • Review Fees: Deducted from the proposer's account balance instead of a shared fund pool, ensuring the solvency invariant (total deposits ≥ sum of user balances).
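  • Callback Security: The handleResponse function checks that the caller is the Somnia platform, requires a strict majority of Success votes, and fails safe by returning the action to the Proposed state if the review fails or times out. These checks are intended to guard against spoofed or replayed callbacks, although Section 06 notes that stale and duplicate callbacks are not yet fully handled.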
  • Frontend Transparency: The Beta frontend (https://agentguard-beta.vercel.app) displays on-chain verification status in real time; users can view review decisions, transaction hashes, etc., to enhance trust.
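
The fee accounting and callback checks described above can be illustrated with a small Python model. It is a sketch under stated assumptions rather than the contract itself: the `Vault` class, the `PLATFORM` identifier, the flat `REVIEW_FEE`, and the state names `UnderReview` and `Approved` are hypothetical. The `Proposed` state, the sender check, the strict majority of `Success` votes, and the solvency invariant are taken from the text.

```python
from dataclasses import dataclass, field

PLATFORM = "somnia-platform"  # hypothetical stand-in for the platform address
REVIEW_FEE = 3                # hypothetical flat review fee


class Unauthorized(Exception):
    """Raised when the callback comes from anyone but the platform."""


@dataclass
class Vault:
    balances: dict = field(default_factory=dict)  # user -> credited balance
    total_deposits: int = 0                       # funds actually held
    status: dict = field(default_factory=dict)    # action id -> state

    def deposit(self, user, amount):
        self.balances[user] = self.balances.get(user, 0) + amount
        self.total_deposits += amount

    def is_solvent(self):
        # Invariant from the text: total deposits >= sum of user balances.
        return self.total_deposits >= sum(self.balances.values())

    def request_review(self, user, action_id):
        if self.balances.get(user, 0) < REVIEW_FEE:
            raise ValueError("balance too low to pay the review fee")
        # The fee leaves the vault and the proposer's balance together,
        # so both sides of the invariant drop by the same amount.
        self.balances[user] -= REVIEW_FEE
        self.total_deposits -= REVIEW_FEE
        self.status[action_id] = "UnderReview"
        assert self.is_solvent()

    def handle_response(self, sender, action_id, votes):
        if sender != PLATFORM:
            raise Unauthorized(sender)
        successes = sum(1 for v in votes if v == "Success")
        # Strict majority required; an empty list (timeout) fails safe.
        if successes * 2 > len(votes):
            self.status[action_id] = "Approved"
        else:
            self.status[action_id] = "Proposed"
        return self.status[action_id]
```

If the fee were instead taken from a shared pool, `total_deposits` would fall while the sum of user balances stayed the same, and the invariant could eventually break. Charging the proposer keeps the two quantities moving together.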

Section 05

Application Scenarios: Scope of AgentGuard

AgentGuard applies to multiple scenarios:

  • Automated Trading Bots: Set daily expenditure limits and DEX whitelists to prevent losses from strategy errors or manipulation.
  • DAO Fund Management: AI agents automatically execute regular grants; large/non-whitelisted transfers require community voting.
  • Personal AI Assistants: Manage small payments (subscriptions, tips); large transfers need user confirmation.
  • Cross-Chain DeFi Interactions: The webpage parsing function identifies phishing and malicious contract risks and blocks dangerous operations.
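
As an illustration of the trading bot scenario, the following Python sketch combines a daily spending limit with a DEX whitelist. The `DailyLimiter` class, its fields, and the use of UTC day buckets are assumptions for the example; the article does not describe how AgentGuard tracks daily limits.

```python
from dataclasses import dataclass, field

SECONDS_PER_DAY = 86_400


@dataclass
class DailyLimiter:
    daily_limit: int        # maximum total spend per day (hypothetical field)
    whitelist: frozenset    # DEX addresses the bot may pay
    spent_by_day: dict = field(default_factory=dict)  # day index -> amount spent

    def try_spend(self, target, amount, now):
        """Record the spend and return True only if every check passes."""
        if amount <= 0 or target not in self.whitelist:
            return False
        day = now // SECONDS_PER_DAY  # bucket by day since the Unix epoch
        spent = self.spent_by_day.get(day, 0)
        if spent + amount > self.daily_limit:
            return False
        self.spent_by_day[day] = spent + amount
        return True
```

A rejected spend leaves the day's total unchanged, so a later, smaller spend on the same day can still go through if it fits under the limit.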

Section 06

Limitations and Future Directions

Limitations of the current version:

  1. Callback Idempotency: The handleResponse function does not check if the action is still in the pending stage; future improvements are needed to prevent stale/duplicate callbacks.
  2. Fixed Review Budget: INFERENCE_BUDGET and PARSE_BUDGET are fixed fees, and remaining amounts are not refunded to users; the fee model can be optimized in the future.
  3. LLM Decision Uncertainty: Whitelist/block keywords are only suggestions; LLMs may make unexpected decisions (mitigable via time locks and manual reviews).
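
The first limitation suggests one possible direction for a fix: accept a callback only while the action is still awaiting review. The Python sketch below shows that guard in isolation. It is a hypothetical improvement, not existing AgentGuard code, and the `Status` values other than `Proposed`, the `StaleCallback` exception, and the `apply_callback` function are invented for the example.

```python
from enum import Enum


class Status(Enum):
    PROPOSED = "Proposed"
    PENDING = "Pending"     # hypothetical name for "review in flight"
    APPROVED = "Approved"   # hypothetical name for a passed review


class StaleCallback(Exception):
    """Raised when a callback arrives for an action that is not pending."""


def apply_callback(statuses, action_id, approved):
    """Apply a review result once; reject stale or duplicate callbacks."""
    if statuses.get(action_id) is not Status.PENDING:
        raise StaleCallback(action_id)
    # Leaving PENDING here means a second callback for the same action
    # hits the guard above instead of changing state again.
    statuses[action_id] = Status.APPROVED if approved else Status.PROPOSED
    return statuses[action_id]
```

Because the first callback moves the action out of the pending state, a replayed or late callback fails the guard and cannot overwrite the earlier result.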

Section 07

Conclusion: A New Paradigm for AI Security

AgentGuard represents a new paradigm for AI security: instead of pursuing absolute AI safety, it establishes a programmable, verifiable, and revocable security layer between AI and critical resources (funds). By combining 'Human-in-the-Loop' and 'Code is Law', it retains AI autonomy while providing the necessary safety guardrails. As AI agent permissions expand, such security infrastructure will become increasingly important. Its clear code structure and thorough security considerations make it a useful learning example at the intersection of AI and blockchain.