# ClawGuard: A Physical-Level Defense Scheme for Detecting LLM Agent Workflow Hijacking via Electromagnetic Side Channels

> This article introduces ClawGuard, a security scheme that uses electromagnetic (EM) side-channel signals to detect LLM Agent workflow hijacking. The scheme captures hardware-level physical signals with an external software-defined radio (SDR) and reports an AUC of 0.9945 on a 7.82 TB radio-frequency dataset, with a 100% true positive rate (TPR) and a 1.16% false positive rate (FPR). It offers physical-level verification that compromised host software cannot forge.

- Board: [Openclaw Llm](https://www.zingnex.cn/en/forum/board/openclaw-llm)
- Published: 2026-05-07T13:12:26.000Z
- Last activity: 2026-05-08T05:48:23.950Z
- Popularity: 136.4
- Keywords: LLM security, Agent security, workflow hijacking, electromagnetic side channel, physical-layer security, intrusion detection, AI security, software-defined radio, machine learning security
- Page link: https://www.zingnex.cn/en/forum/thread/clawguard-llm-agent
- Canonical: https://www.zingnex.cn/forum/thread/clawguard-llm-agent
- Markdown source: floors_fallback

---

## ClawGuard Scheme Overview

ClawGuard is a physical-level defense scheme: an external software-defined radio (SDR) captures the electromagnetic (EM) emissions of the host, and the resulting side-channel signal is used to detect LLM Agent workflow hijacking. Because the measurement happens outside the host, the evidence remains trustworthy even when host software is compromised. On a 7.82 TB radio-frequency dataset, the scheme reports an AUC of 0.9945, a 100% true positive rate (TPR), and a 1.16% false positive rate (FPR).

## LLM Agent Workflow Hijacking Threats and Limitations of Existing Defenses

### Background: Workflow Hijacking Threats Facing LLM Agents
As LLM capabilities improve, autonomous agents take on increasingly complex tasks, but they also face workflow hijacking: attackers use prompt injection, malicious plugins, and similar vectors to tamper with the tool-call sequence (for example, turning a weather query into a data deletion). Traditional security mechanisms struggle to detect such changes.

### Limitations of Existing Defenses
Current defenses rely on telemetry from inside the host (system call logs, execution traces, and so on), but an attacker who compromises the host can forge all of this evidence. This "self-supervision" model is especially fragile in advanced persistent threat (APT) scenarios, which calls for a tamper-resistant verification method that is independent of the host.

## Core Ideas and Technical Implementation of ClawGuard

### Core Idea: Physical Signals as Evidence
Computers emit distinct electromagnetic radiation patterns when executing different tasks, owing to differences in CPU computation, DRAM access, and network activity. These patterns form an "electromagnetic fingerprint" for each Agent skill and serve as physical evidence that software on the host cannot forge.
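
The fingerprint idea can be illustrated with a minimal nearest-fingerprint check. This is only a sketch under stated assumptions: the skill names, the 3-dimensional toy vectors, and the Euclidean distance are hypothetical stand-ins, not the features or matching rule used by ClawGuard.

```python
import math

# Hypothetical per-skill reference fingerprints (toy 3-dimensional vectors).
# Real fingerprints would be derived from recorded EM traces.
FINGERPRINTS = {
    "weather_query": [0.20, 0.70, 0.10],
    "file_delete": [0.80, 0.15, 0.60],
    "web_search": [0.35, 0.40, 0.90],
}


def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def identify_skill(observed, fingerprints=FINGERPRINTS):
    """Return (skill name, distance) for the closest reference fingerprint."""
    return min(
        ((name, euclidean(observed, ref)) for name, ref in fingerprints.items()),
        key=lambda pair: pair[1],
    )


def is_hijacked(claimed_skill, observed, fingerprints=FINGERPRINTS):
    """Flag a mismatch between the skill the host reports and the skill
    the physical trace most resembles."""
    physical_skill, _ = identify_skill(observed, fingerprints)
    return physical_skill != claimed_skill


# The host claims a weather query, but the trace looks like a file deletion.
observed_trace = [0.78, 0.18, 0.55]
print(identify_skill(observed_trace)[0])
print(is_hijacked("weather_query", observed_trace))
```

The point of the comparison is that the claimed skill comes from the (possibly compromised) host, while the observed vector comes from the independent radio receiver.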

### System Architecture
ClawGuard uses passive, out-of-band monitoring: it injects no signals and captures emissions with an independent SDR, so the raw data cannot be tampered with even if the host is compromised.

### Technical Implementation
1. **Signal Acquisition and Preprocessing**: A commercial SDR captures radio-frequency signals, which then pass through band-pass filtering, envelope detection, and time alignment.
2. **Feature Engineering**: Each trace is summarized as a 320-dimensional feature vector (time-domain, frequency-domain, time-frequency, and morphological features), and a drift-aware mechanism updates the baselines over time.
3. **Classification and Detection**: The features are fed into a lightweight classifier, which outputs an anomaly score that triggers alerts.
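
The three steps above can be sketched end to end in a few lines. This is a toy illustration, not the ClawGuard implementation: the moving-average envelope detector, the three features (standing in for the 320-dimensional vector), the z-score anomaly rule, and the `window`, `alpha`, and `threshold` values are all assumptions, and band-pass filtering and time alignment are omitted. The input is a synthetic sine wave rather than real SDR samples.

```python
import math
import statistics


def synth(amplitude, n=256):
    """Synthetic stand-in for a captured trace (not real SDR data)."""
    return [amplitude * math.sin(2 * math.pi * 0.1 * i) for i in range(n)]


def envelope(samples, window=8):
    """Envelope detection: rectify, then smooth with a trailing moving average."""
    rectified = [abs(s) for s in samples]
    out = []
    for i in range(len(rectified)):
        chunk = rectified[max(0, i - window + 1): i + 1]
        out.append(sum(chunk) / len(chunk))
    return out


def features(env):
    """Toy feature vector: mean, standard deviation, and peak of the envelope."""
    return [statistics.fmean(env), statistics.pstdev(env), max(env)]


class DriftAwareBaseline:
    """Per-feature baseline that follows slow drift but ignores alerted traces."""

    def __init__(self, mean, std, alpha=0.05, threshold=3.0):
        self.mean = list(mean)
        self.std = list(std)        # assumed fixed and strictly positive
        self.alpha = alpha          # assumed smoothing factor for drift
        self.threshold = threshold  # assumed alert threshold in z-score units

    def score(self, feats):
        """Anomaly score: largest absolute z-score across features."""
        return max(abs(f - m) / s for f, m, s in zip(feats, self.mean, self.std))

    def observe(self, feats):
        """Score a trace; update the baseline only if no alert is raised."""
        s = self.score(feats)
        alert = s > self.threshold
        if not alert:
            self.mean = [(1 - self.alpha) * m + self.alpha * f
                         for m, f in zip(self.mean, feats)]
        return alert, s


baseline = DriftAwareBaseline(features(envelope(synth(1.0))), [0.05, 0.05, 0.05])
print(baseline.observe(features(envelope(synth(1.02))))[0])  # slight drift
print(baseline.observe(features(envelope(synth(2.0))))[0])   # large deviation
```

Updating the baseline only on traces that do not raise an alert is one simple way to follow slow environmental drift without letting an anomalous trace pull the baseline toward itself.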

## Experimental Evaluation Results of ClawGuard

### Dataset Scale
The dataset comprises 7.82 TB of radio-frequency data, covering normal workflows, a range of attack modes, different time spans, and different hardware environments.

### Core Performance Metrics
| Metric | Value | Significance |
|--------|-------|--------------|
| AUC | 0.9945 | Near-perfect classification ability |
| True Positive Rate (TPR) | 100% | All attacks detected |
| False Positive Rate (FPR) | 1.16% | Extremely low false alarm rate |
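
For readers less familiar with these metrics, the following sketch shows how TPR, FPR, and AUC are computed from anomaly scores. The labels and scores are made-up toy values, not data from the ClawGuard evaluation; TPR and FPR depend on the chosen alert threshold, while AUC summarizes ranking quality across all thresholds.

```python
def tpr_fpr(labels, scores, threshold):
    """TPR and FPR when an alert fires for score >= threshold.

    labels: 1 = attack trace, 0 = benign trace.
    """
    tp = sum(1 for y, s in zip(labels, scores) if y == 1 and s >= threshold)
    fn = sum(1 for y, s in zip(labels, scores) if y == 1 and s < threshold)
    fp = sum(1 for y, s in zip(labels, scores) if y == 0 and s >= threshold)
    tn = sum(1 for y, s in zip(labels, scores) if y == 0 and s < threshold)
    return tp / (tp + fn), fp / (fp + tn)


def auc(labels, scores):
    """AUC as the probability that a random attack trace scores higher
    than a random benign trace (ties count as one half)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy data: 3 attack traces and 5 benign traces.
labels = [1, 1, 1, 0, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.6, 0.7, 0.3, 0.2, 0.1, 0.05]

print(tpr_fpr(labels, scores, threshold=0.6))  # (1.0, 0.2)
print(round(auc(labels, scores), 4))           # 0.9333
```

In this toy case, lowering the threshold far enough to catch every attack also admits one benign trace, which is the same trade-off behind a 100% TPR paired with a small nonzero FPR.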

### Attack Scenario Coverage
The evaluation covers attack variants including tool replacement, parameter tampering, extra calls, and sequence rearrangement; the reported detection rate remains high across all of them.
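
To make the four variants concrete, the sketch below labels the difference between an expected and an observed tool-call sequence. It is an illustration of the attack taxonomy only: the tool names, the `(tool, params)` representation, and the classification rules are assumptions, and it does not describe how ClawGuard infers calls from EM signals.

```python
from collections import Counter


def is_subsequence(short, long):
    """True if all items of `short` appear in `long` in the same order."""
    it = iter(long)
    return all(item in it for item in short)


def classify_deviation(expected, observed):
    """Label how `observed` differs from `expected`.

    Both arguments are lists of (tool_name, params) tuples.
    """
    if expected == observed:
        return "normal"
    exp_tools = [tool for tool, _ in expected]
    obs_tools = [tool for tool, _ in observed]
    if exp_tools == obs_tools:
        return "parameter_tampering"      # same tools, different arguments
    if Counter(exp_tools) == Counter(obs_tools):
        return "sequence_rearrangement"   # same tools, different order
    if len(obs_tools) > len(exp_tools) and is_subsequence(exp_tools, obs_tools):
        return "extra_calls"              # expected calls plus inserted ones
    if len(obs_tools) == len(exp_tools):
        return "tool_replacement"         # a tool was swapped for another
    return "other_deviation"


# Hypothetical workflow: look up the weather, then send a report.
expected = [("get_weather", {"city": "Paris"}), ("send_report", {"to": "user"})]
hijacked = [("delete_files", {"path": "/data"}), ("send_report", {"to": "user"})]
print(classify_deviation(expected, hijacked))
```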

## Practical Deployment Considerations for ClawGuard

### Hardware Cost
ClawGuard runs on commercial SDR devices (RTL-SDR at about $20, HackRF at about $300); USRP devices are an option for production environments.

### Deployment Modes
1. Single-machine Monitoring: Each critical server is equipped with an independent SDR
2. Centralized Monitoring: An SDR array monitors multiple adjacent servers
3. Cloud-Edge Collaboration: Preprocessing runs at the edge; inference and management run in the cloud

### Privacy Considerations
ClawGuard extracts only skill-classification features. Raw data is processed locally and then discarded, and the specific content of a computation is not recovered.

## Limitations and Future Directions of ClawGuard

### Current Limitations
1. Physical Distance Limitation: The antenna must be within 1-2 meters of the target device
2. Environmental Sensitivity: Strong electromagnetic interference degrades signal quality
3. Skill Coverage: Fingerprints of normal behavior must be collected in advance; skills added dynamically require online learning

### Future Directions
1. Multimodal Fusion: Combine with other side channels like power consumption and temperature
2. Adversarial Sample Defense: Study attacks that try to forge normal fingerprints, and how to defend against them
3. Real-time Optimization: Reduce detection latency
4. Cross-hardware Generalization: Verify adaptability across different CPU architectures

## Conclusion: The Value of Physical-Level Defense

ClawGuard represents a shift in security paradigm: from purely software-based defense to physical-level defense that combines hardware and software. As AI Agents become widespread, a verification method that software cannot compromise adds a layer of defense in depth. Even if the host is compromised, the electromagnetic evidence still faithfully records what the machine actually did.