Zing Forum

ClawGuard: A Physical-Level Defense Scheme for Detecting LLM Agent Workflow Hijacking via Electromagnetic Side Channels

This article introduces ClawGuard, a security scheme that uses electromagnetic (EM) side-channel signals to detect LLM Agent workflow hijacking. It captures hardware-level physical signals with an external software-defined radio (SDR) and reports an AUC of 0.9945 on a 7.82 TB radio-frequency dataset, with a 100% true positive rate (TPR) and a 1.16% false positive rate (FPR). Because the evidence is collected outside the host, it offers a physical-level verification channel that host software cannot forge, even when the host itself is compromised.

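Tags: LLM Security, Agent Security, Workflow Hijacking, Electromagnetic Side Channel, Physical-Layer Security, Intrusion Detection, AI Security, Software-Defined Radio, Machine Learning Security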
Published 2026-05-07 21:12 | Recent activity 2026-05-08 13:48 | Estimated read 8 min

Section 01

ClawGuard Scheme Overview

ClawGuard is a physical-level defense scheme that detects LLM Agent workflow hijacking from electromagnetic (EM) side-channel signals. An external software-defined radio (SDR) captures hardware-level physical signals from the monitored machine. On a 7.82 TB radio-frequency dataset, the scheme reaches an AUC of 0.9945 with a 100% true positive rate (TPR) and a 1.16% false positive rate (FPR). Since the measurement happens outside the host, compromised host software cannot forge it.

Section 02

LLM Agent Workflow Hijacking Threats and Limitations of Existing Defenses

Background: Workflow Hijacking Threats Facing LLM Agents

As LLM capabilities improve, autonomous agents take on increasingly complex tasks, which exposes them to workflow hijacking: attackers tamper with the tool-call sequence (for example, turning a weather query into a data deletion) through prompt injection, malicious plugins, and similar vectors. Traditional security mechanisms struggle to detect these changes.

Limitations of Existing Defenses

Current defenses rely on telemetry gathered inside the host, such as system call logs and execution traces. Once the host is compromised, all of that evidence can be forged. This "self-supervision" model is especially fragile against advanced persistent threats (APTs), so a tamper-proof verification method that is independent of the host is needed.

Section 03

Core Ideas and Technical Implementation of ClawGuard

Core Idea: Physical Signals as Evidence

Computers emit distinct electromagnetic radiation patterns when executing different tasks, driven by differences in CPU computation, DRAM access, and network activity. These patterns form an "electromagnetic fingerprint" for each Agent skill, which ClawGuard treats as physical evidence that software on the host cannot forge.

System Architecture

ClawGuard uses passive, out-of-band monitoring: it injects no signal and captures emissions with an independent SDR, so the raw measurements cannot be tampered with even if the host is compromised.

Technical Implementation

  1. Signal Acquisition and Preprocessing: A commercial SDR captures radio-frequency signals, which then go through band-pass filtering, envelope detection, and time alignment.
  2. Feature Engineering: Each window is summarized as a 320-dimensional feature vector (time-domain, frequency-domain, time-frequency, and morphological features), and a drift-aware mechanism keeps the baselines up to date.
  3. Classification and Detection: The features are fed to a lightweight classifier, which outputs an anomaly score that triggers alerts.
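
The three steps above can be sketched in miniature as follows. This is only an illustrative toy under stated assumptions, not ClawGuard's implementation: the four features stand in for the 320-dimensional vector, the exponentially weighted baseline with a z-score threshold is one plausible reading of "drift-aware", band-pass filtering and time alignment are omitted, and every name (`envelope`, `features`, `DriftAwareBaseline`, `synth_window`) and every parameter value is hypothetical.

```python
import math
from statistics import mean, pstdev


def envelope(iq, window=8):
    """Envelope detection: magnitude of complex IQ samples, then a causal moving average."""
    mags = [abs(s) for s in iq]
    out, acc = [], 0.0
    for i, m in enumerate(mags):
        acc += m
        if i >= window:
            acc -= mags[i - window]
        out.append(acc / min(i + 1, window))
    return out


def features(env):
    """Tiny stand-in for the feature vector: mean, spread, peak, crest factor."""
    mu = mean(env)
    sd = pstdev(env)
    peak = max(env)
    rms = math.sqrt(mean([x * x for x in env]))
    return [mu, sd, peak, peak / rms if rms else 0.0]


class DriftAwareBaseline:
    """Per-feature mean/variance that slowly follows windows judged normal (assumed design)."""

    def __init__(self, normal_vectors, alpha=0.05, threshold=4.0, min_std=0.05):
        cols = list(zip(*normal_vectors))
        self.mean = [mean(c) for c in cols]
        self.var = [pstdev(c) ** 2 for c in cols]
        self.alpha, self.threshold, self.min_std = alpha, threshold, min_std

    def score(self, x):
        # Anomaly score: largest absolute z-score over all features.
        return max(
            abs(xi - m) / max(math.sqrt(v), self.min_std)
            for xi, m, v in zip(x, self.mean, self.var)
        )

    def observe(self, x):
        s = self.score(x)
        alert = s > self.threshold
        if not alert:
            # Adapt only on non-alerting windows so an attack cannot pull the baseline.
            a = self.alpha
            for i, xi in enumerate(x):
                d = xi - self.mean[i]
                self.mean[i] += a * d
                self.var[i] = (1 - a) * self.var[i] + a * d * d
        return s, alert


def synth_window(amplitude, n=256):
    """Synthetic stand-in for one captured SDR window (a clean complex tone)."""
    return [complex(amplitude * math.cos(0.3 * k), amplitude * math.sin(0.3 * k))
            for k in range(n)]


if __name__ == "__main__":
    normal = [features(envelope(synth_window(1.0 + 0.01 * (i % 5)))) for i in range(20)]
    baseline = DriftAwareBaseline(normal)
    print(baseline.observe(features(envelope(synth_window(1.03)))))  # close to baseline
    print(baseline.observe(features(envelope(synth_window(2.5)))))   # far from baseline
```

Updating the baseline only on windows that do not alert is a design choice of this sketch: slow environmental drift is absorbed, while a sudden change in the emission pattern still stands out against the stored statistics.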

Section 04

Experimental Evaluation Results of ClawGuard

Dataset Scale

The authors collected 7.82 TB of radio-frequency data covering normal workflows, various attack modes, different time spans, and multiple hardware environments.

Core Performance Metrics

| Metric | Value | Significance |
|---|---|---|
| AUC | 0.9945 | Near-perfect classification ability |
| True Positive Rate (TPR) | 100% | All attacks detected |
| False Positive Rate (FPR) | 1.16% | Extremely low false alarm rate |
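
For readers who want the definitions behind these metrics: TPR = TP / (TP + FN) and FPR = FP / (FP + TN) describe a single alert threshold, while AUC summarizes ranking quality over all thresholds. The sketch below computes them from anomaly scores using the pairwise (Mann-Whitney) form of AUC. The function names and the toy labels and scores are illustrative assumptions; the article does not give the threshold or the sample counts behind the reported figures.

```python
def confusion_rates(labels, scores, threshold):
    """Return (TPR, FPR) when every score >= threshold raises an alert.

    labels: 1 for a hijacked workflow, 0 for a normal one.
    """
    tp = sum(1 for y, s in zip(labels, scores) if y == 1 and s >= threshold)
    fn = sum(1 for y, s in zip(labels, scores) if y == 1 and s < threshold)
    fp = sum(1 for y, s in zip(labels, scores) if y == 0 and s >= threshold)
    tn = sum(1 for y, s in zip(labels, scores) if y == 0 and s < threshold)
    return tp / (tp + fn), fp / (fp + tn)


def auc(labels, scores):
    """AUC as the probability that a random attack scores above a random normal run."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    labels = [1, 1, 0, 0, 0]
    scores = [0.9, 0.4, 0.5, 0.2, 0.1]
    print(confusion_rates(labels, scores, threshold=0.3))
    print(auc(labels, scores))
```

In this toy data a threshold of 0.3 catches both attacks but also flags one normal run, which shows how a 100% TPR can coexist with a nonzero FPR.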

Attack Scenario Coverage

The evaluation covers attack variants including tool replacement, parameter tampering, extra calls, and sequence rearrangement, with high detection rates reported across all of them.
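
To make the four variants concrete, the sketch below compares an expected workflow with an observed one and names the kind of deviation. It assumes, hypothetically, that an upstream classifier has already turned the EM trace into a sequence of `(skill, argument_fingerprint)` pairs; that representation, the function names, and the skill names are all illustrative and do not come from the article.

```python
def is_subsequence(short, long):
    """True if every item of `short` appears in `long` in the same relative order."""
    it = iter(long)
    return all(item in it for item in short)


def classify_deviation(expected, observed):
    """Label how an observed workflow differs from the expected one.

    Each workflow is a list of (skill, argument_fingerprint) pairs.
    """
    if expected == observed:
        return "normal"
    exp_skills = [skill for skill, _ in expected]
    obs_skills = [skill for skill, _ in observed]
    if exp_skills == obs_skills:
        return "parameter_tampering"      # same tools in the same order, different arguments
    if sorted(exp_skills) == sorted(obs_skills):
        return "sequence_rearrangement"   # same tools, different order
    if len(obs_skills) > len(exp_skills) and is_subsequence(exp_skills, obs_skills):
        return "extra_calls"              # expected flow intact, with additional calls
    if len(obs_skills) == len(exp_skills):
        return "tool_replacement"         # at least one step swapped for another tool
    return "other_deviation"


if __name__ == "__main__":
    expected = [("get_weather", "a1"), ("format_report", "b2")]
    hijacked = [("delete_files", "a1"), ("format_report", "b2")]
    print(classify_deviation(expected, hijacked))
```

The example mirrors the weather-query-to-data-deletion case mentioned earlier: the sequence length is unchanged, but one step has been swapped for a different tool.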

Section 05

Practical Deployment Considerations for ClawGuard

Hardware Cost

ClawGuard runs on commercial SDR devices (an RTL-SDR costs about $20, a HackRF about $300); USRP devices are an option for production environments.

Deployment Modes

  1. Single-machine Monitoring: Each critical server equipped with an independent SDR
  2. Centralized Monitoring: SDR array monitors multiple adjacent servers
  3. Cloud-Edge Collaboration: Edge preprocessing, cloud inference and management

Privacy Considerations

ClawGuard extracts only skill-classification features. Raw data is processed locally and then discarded, and the specific computation content is not recovered.

Section 06

Limitations and Future Directions of ClawGuard

Current Limitations

  1. Physical Distance Limitation: Antenna needs to be within 1-2 meters of the target device
  2. Environmental Sensitivity: Strong electromagnetic interference affects signal quality
  3. Skill Coverage: Requires pre-collection of normal fingerprints; dynamic new skills need online learning

Future Directions

  1. Multimodal Fusion: Combine with other side channels like power consumption and temperature
  2. Adversarial Sample Defense: Study, and defend against, attacks that try to forge normal fingerprints
  3. Real-time Optimization: Reduce detection latency
  4. Cross-hardware Generalization: Verify adaptability across different CPU architectures

Section 07

Conclusion: The Value of Physical-Level Defense

ClawGuard represents a shift in security paradigm, from purely software-based defense to a physical-level defense that combines hardware and software. As AI Agents become widespread, a verification method that software cannot compromise adds a defense-in-depth layer: even if the host is compromised, the electromagnetic evidence still records what the machine actually did.