ClawGuard: A Physical-Level Defense Scheme for Detecting LLM Agent Workflow Hijacking via Electromagnetic Side Channels

This article introduces ClawGuard, a security scheme that uses electromagnetic (EM) side-channel signals to detect LLM Agent workflow hijacking. The scheme captures hardware-level physical signals with an external software-defined radio (SDR) and reports an AUC of 0.9945 on a 7.82 TB radio frequency dataset, with a 100% true positive rate (TPR) and a 1.16% false positive rate (FPR). It provides an unforgeable, physical-level verification method for scenarios in which the host software is compromised.

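Tags: LLM security, Agent security, workflow hijacking, electromagnetic side channel, physical-layer security, intrusion detection, AI security, software-defined radio, machine learning security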

Published 2026-05-07 21:12. Recent activity 2026-05-08 13:48.

Section 01

ClawGuard Scheme Overview

ClawGuard is a physical-level defense scheme that uses electromagnetic (EM) side-channel signals to detect LLM Agent workflow hijacking. It captures hardware-level physical signals with an external software-defined radio (SDR) and, on a 7.82 TB radio frequency dataset, reaches an AUC of 0.9945 with a 100% true positive rate (TPR) and a 1.16% false positive rate (FPR). Because the measurement is taken outside the host, it provides an unforgeable, physical-level verification method for scenarios in which the host software is compromised.

Section 02

LLM Agent Workflow Hijacking Threats and Limitations of Existing Defenses

Background: Workflow Hijacking Threats Facing LLM Agents

As LLM capabilities improve, autonomous agents take on increasingly complex tasks, which exposes them to workflow hijacking: attackers tamper with an agent's tool call sequence (for example, turning a weather query into a data deletion) through prompt injection, malicious plugins, and similar vectors. Traditional security mechanisms struggle to detect this kind of tampering.

Limitations of Existing Defenses

Current protection relies on host-internal telemetry (system call logs, execution traces, etc.), but all of that evidence can be forged once the host is compromised. This "self-supervision" model breaks down in advanced persistent threat (APT) scenarios, which call for a tamper-proof verification method that is independent of the host.

Section 03

Core Ideas and Technical Implementation of ClawGuard

Core Idea: Physical Signals as Evidence

Computers emit distinct electromagnetic radiation patterns when executing different tasks, because CPU computation, DRAM access, and network activity differ from task to task. These patterns form "electromagnetic fingerprints" of Agent skills, which serve as unforgeable physical evidence.

System Architecture

ClawGuard uses passive, out-of-band monitoring: it injects no signals and captures emissions through an independent SDR, so the original data cannot be tampered with even if the host is compromised.

Technical Implementation

Signal Acquisition and Preprocessing: A commercial SDR captures radio frequency signals, which then pass through band-pass filtering, envelope detection, and time alignment.
Feature Engineering: The signal is summarized as a 320-dimensional feature vector (time-domain, frequency-domain, time-frequency, and morphological features), and a drift-aware mechanism updates the baselines.
Classification and Detection: The features are fed into a lightweight classifier, which outputs an anomaly score that triggers alerts.
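
The three stages above can be illustrated with a minimal sketch. It is not ClawGuard's implementation: the synthetic sine-wave traces, the three toy features (standing in for the 320-dimensional vector), the exponential-moving-average baseline, the z-score anomaly score, and the threshold of 10 are all assumptions chosen for illustration. Band-pass filtering and time alignment are omitted, and the traces are assumed to be already filtered and aligned.

```python
import math
import random
from statistics import fmean, pstdev, pvariance


def make_trace(amplitude, n=256, period=16, noise=0.05):
    """Synthetic stand-in for a filtered, time-aligned RF capture (assumption)."""
    return [
        amplitude * math.sin(2 * math.pi * i / period) + random.gauss(0.0, noise)
        for i in range(n)
    ]


def envelope(samples, window=8):
    """Crude envelope detector: rectify, then smooth with a moving average."""
    rectified = [abs(x) for x in samples]
    out, running = [], 0.0
    for i, value in enumerate(rectified):
        running += value
        if i >= window:
            running -= rectified[i - window]
        out.append(running / min(i + 1, window))
    return out


def features(samples):
    """Toy 3-dimensional feature vector: envelope mean, spread, and peak."""
    env = envelope(samples)
    return [fmean(env), pstdev(env), max(env)]


class DriftAwareBaseline:
    """Per-feature mean and variance, tracked with an exponential moving average."""

    def __init__(self, normal_vectors, alpha=0.05):
        columns = list(zip(*normal_vectors))
        self.alpha = alpha
        self.mean = [fmean(c) for c in columns]
        self.var = [pvariance(c) + 1e-12 for c in columns]

    def score(self, vector):
        """Anomaly score: largest absolute z-score across the features."""
        return max(
            abs(v - m) / math.sqrt(s)
            for v, m, s in zip(vector, self.mean, self.var)
        )

    def update(self, vector):
        """Fold a vector judged normal into the baseline so slow drift is tracked."""
        for i, v in enumerate(vector):
            diff = v - self.mean[i]
            incr = self.alpha * diff
            self.mean[i] += incr
            self.var[i] = (1 - self.alpha) * (self.var[i] + diff * incr)


if __name__ == "__main__":
    random.seed(0)
    baseline = DriftAwareBaseline([features(make_trace(1.0)) for _ in range(20)])
    threshold = 10.0  # hypothetical alert threshold
    for name, amplitude in [("normal", 1.0), ("hijacked", 2.0)]:
        vec = features(make_trace(amplitude))
        s = baseline.score(vec)
        if s < threshold:
            baseline.update(vec)  # only trusted-normal data moves the baseline
        print(name, round(s, 2), "ALERT" if s >= threshold else "ok")
```

Updating the baseline only with vectors that score below the threshold is one simple way to follow gradual drift without letting an attack trace shift what counts as normal.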

Section 04

Experimental Evaluation Results of ClawGuard

Dataset Scale

The evaluation uses 7.82 TB of collected radio frequency data, covering normal workflows, various attack modes, different time spans, and different hardware environments.

Core Performance Metrics

Metric	Value	Significance
AUC	0.9945	Near-perfect separation of attack and normal samples
True Positive Rate (TPR)	100%	All attacks in the evaluation were detected
False Positive Rate (FPR)	1.16%	Roughly 1 in 86 normal samples raises a false alarm
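
For readers less familiar with these metrics, the following sketch shows how TPR, FPR, and AUC are computed from anomaly scores. The labels, scores, and threshold are made-up toy values, not ClawGuard data; the example is only meant to show how a 100% TPR can coexist with a nonzero FPR and an AUC below 1.

```python
def tpr_fpr(labels, scores, threshold):
    """TPR and FPR when every score >= threshold is flagged as an attack."""
    tp = sum(1 for y, s in zip(labels, scores) if y == 1 and s >= threshold)
    fp = sum(1 for y, s in zip(labels, scores) if y == 0 and s >= threshold)
    positives = sum(labels)
    negatives = len(labels) - positives
    return tp / positives, fp / negatives


def auc(labels, scores):
    """Probability that a random attack outscores a random normal sample.

    Ties count as half a win (the rank-based definition of ROC AUC).
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Toy data: 1 = attack, 0 = normal. One normal sample scores unusually high.
labels = [0, 0, 0, 0, 1, 1, 1]
scores = [0.1, 0.2, 0.3, 0.7, 0.6, 0.8, 0.9]

tpr, fpr = tpr_fpr(labels, scores, threshold=0.5)
print(tpr, fpr, auc(labels, scores))
```

In this toy case every attack is flagged (TPR 1.0), one of four normal samples is also flagged (FPR 0.25), and the AUC is 11/12 because one normal sample outscores one attack.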

Attack Scenario Coverage

The evaluation covers attack variants including tool replacement, parameter tampering, extra calls, and sequence rearrangement, and the detection rate remains high across all of them.
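
To make the four variants concrete, here is a small sketch that labels how an observed tool call trace deviates from the expected one. It only illustrates the taxonomy: the function `classify_deviation`, the tool names, and the `(tool, params)` trace format are hypothetical, and the sketch says nothing about how ClawGuard itself distinguishes these cases from EM signals.

```python
from collections import Counter


def is_subsequence(short, long):
    """True if every item of `short` appears in `long` in the same order."""
    it = iter(long)
    return all(item in it for item in short)


def classify_deviation(expected, observed):
    """Label the deviation between two traces of (tool, params) tuples."""
    if observed == expected:
        return "normal"
    exp_tools = [tool for tool, _ in expected]
    obs_tools = [tool for tool, _ in observed]
    if obs_tools == exp_tools:
        return "parameter tampering"      # same tools, different arguments
    if Counter(obs_tools) == Counter(exp_tools):
        return "sequence rearrangement"   # same tools, different order
    if len(obs_tools) > len(exp_tools) and is_subsequence(exp_tools, obs_tools):
        return "extra calls"              # expected calls plus injected ones
    if len(obs_tools) == len(exp_tools):
        return "tool replacement"         # a call swapped for another tool
    return "other deviation"


# Hypothetical workflow: look up the weather, then summarize it.
expected = [("get_weather", "city=Paris"), ("summarize", "")]
hijacked = [("delete_files", "/data"), ("summarize", "")]
print(classify_deviation(expected, hijacked))
```

The hijacked trace mirrors the example from Section 02, where a weather query is replaced by a data deletion.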

Section 05

Practical Deployment Considerations for ClawGuard

Hardware Cost

ClawGuard runs on commercial SDR devices (RTL-SDR at roughly $20, HackRF at roughly $300); USRP devices are an option for production environments.

Deployment Modes

Single-machine Monitoring: Each critical server equipped with an independent SDR
Centralized Monitoring: SDR array monitors multiple adjacent servers
Cloud-Edge Collaboration: Edge preprocessing, cloud inference and management

Privacy Considerations

ClawGuard extracts only the features needed for skill classification. Raw signal data is processed locally and then discarded, and the system does not recover the specific content of any computation.

Section 06

Limitations and Future Directions of ClawGuard

Current Limitations

Physical Distance Limitation: Antenna needs to be within 1-2 meters of the target device
Environmental Sensitivity: Strong electromagnetic interference affects signal quality
Skill Coverage: Requires pre-collection of normal fingerprints; dynamic new skills need online learning

Future Directions

Multimodal Fusion: Combine with other side channels like power consumption and temperature
Adversarial Sample Defense: Research attacks that forge normal fingerprints
Real-time Optimization: Reduce detection latency
Cross-hardware Generalization: Verify adaptability across different CPU architectures

Section 07

Conclusion: The Value of Physical-Level Defense

ClawGuard represents a shift in security paradigm, from purely software-based defense to a physical-level defense that integrates hardware and software. As AI Agents see wide adoption, a verification method that cannot be compromised through software adds a defense-in-depth layer: even if the host is compromised, the electromagnetic evidence still faithfully records what the machine actually did.
