Reading

JAW Framework Reveals: Security Vulnerabilities of Agentic Workflows and Context-Aware Attacks

Researchers have conducted the first systematic study on the security risks of Agentic workflows in automation platforms like GitHub Actions and n8n. They proposed the JAW framework, which successfully hijacked 4714 GitHub workflows using context-aware evolution technology, revealing the potential threats of LLM Agents in automated workflows.

LLM安全Agentic工作流提示注入攻击GitHub Actions自动化安全AI安全

Published 2026-05-12 04:45Recent activity 2026-05-13 10:21Estimated read 6 min

Section 01

[Introduction] JAW Framework Reveals Security Vulnerabilities of Agentic Workflows and Context-Aware Attacks

Section 02

Background: The Rise of Agentic Workflows and Security Risks

With the improvement of Large Language Model (LLM) capabilities, platforms like GitHub Actions and n8n have integrated Agentic workflows (allowing LLM Agents to participate in tasks such as code review and data synchronization). While bringing convenience to developers, new security risks have emerged: attackers can manipulate Agents to perform malicious operations (credential theft, arbitrary command execution) by constructing inputs (e.g., GitHub Issue comments). This is the first systematic academic study on such risks.

Section 03

Methodology: Context-Aware Evolutionary Attack Technology of the JAW Framework

The research team designed the JAW (Jailbreaking Agentic Workflows) framework, which hijacks Agentic workflows through a "context-aware evolution" method. The core is to evolve inputs under the guidance of context derived from hybrid program analysis to achieve hijacking. JAW generates context through three key analysis techniques: 1. Static path feasibility analysis (identifying feasible Agent invocation paths and input constraints); 2. Dynamic prompt traceability analysis (tracing the processing chain from input to LLM context); 3. Runtime capability analysis (identifying executable operations and limitations of Agents).

Section 04

Evidence: Large-Scale Evaluation Reveals Risks in Thousands of Workflows

Large-scale evaluation results: 4714 GitHub workflows can be hijacked, 8 n8n templates have vulnerabilities, affecting 15 widely used GitHub Actions, including official Actions used by millions of developers such as GitHub's official Claude Code Action, Google Gemini CLI Action, Alibaba Qwen CLI Action, and Cursor CLI Action.

Section 05

Attack Scenarios: Multiple Harms Such as Credential Leakage and Command Execution

Attackers can achieve the following by constructing malicious Issue comments: 1. Credential leakage (inducing Agents to read API keys, access tokens, etc. from leaked environment variables); 2. Arbitrary command execution (prompt injection to make Agents execute system commands, potentially taking control of servers); 3. Data theft (using Agents' file access capabilities to steal sensitive code or configurations from repositories).

Section 06

Responsible Disclosure: Vendors Confirm and Fix Vulnerabilities

The research team conducted responsible disclosure and received confirmation and fixes from multiple vendors: GitHub confirmed and fixed related vulnerabilities, Google acknowledged the Gemini CLI Action issue, and Anthropic fixed the Claude Code Action security problem. The team also received bug bounty rewards from multiple vendors.

Section 07

Insights and Recommendations: Security Protection Measures for Developers and Platforms

Recommendations for Developers: 1. Strictly validate and sanitize inputs entering Agent workflows; 2. Configure minimal necessary permissions for Agents; 3. Enable detailed audit logs; 4. Run Agents in an isolated environment. Recommendations for Platforms: 1. Consider prompt injection protection at the architecture level; 2. Provide secure sandboxes for Agent operations; 3. Popularize security risks to users.

Section 08

Conclusion: Agentic Workflow Security Requires Attention

The JAW framework study reveals significant security risks in Agentic workflows. As LLM Agents are widely used in automated workflows, security issues are becoming increasingly important. This study provides the first systematic analysis framework, sounding an alarm for the industry: equal attention should be paid to security protection when pursuing the convenience of automation. Paper link: http://arxiv.org/abs/2605.11229v1

Continue Reading

Keep going with more reads from the same topic.

Nornir MCP Server: An Enterprise-Grade Bridge for Integrating Large Language Models into Network Automation

Nornir MCP Server is an enterprise-level server based on the Model Context Protocol (MCP). It seamlessly integrates large language models (such as Claude) with the Nornir network automation framework, supporting natural language orchestration for multi-vendor network devices (Cisco, Arista, Juniper, etc.), and providing production-grade features like a dual-engine architecture (NAPALM + Netmiko), intelligent filtering, and a secure sandbox.

Recent activity 2026-05-06 20:51

Bibliothèque Française LLM: A French Public Domain Literature Index System Optimized for Large Language Models

Bibliothèque Française LLM is a structured indexing and annotation project for French public domain literature designed specifically for large language models (LLMs). It integrates multiple authoritative sources such as DraCor, Common Corpus, and Wikisource, providing metadata indexing categorized by genre, author, and era, as well as in-depth annotations for dramatic texts (including characters, lines, stage directions, etc.). Its aim is to enable LLMs to efficiently read and understand classic French literary works.

Recent activity 2026-05-06 20:50

Splinter: A Lock-Free Zero-Copy Shared Memory KV and Vector Storage Library That Eliminates Socket and Memcpy Overhead for LLM Inference

Splinter is a minimalist, high-performance key-value (KV) and vector storage system enabling zero-latency inter-process communication via shared memory and atomic operations. With only 766 lines of core code, it supports millions of operations per second and 768-dimensional vector storage, offering a new architectural approach for local LLM inference and data-intensive applications.

Recent activity 2026-04-03 08:49

Folkering OS: When the Operating System Itself Is AI—A Self-Evolving Bare-Metal Rust System

Folkering OS is the world's first AI-native bare-metal operating system, entirely written in Rust no_std without relying on Linux, POSIX, or libc. It can generate commands from scratch, compile them into WASM, and run them in 10 seconds, achieving true self-evolution.

Recent activity 2026-04-09 16:15