Zing Forum

Reading

LLM Red Team Testing Playbook: A Reproducible Adversarial Detection Toolkit Based on OWASP and MITRE Frameworks

An open-source toolkit for AI security researchers and red team engineers, providing reproducible adversarial detection mapped to OWASP LLM Top 10 Risks 2025 and MITRE ATLAS technical framework.

LLM安全红队测试OWASPMITRE ATLAS提示注入对抗性测试AI安全网络安全开源工具
Published 2026-05-24 23:07Recent activity 2026-05-24 23:20Estimated read 5 min
LLM Red Team Testing Playbook: A Reproducible Adversarial Detection Toolkit Based on OWASP and MITRE Frameworks
1

Section 01

Introduction / Main Floor: LLM Red Team Testing Playbook: A Reproducible Adversarial Detection Toolkit Based on OWASP and MITRE Frameworks

An open-source toolkit for AI security researchers and red team engineers, providing reproducible adversarial detection mapped to OWASP LLM Top 10 Risks 2025 and MITRE ATLAS technical framework.

2

Section 02

Original Author and Source


3

Section 03

Introduction: From Slides to Executable Evidence

The field of red team testing for Large Language Models (LLMs) is evolving rapidly, but most public resources remain at the level of marketing slides or one-off Twitter threads. This situation makes it difficult for security practitioners to obtain reproducible and quantifiable test results, preventing them from truly assessing the security risks of LLMs in production environments.

Leonardo Jaguaribe's open-source llm-redteam-playbook project was created to address this issue. It provides a small, runnable, and opinionated detection toolkit that allows security practitioners to prove "this model has a vulnerability in LLM01 (Prompt Injection) today" via the command line instead of PPT presentations.


4

Section 04

Core Positioning of the Project

This playbook targets three core user groups:

AI Security Researchers: Need a systematic framework to study the adversarial robustness of LLMs

Red Team Engineers: Need reproducible testing tools to evaluate enterprise-deployed LLM systems

Machine Learning Security Practitioners: Professionals who need to align security testing with industry-standard frameworks

The core design philosophy of the project is "executable evidence, not slides"—each detection can be reproduced within two minutes after installation, providing specific and verifiable security findings.


5

Section 05

Mapping to OWASP LLM Top 10 Risks 2025

The project fully covers the OWASP-released Top 10 Security Risks for LLM Applications 2025, with each risk category corresponding to a dedicated detection module:

6

Section 06

LLM01 - Prompt Injection

Current Status: v0.0.1 basic version implemented

Prompt injection is the most fundamental and dangerous attack vector in LLM security. Attackers attempt to override system prompts or manipulate model behavior through carefully crafted inputs. This detection module tests the model's ability to identify boundaries for malicious inputs.

7

Section 07

LLM02 - Sensitive Information Disclosure

Current Status: Planned

Tests whether the model inadvertently leaks sensitive information from training data, such as personally identifiable information (PII), trade secrets, or other confidential content.

8

Section 08

LLM03 - Supply Chain Security

Current Status: Planned

Evaluates the security risks posed by the model's reliance on third-party components, plugins, or external data sources.