Reading

Mushroom Kingdom AI Firewall: Gamifying LLM Red Team Security Testing with Pixel Style

LLM安全红队测试提示注入越狱攻击RAG安全OWASPFastAPIReactAI应用安全渗透测试

Published 2026-06-15 08:25Recent activity 2026-06-15 08:56Estimated read 6 min

Section 01

Introduction: Mushroom Kingdom AI Firewall — A Gamified LLM Red Team Security Testing Platform

Mushroom Kingdom AI Firewall is a Mario-inspired LLM application security testing platform. Built with React+TypeScript frontend and FastAPI backend, it provides automated red team testing capabilities including prompt injection, jailbreak attacks, data leakage, tool abuse, and RAG poisoning, all mapped to the OWASP LLM Top 10 security framework. Maintained by realshawnnnn, the source code is hosted on GitHub, aiming to help teams systematically assess the security posture of LLM applications.

Section 02

Background: Security Challenges and Needs of LLM Applications

With the popularity of LLMs like ChatGPT and Claude, enterprises integrating LLMs into applications face various security risks, including prompt injection, jailbreak attacks, data leakage, tool abuse, and RAG poisoning. OWASP released the LLM Top 10 security risk list in 2023, but many teams lack systematic testing methods. This platform was created to address this issue, encapsulating professional testing into user-friendly and reproducible tools.

Section 03

Design Philosophy and Technical Architecture

Design Philosophy: Adopts Mario-style pixel art (original materials to avoid copyright issues), using gamified elements like castle security maps, Bowser attack simulators, and Princess protection reports to lower the barrier to security testing. Technical Architecture:

Frontend: React+TypeScript, including pages like security posture dashboard and attack simulator;
Backend: FastAPI (high performance, automatic API documentation), with components like attack modules, evaluators, and risk scorers;
Data Layer: Default SQLite (zero configuration), PostgreSQL available for production;
Deployment: Dockerized, supporting one-click startup of the service stack.

Section 04

Core Features: Modular Attack and Evaluation

Attack Modules: Covers 5 types of LLM attacks (prompt injection, jailbreak, data leakage, tool abuse, RAG poisoning), each encapsulated as an independent class; Evaluators: SecretDetector (sensitive information detection), PolicyViolationDetector (policy violation detection), PromptInjectionSuccessDetector (injection success detection), RiskScorer (risk scoring); OWASP Mapping: Automatically maps identified issues to the OWASP LLM Top 10 framework, facilitating alignment with industry standards.

Section 05

Usage Scenarios: From Demo to Production Deployment

Local Demo: Supports Mock LLM mode, allowing operation without an API key (frontend: npm run dev, backend: uvicorn startup); Real LLM Integration: Configure OpenAI-compatible API endpoints (set environment variables like LLM_MODE, API_KEY, etc.); Docker Deployment: One-click startup of the complete service stack (frontend, backend, database) via docker compose up --build.

Section 06

Project Highlights and Limitations

Highlights: Comprehensive red team testing coverage, gamified user experience, OWASP standardized mapping, reproducible testing, flexible LLM support, local Mock mode; Limitations: Attack techniques need continuous updates to address threat evolution, automatic evaluation may have misjudgments (manual verification required), some OWASP risks are not fully covered, testing requires explicit authorization (to avoid illegality).

Section 07

Conclusion: An Important Tool for LLM Security Testing

Mushroom Kingdom AI Firewall provides a friendly and complete starting point for LLM application security testing, helping teams quickly identify common vulnerabilities while conveying the concept that 'LLM security should be embedded in the entire development process'. For development teams, it is a low-cost security assessment tool; for researchers, it is an extensible experimental framework. As LLM applications become more popular, such tools will help bridge the gap between innovation and security.

Mushroom Kingdom AI Firewall: Gamifying LLM Red Team Security Testing with Pixel Style

Introduction: Mushroom Kingdom AI Firewall — A Gamified LLM Red Team Security Testing Platform

Background: Security Challenges and Needs of LLM Applications

Design Philosophy and Technical Architecture

Core Features: Modular Attack and Evaluation

Usage Scenarios: From Demo to Production Deployment

Project Highlights and Limitations

Conclusion: An Important Tool for LLM Security Testing

Continue Reading

SignalCut: An Intelligent Tool for Turning AI Search Visibility Gaps into Video Marketing Campaigns

Graph Neural Networks Revolutionize Global Weather Forecasting: From Graph Weather to Open-Source Practice of Multi-Model Fusion

ExoVision: AI-Driven Exoplanet Detection and Habitability Assessment Platform

Vertica Expert Skills: A One-Stop Guide to Enterprise Database Migration and Optimization