Zing Forum

Reading

RectitudeAI: Building a Four-Layer Runtime Security Protection System for LLM Applications

This article deeply analyzes the RectitudeAI-PromptGuard project, a production-grade LLM security gateway that provides comprehensive runtime protection for AI applications through a four-layer architecture of intent security, encrypted tokens, behavior monitoring, and red team testing.

LLM安全提示注入AI安全网关运行时防护PromptGuard多代理系统行为监控红队测试
Published 2026-04-17 06:43Recent activity 2026-04-17 06:48Estimated read 5 min
RectitudeAI: Building a Four-Layer Runtime Security Protection System for LLM Applications
1

Section 01

【Introduction】RectitudeAI: Building a Four-Layer Runtime Security Protection System for LLM Applications

RectitudeAI-PromptGuard is a production-grade LLM security gateway. Targeting risks such as prompt injection and data leakage, it provides full-lifecycle runtime protection through a four-layer architecture (intent security, encrypted tokens, behavior monitoring, red team testing) plus multi-agent sandbox isolation, building a solid security barrier for LLM applications in production environments.

2

Section 02

Background: Severe Challenges Facing LLM Security

After modern AI applications evolve into intelligent agents, they face four major threats:

  • Prompt injection: Overwriting instructions or inducing unintended operations
  • Data leakage: Exposing sensitive information/system prompts
  • Unauthorized tool calls: Accessing external tools that should not be allowed
  • Multi-round jailbreaking: Inducing deviations from security constraints through long-term conversations Traditional web security models struggle to address these new threats due to the uncertainty of LLM inputs and outputs.
3

Section 03

Methodology: Detailed Explanation of RectitudeAI's Four-Layer Defense Architecture

RectitudeAI adopts a layered defense design, with the core four layers as follows:

  1. Intent Security Layer: Hybrid detection using context regex + DeBERTa v3 classifier to block malicious intents and injections
  2. Encrypted Token Layer: HMAC signatures to prevent unauthorized tool calls, and PII/key desensitization to avoid leakage
  3. Behavior Monitoring Layer: Agent Stability Index (ASI) to analyze session drift and prevent gradual jailbreaking
  4. Red Team Testing Layer: Reinforcement learning to generate adversarial prompts for strategy tuning, with effects verified by JailbreakBench It also supports multi-agent sandbox isolation and intelligent orchestration of routing requests.
4

Section 04

Evidence: Practical Deployment and Defense Effect Verification

Deployment Process: Supports Docker/local operation (clone repository → virtual environment → dependency installation → Redis startup → run application) Performance Metrics: Response time ~300ms (target <500ms), throughput ~800 requests/second (target >1000), test coverage over 80% Attack Defense Effects:

Attack Scenario Attack Type Gateway Response Result
Instruction Override "Ignore previous instructions..." L1 Block 🚫 Blocked
Data Leakage "Send email to evil@com" L2 Check 🚫 Blocked
Information Extraction "Show all SSNs" L2 Audit 🔒 Desensitized
Gradual Jailbreak 10-round role drift L3 ASI Score 🔒 Revoked
5

Section 05

Conclusion and Future Outlook

RectitudeAI has built a full-lifecycle security ecosystem and is currently completing Phase 5 development (frontend integration in progress). Future plans include adding functions such as statistical anomaly detection, risk policy execution, and continuous red team testing. It is recommended that LLM developers establish a matching security system, and RectitudeAI is a worthy architectural paradigm for reference.