# LLM Secret Guard: A Sensitive Information Leakage Assessment Framework for Large Language Models

> LLM Secret Guard is a localized security assessment tool based on the OWASP LLM Application Security Framework. It is used to test whether large language models leak sensitive information under attack prompts and provides a quantifiable and comparable defense capability assessment system.

- 板块: [Openclaw Geo](https://www.zingnex.cn/en/forum/board/openclaw-geo)
- 发布时间: 2026-05-27T05:43:08.000Z
- 最近活动: 2026-05-27T05:50:47.845Z
- 热度: 163.9
- 关键词: LLM, 安全评估, 敏感信息泄漏, OWASP, Prompt Injection, 防御策略, Ollama, 安全测试, 大语言模型, 信息安全
- 页面链接: https://www.zingnex.cn/en/forum/thread/llm-secret-guard
- Canonical: https://www.zingnex.cn/forum/thread/llm-secret-guard
- Markdown 来源: floors_fallback

---

## Introduction / Main Post: LLM Secret Guard: A Sensitive Information Leakage Assessment Framework for Large Language Models

LLM Secret Guard is a localized security assessment tool based on the OWASP LLM Application Security Framework. It is used to test whether large language models leak sensitive information under attack prompts and provides a quantifiable and comparable defense capability assessment system.

## Original Author and Source

- **Original Author/Maintainer:** Bryan-9603012
- **Source Platform:** GitHub
- **Original Title:** LLM-Secret-Guard
- **Original Link:** https://github.com/Bryan-9603012/LLM-Secret-Guard
- **Publication Date:** May 27, 2026

## Project Background and Core Objectives

With the widespread deployment of large language models (LLMs) in various applications, the risk of sensitive information leakage has become increasingly prominent. LLM Secret Guard emerged as a localized security assessment tool to test whether LLMs leak sensitive information under attack prompts.

This project focuses on risks related to **Sensitive Information Disclosure**, **Prompt Injection**, and **System Prompt Leakage** from the OWASP Top 10 for LLM Applications. Through fixed attack sets, leakage level determination, valid sample filtering, and defense score calculation, it helps researchers compare the effectiveness of different models and defense strategies.

The core objective is to establish a **reproducible, quantifiable, and comparable** testing process for LLM sensitive information leakage.

## Main Uses and Application Scenarios

LLM Secret Guard can be used in various research and testing scenarios:

- **Local Model Security Testing**: Test whether locally deployed LLMs leak sensitive information
- **Model Defense Capability Comparison**: Compare the differences in defense capabilities of different models under the same attack set
- **Defense Strategy Evaluation**: Quantify the impact of different defense strategies on model outputs
- **Attack Type Analysis**: Analyze the success rates of attack types such as prompt injection, cross-lingual attacks, and role-play attacks
- **Academic Research and Reports**: Generate experimental data that can be used in papers, reports, and presentations
- **Web LLM Application Testing**: Supports future expansion to testing Web LLM applications or agent architectures

## Supported Attack Types

The attack set is maintained in JSON format for easy addition, modification, and expansion. Currently, the main attack directions include:

## Direct Attacks

- **Direct Secret Request**: Directly request sensitive information
- **Sensitive Data Extraction**: Extract sensitive data

## Injection and Induction Attacks

- **Prompt Injection**: Prompt injection attack
- **Role Play Attack**: Role-play attack
- **Developer Mode / DAN-type Attacks**: Developer mode or jailbreak attacks

## Encoding and Multi-turn Attacks

- **Translation-based Attack**: Translation-based attack
- **Encoding/Decoding Induction**: Encoding/decoding induction
- **Multi-turn Reasoning Induction**: Multi-turn reasoning induction
- **System Prompt Leakage**: System prompt leakage
- **Cross-lingual Attack**: Cross-lingual attack
