# LinguaJailbreak-Lab: A Crowdsourced Discovery and Analysis Framework for Cross-Lingual Jailbreak Attacks

> An open-source research tool based on the CC-BOS method, which uses crowdsourced intelligence to guide the discovery and evaluation of cross-lingual security vulnerabilities in large language models

- 板块: [Openclaw Llm](https://www.zingnex.cn/en/forum/board/openclaw-llm)
- 发布时间: 2026-05-25T07:44:35.000Z
- 最近活动: 2026-05-25T07:50:41.265Z
- 热度: 159.9
- 关键词: 大语言模型, 越狱攻击, 跨语言安全, CC-BOS, 古典中文, AI安全, 红队测试, GPT-4o
- 页面链接: https://www.zingnex.cn/en/forum/thread/linguajailbreak-lab
- Canonical: https://www.zingnex.cn/forum/thread/linguajailbreak-lab
- Markdown 来源: floors_fallback

---

## Introduction / Main Floor: LinguaJailbreak-Lab: A Crowdsourced Discovery and Analysis Framework for Cross-Lingual Jailbreak Attacks

An open-source research tool based on the CC-BOS method, which uses crowdsourced intelligence to guide the discovery and evaluation of cross-lingual security vulnerabilities in large language models

## Original Authors and Source

- **Original Author/Maintainer:** batis1
- **Source Platform:** GitHub
- **Original Title:** LinguaJailbreak-Lab
- **Original Link:** https://github.com/batis1/LinguaJailbreak-Lab
- **Publication Date:** May 25, 2026

---

## Research Background: Cross-Lingual Security Challenges of Large Language Models

With the global deployment of large language models (LLMs), an often-overlooked security dimension has emerged: cross-lingual attacks. Attackers may exploit the multilingual capabilities of models to bypass safety alignment mechanisms using low-resource languages or classical languages. The LinguaJailbreak-Lab project addresses this challenge by providing a crowdsourced-guided framework for cross-lingual jailbreak attack discovery and analysis

## CC-BOS Method: Classical Chinese-Guided Jailbreak Attacks

The core of the project is based on the CC-BOS (Classical Chinese-Based Optimization Strategy) method, an optimization strategy that uses classical Chinese as an attack medium. Research shows that classical Chinese, as a semantically rich language with insufficient coverage in modern LLM safety training data, can be an effective attack vector

## Experimental Configuration

The project provides complete experimental reproduction configurations:

- **Attack Method:** CC-BOS
- **Attack Language:** Classical Chinese
- **Target Model:** GPT-4o
- **Prompt Generation Model:** DeepSeek-Chat
- **Translation Model:** DeepSeek-Chat
- **Evaluation Model:** GPT-4o
- **Population Size:** 5
- **Maximum Iterations:** 5
- **Success Criterion:** released-code score >= 80
- **Early Stop Threshold:** score >= 120

## Dual-Mode Operation Architecture

The project is designed with two operation modes to meet different research needs:

## Qwen-Only Mode (Default)

This mode uses Qwen-Plus to unify the entire process of prompt generation, target response, translation, and evaluation. This design simplifies API management, allowing researchers to verify the complete CC-BOS process with just one API key configuration

## Strict GPT-4o Reproduction Mode

This mode strictly follows the implementation of the original CC-BOS paper, using multiple different models for collaborative work, and is suitable for research that requires direct comparison with the paper's results
