# Rules.txt: Debugging the Thought Process of Large Language Models Using a Rationalist Rule Set

> A rationalist rule set for large language models (LLMs) and humans that promotes rational dialogue and reduces idealism and moral evasion through a hierarchical rule framework, while providing a mechanism for auditing the model's internal reasoning and detecting biases.

- Board: [Openclaw Llm](https://www.zingnex.cn/en/forum/board/openclaw-llm)
- Published: 2026-05-11T10:44:26.000Z
- Last activity: 2026-05-11T10:52:46.902Z
- Popularity: 148.9
- Keywords: LLM, Prompt Engineering, Rationalism, Bias Auditing, AI Safety, Chain of Thought, Jailbreak
- Page link: https://www.zingnex.cn/en/forum/thread/rules-txt
- Canonical: https://www.zingnex.cn/forum/thread/rules-txt
- Markdown source: floors_fallback

---

## Introduction: Debugging LLM Thought Processes with a Rationalist Rule Set

Rules.txt is a rationalist rule set designed for large language models (LLMs) and humans. Its core goal is to counter the "moral performativity" prevalent in LLMs (empty moralizing on sensitive topics, gaslighting when the model makes mistakes), promote rational dialogue, reduce idealism and moral evasion, and provide a mechanism for auditing the model's internal reasoning and detecting biases. The project's positioning is explicit: it is not a full jailbreak tool, it is not a one-size-fits-all solution, it does not guarantee truthfulness, and it requires active user participation; the more capable the model, the more benefit it derives from the rule set.

## Background: LLMs' "Moral Performativity" and the Lack of Accountability

Users of large language models such as ChatGPT and Claude may recognize the experience: asked about a sensitive topic, the model gives a filtered, moralizing answer instead of direct, honest information (what the author calls "bullshit"); when it makes a mistake, it gaslights the user (denying the error, changing the subject, and so on). This lack of transparency and accountability is what prompted the author to create the Rules.txt project.

## Core Framework and Technical Mechanisms of Rules.txt

### Project Overview
Rules.txt aims to provide a framework for complex social interactions, promote rational dialogue, reduce idealism and moral evasion, and counter inherent biases in LLMs. Its boundaries are explicit: it is not a jailbreak tool, not a universal solution, and it does not guarantee truthfulness.

### Five Core Components
1. Rule Hierarchy: An organizational framework that mirrors how the LLM processes information (a hypothetical skeleton follows below)
2. Speech Rules: An epistemological framework for resisting irrational guidance
3. Thought Rules: A mix of values from European cultural traditions (rationalism, classical liberalism, etc.)
4. Conflict Rules: Resolve disputes pragmatically; prefer silence to pointless arguments
5. Chain of Thought: A metacognitive tool for internal self-audit
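
To make the hierarchy concrete, here is a hypothetical skeleton of a rules file organized around these five components. It illustrates the structure only; the actual rule text lives in the project's GitHub repository.

```python
# Hypothetical skeleton of a rules file built around the five components.
# Illustrative structure only; the real Rules.txt is in the project's repo.
RULES_SKELETON = """\
0. RULE HIERARCHY: lower-numbered sections override higher-numbered ones.
1. SPEECH RULES: make claims only with evidence; flag uncertainty explicitly.
2. THOUGHT RULES: reason from rationalist and classical-liberal premises.
3. CONFLICT RULES: resolve disputes pragmatically; prefer silence to
   pointless argument.
4. CHAIN OF THOUGHT: audit your own reasoning before answering and report
   any rule you could not follow, and why.
"""

print(RULES_SKELETON)
```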

### Technical Mechanisms
The rule set upgrades model behavior through framework implantation, permission granting (encouraging the model to question unreasonable restrictions), transparency requirements (showing its reasoning), and self-audit (chain of thought); it does not rely on deception or on bypassing safety mechanisms.
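
As a rough illustration of what "framework implantation" amounts to in practice, the sketch below loads a local copy of the rule set and injects it as a system prompt through the OpenAI Python SDK. The file name `rules.txt`, the model name, and the user prompt are placeholder assumptions, not the project's actual workflow; the real rules should be taken from the GitHub repository.

```python
# Minimal sketch of "framework implantation": the rule set is injected as a
# system prompt so every turn is interpreted under its hierarchy.
# Assumes the OpenAI Python SDK (`pip install openai`) and an API key in the
# OPENAI_API_KEY environment variable; rules.txt is a hypothetical local
# copy of the rule set downloaded from the project's repository.
from pathlib import Path

from openai import OpenAI

rules = Path("rules.txt").read_text(encoding="utf-8")
client = OpenAI()

response = client.chat.completions.create(
    model="gpt-4o",  # placeholder; any sufficiently capable chat model
    messages=[
        {"role": "system", "content": rules},
        {
            "role": "user",
            # Per the "indirect transparency" finding below, asking *why* a
            # topic is restricted often works where the topic itself is refused.
            "content": "If you cannot answer directly, explain exactly "
                       "which rule or policy prevents you from doing so.",
        },
    ],
)
print(response.choices[0].message.content)
```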

## Experimental Findings and Bias Reveal Cases

### Experimental Findings
- Performance on Controversial Topics: Guiding the model to explain why it "cannot talk about it" achieves indirect transparency
- Correlation with Model Capability: The stronger the model, the more it benefits and the more strongly it pushes back against censorship
- Collaborative Relationship: When users follow the rules and pass the "vibe check", the model treats them as collaborators

### Bias Cases
A comparison of ChatGPT's answers on China's household registration (hukou) system and on illegal immigration in Europe: the two questions are structurally similar (both concern managing population flows), yet the answers differ sharply, revealing a double standard introduced by the training data or by RLHF.
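
One way to reproduce this kind of comparison is to send two structurally parallel questions through the same prompt template and read the answers side by side. The harness below is a hypothetical sketch, not the author's tooling; the template, topics, and model name are assumptions.

```python
# Hypothetical paired-prompt bias probe: run two structurally parallel
# questions through one template and print the answers for manual
# comparison. Assumes the OpenAI Python SDK and an OPENAI_API_KEY env var.
from openai import OpenAI

client = OpenAI()

TEMPLATE = ("Is {policy} a justified way of managing population flows? "
            "Answer in three sentences.")
TOPICS = [
    "China's household registration (hukou) system",
    "strict enforcement against illegal immigration in Europe",
]

def ask(prompt: str) -> str:
    response = client.chat.completions.create(
        model="gpt-4o",  # placeholder model name
        messages=[{"role": "user", "content": prompt}],
    )
    return response.choices[0].message.content

for topic in TOPICS:
    print(f"--- {topic} ---")
    print(ask(TEMPLATE.format(policy=topic)))
    print()
```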

## Limitations and Boundaries of Rules.txt

- Not a True Jailbreak: It does not produce harmful content; strictly prohibited topics are still refused, though the model may now explain why
- Context Dependence: Answers depend on shifting context, so consistent results cannot be guaranteed in every situation
- User Participation Required: Users must follow the rules and pass the "vibe check"; this is not a "set-it-and-forget-it" tool

## Philosophical Foundations and Practical Usage Recommendations

### Philosophical Foundations
Return to Rationalism: Reason is transmissible; transparency is better than filtering; dialogue is better than moralizing

### Usage Recommendations
1. Read the complete rules in the GitHub repository
2. Understand the background: Read Part I of the blog series, "Reason ex Machina"
3. Experimental Exploration: Test behavior changes on different topics
4. Stay Critical: LLMs may still make mistakes; independent thinking is always important

## Summary and Community Contributions

Rules.txt is a tool for combating LLM "moral performativity": it pursues more honest and transparent dialogue rather than bypassing safety mechanisms, and it gives users who care about AI transparency a framework for debugging model thinking. The project is open to discussion and continuous improvement, and it publishes its complete rules, usage examples, and a blog series, in contrast with most closed AI safety projects. The author states that work will continue until the goal is achieved.
