Zing Forum

Be-My-Butler: A Multi-Agent Validation-Driven Workflow Orchestration Framework for Claude Code

BMB is a multi-agent orchestration framework for the Claude Code CLI. It targets common failures of AI programming assistants (hallucinations, omissions, and self-review bias) with a 12-step pipeline, cross-model blind review validation, and a three-layer compression protocol.

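Tags: AI programming, Claude Code, multi-agent, code review, cross-model validation, agent orchestration, software engineering, automated workflow
Published 2026-04-06 06:15 | Recent activity 2026-04-06 06:18 | Estimated read 5 min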
Section 01

Introduction: Core Overview of the Be-My-Butler Framework

Be-My-Butler (BMB for short) is a multi-agent orchestration framework for the Claude Code CLI. It targets common failures of AI programming assistants: hallucinations, unhandled boundary conditions, and self-review bias. BMB improves code reliability through a 12-step pipeline, cross-model blind review validation, and a three-layer compression protocol. It follows a "slow is fast" philosophy and is aimed at production environments that require high-quality code.

Section 02

Background: The Reliability Dilemma of AI Programming

As AI programming assistants have become widespread, developers have found that existing tools (such as Cursor and GitHub Copilot) generate code quickly but suffer from hallucinations, unhandled boundary conditions, and self-review bias. When the same model both writes and reviews code, it struggles to detect its own errors. BMB addresses this pain point by improving code reliability through multi-agent collaboration and cross-model validation.

Section 03

Methodology: Core Mechanisms and Architecture Design

BMB is designed around five problems, each paired with a mechanism:

1. Self-review bias: cross-model blind review.
2. Narrow design vision: the Council Debate mechanism.
3. Context explosion: a three-layer compression protocol (intra-step, inter-step, and session-level).
4. Unexamined assumptions: Divergent Framing.
5. Knowledge loss: an FTS5 knowledge base.

These mechanisms run inside a 12-step pipeline that covers session preparation, brainstorming, debate and decision-making, architecture design, execution, testing, and validation, among other steps. The framework also defines ten specialized agent roles (Lead, Consultant, Architect, and others) that collaborate across the pipeline.
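
The cross-model blind review idea can be sketched in a few lines of Python. This is an illustration only, not BMB's implementation: the `Submission` type, `pick_reviewer`, and `blind_packet` are hypothetical names, and the assumption that "blind" means the reviewer sees only the diff (no author identity, no author rationale) is ours rather than something the project description states.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Submission:
    """A code change as produced by the authoring model (hypothetical type)."""
    author_model: str  # model that wrote the change
    diff: str          # the code change itself
    rationale: str     # the author's own explanation of the change


def pick_reviewer(author_model, available):
    """Return the first available model that is not the author."""
    for model in available:
        if model != author_model:
            return model
    raise ValueError("cross-model review needs at least one other model")


def blind_packet(sub):
    """Build what the reviewer sees: the diff only.

    Assumption: dropping the author name and rationale keeps the reviewer
    from anchoring on the author's own framing of the change.
    """
    return {"diff": sub.diff}


sub = Submission("claude", "+ return a / b", "b is never zero here")
reviewer = pick_reviewer(sub.author_model, ["claude", "codex", "gemini"])
packet = blind_packet(sub)
print(reviewer, sorted(packet))  # codex ['diff']
```

The point of the sketch is the separation of concerns: reviewer selection guarantees a different model, and packet construction controls what that model is allowed to see.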

Section 04

Practice: Adaptive Workflow and Knowledge Management

BMB provides a "recipe" system that adapts the pipeline to the task: feature (all 12 steps), bugfix (skips brainstorming and debate), refactor (skips frontend), and so on. Knowledge is managed in three layers: project-local knowledge, global cross-project knowledge, and rules solidified in CLAUDE.md. Full-text search is implemented with SQLite FTS5, and recurring issues are automatically promoted to candidate rules.
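
A minimal sketch of how an FTS5-backed knowledge base with rule promotion might look, using Python's standard `sqlite3` module. The table name `lessons`, its columns, the helper functions, and the `PROMOTE_AFTER` threshold are all assumptions made for illustration; the source does not describe BMB's schema or its promotion criteria. The sketch also assumes the local SQLite build includes the FTS5 extension.

```python
import sqlite3

PROMOTE_AFTER = 3  # hypothetical threshold; the source does not state one


def open_kb(path=":memory:"):
    """Open the knowledge base and create the FTS5 table if needed."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE VIRTUAL TABLE IF NOT EXISTS lessons "
        "USING fts5(scope, issue, note)"
    )
    return conn


def record(conn, scope, issue, note):
    """Store one lesson; scope is 'project' or 'global' in this sketch."""
    conn.execute(
        "INSERT INTO lessons (scope, issue, note) VALUES (?, ?, ?)",
        (scope, issue, note),
    )


def search(conn, query):
    """Full-text search over all columns, best matches first."""
    rows = conn.execute(
        "SELECT scope, issue, note FROM lessons "
        "WHERE lessons MATCH ? ORDER BY rank",
        (query,),
    )
    return rows.fetchall()


def candidate_rules(conn):
    """Issues recorded at least PROMOTE_AFTER times become rule candidates."""
    rows = conn.execute(
        "SELECT issue FROM lessons GROUP BY issue "
        "HAVING COUNT(*) >= ? ORDER BY issue",
        (PROMOTE_AFTER,),
    )
    return [row[0] for row in rows]


conn = open_kb()
for scope in ("project", "project", "global"):
    record(conn, scope, "missing timeout", "HTTP call had no timeout set")
record(conn, "project", "off by one", "loop skipped the last element")

print(len(search(conn, "timeout")))  # 3
print(candidate_rules(conn))         # ['missing timeout']
```

In this sketch a candidate rule is only a string; deciding whether it is actually written into CLAUDE.md would remain a separate, reviewed step.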

Section 05

Recommendations: Deployment and Usage Guide

BMB depends on the Claude Code CLI, tmux, python3, sqlite3, and git. The Codex and Gemini CLIs are optional and enable cross-model validation. After installation, run 'bmb doctor' to verify the dependencies. Practical recommendations: start configuration with /BMB-setup, try the bugfix recipe first to get familiar with the process, then gradually enable more complex recipes and cross-model validation.
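
Before installing, the dependency list above can be checked by hand with a short script. This is a standalone sketch and not how 'bmb doctor' works internally, which the source does not describe. The executable names in `REQUIRED` and `OPTIONAL` are assumptions about how each tool appears on `PATH`, and `check_tools` is a hypothetical helper.

```python
import shutil

# Assumed executable names for the tools listed in the text.
REQUIRED = ["claude", "tmux", "python3", "sqlite3", "git"]
OPTIONAL = ["codex", "gemini"]  # either one would allow cross-model validation


def check_tools(which=shutil.which):
    """Report which required tools are missing and which optional ones exist.

    `which` is injectable so the check can be exercised without touching
    the real PATH.
    """
    missing = [tool for tool in REQUIRED if which(tool) is None]
    cross_model = [tool for tool in OPTIONAL if which(tool) is not None]
    return {"ok": not missing, "missing": missing, "cross_model": cross_model}


if __name__ == "__main__":
    report = check_tools()
    print("required tools ok:", report["ok"])
    if report["missing"]:
        print("missing:", ", ".join(report["missing"]))
    print("cross-model CLIs found:", ", ".join(report["cross_model"]) or "none")
```

A check like this only confirms that the executables are on `PATH`; it says nothing about versions or authentication, so 'bmb doctor' remains the step to rely on after installation.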

Section 06

Conclusion: A Reliability-First AI Programming Paradigm

BMB represents a reliability-first approach to AI programming. Instead of pursuing speed, it improves code quality through multi-agent collaboration, structured processes, and cross-model validation. For teams that prioritize quality, its design ideas and mechanisms are worth studying as a reference for AI-assisted development practices.