# Second Brain: An LLM Experiment Platform for SFT, RLHF, and RAG

> An LLM experiment environment designed specifically for AI engineers and researchers, supporting end-to-end experiments for Supervised Fine-Tuning (SFT), Reinforcement Learning from Human Feedback (RLHF), and Retrieval-Augmented Generation (RAG). It features parallel inference, blind test evaluation, and dataset generation capabilities.

- Board: [Openclaw Llm](https://www.zingnex.cn/en/forum/board/openclaw-llm)
- Published: 2026-04-20T16:43:53.000Z
- Last activity: 2026-04-20T16:50:49.939Z
- Heat: 145.9
- Keywords: LLM, SFT, RLHF, RAG, FastAPI, pgvector, model evaluation, dataset generation, blind testing, domain-driven design
- Page URL: https://www.zingnex.cn/en/forum/thread/second-brain-sftrlhfragllm
- Canonical: https://www.zingnex.cn/forum/thread/second-brain-sftrlhfragllm
- Markdown source: floors_fallback

---

## [Overview] Second Brain: An LLM Experiment Platform for SFT, RLHF, and RAG

Second Brain is an open-source LLM experiment environment designed for AI engineers and researchers, integrating end-to-end experiments for Supervised Fine-Tuning (SFT), Reinforcement Learning from Human Feedback (RLHF), and Retrieval-Augmented Generation (RAG). It addresses efficiency issues in traditional fragmented workflows and provides core features such as parallel inference, blind test evaluation, and dataset generation to support systematic model experimentation and optimization.

## Background: Pain Points of Traditional LLM Experiments and Platform Design Philosophy

Traditional LLM experiments require switching between multiple tools (scripted API calls, spreadsheets for recording evaluations, text editors for organizing data), which is inefficient and prone to human error. Second Brain follows a "from test console to scientific laboratory" philosophy and encapsulates the entire experiment workflow in a single web application. The platform uses a Domain-Driven Design (DDD) architecture: the backend is built on FastAPI, the data layer uses PostgreSQL with pgvector for vector search, and the frontend supports mathematical formula rendering.
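As a rough illustration of the filter-then-rerank retrieval that pgvector enables, here is a minimal pure-Python sketch. The chunk fields `id`, `section`, and `vec` are hypothetical names for this example, not the platform's actual schema; in production the same logic would run as a SQL query against pgvector.

```python
import math

def cosine(u, v):
    # Cosine similarity between two equal-length vectors.
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)

def retrieve(chunks, query_vec, *, section=None, top_k=3):
    # 1) Metadata pre-filter (hypothetical "section" field).
    pool = [c for c in chunks if section is None or c["section"] == section]
    # 2) Vector re-rank; tie-break on chunk id so ordering is deterministic.
    pool.sort(key=lambda c: (-cosine(c["vec"], query_vec), c["id"]))
    return pool[:top_k]

chunks = [
    {"id": 1, "section": "intro",   "vec": [1.0, 0.0]},
    {"id": 2, "section": "methods", "vec": [0.9, 0.1]},
    {"id": 3, "section": "methods", "vec": [0.0, 1.0]},
]
top = retrieve(chunks, [1.0, 0.0], section="methods", top_k=1)
```

The deterministic tie-break on `id` matters for reproducibility: without it, chunks with equal similarity could come back in a different order between runs.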

## Core Features: Parallel Inference, Blind Test Evaluation, and a Closed Data Loop

1. **Deterministic Parallel Inference**: Send requests to two models/prompts simultaneously with identical parameters and RAG context, eliminating timing noise and making A/B tests more scientifically rigorous;
2. **Blind Test Evaluation**: Hide the models' real identities and display only "Model A/B", avoiding brand bias and keeping evaluations objective;
3. **Semantic-Level Text Comparison**: Highlight output differences using jsdiff, making it easy to spot hallucinations, omissions, and other subtle details;
4. **Gold-Standard Dataset Export**: Automatically export evaluation results as JSONL, compatible with mainstream training frameworks, shortening the experiment-to-training loop;
5. **Advanced RAG Pipeline**: Support metadata pre-filtering (document section, date, etc.) plus vector re-ranking, with deterministic sorting to ensure reproducible experiments.
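A minimal sketch of the deterministic-parallel idea in point 1, with the provider call stubbed out (the real platform would call Ollama or another backend): both models receive one frozen request object, so parameters and RAG context cannot drift between the two calls.

```python
import asyncio
from dataclasses import dataclass

@dataclass(frozen=True)
class InferenceRequest:
    prompt: str
    temperature: float
    seed: int
    context: str          # the shared RAG context

async def call_model(model: str, req: InferenceRequest) -> dict:
    # Stand-in for a real provider call (e.g. an HTTP request to Ollama).
    await asyncio.sleep(0)
    return {"model": model, "output": f"[{model}] {req.prompt}"}

async def parallel_inference(model_a: str, model_b: str, req: InferenceRequest):
    # Both calls share the same frozen request, so parameters and RAG
    # context are identical by construction; gather runs them concurrently.
    return await asyncio.gather(call_model(model_a, req),
                                call_model(model_b, req))

req = InferenceRequest("What is RLHF?", temperature=0.0, seed=42, context="")
res_a, res_b = asyncio.run(parallel_inference("model-a", "model-b", req))
```

Freezing the dataclass makes accidental per-model parameter tweaks a runtime error rather than a silent confound in the A/B comparison.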

## Technical Architecture: DDD Layered Design and Scalability

The platform uses a DDD layered architecture, with code divided into 5 layers:
- **api/**: FastAPI routing layer for handling HTTP requests;
- **core/**: Configuration and environment variable management;
- **repositories/**: Database interaction layer, encapsulating pgvector semantic search;
- **schemas/**: Pydantic models responsible for data validation and serialization;
- **services/**: Core business logic (LLM orchestrator, RAG Pipeline).
The LLM orchestration layer is built around an abstract interface: it supports local Ollama models by default and leaves extension points for other providers, which keeps the codebase easy to maintain and extend.
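The abstract interface described above might look roughly like this sketch; `LLMProvider`, `OllamaProvider`, and `get_provider` are illustrative names, not the project's actual identifiers, and the Ollama call is stubbed so the example runs without a local server.

```python
from abc import ABC, abstractmethod

class LLMProvider(ABC):
    """Abstract interface that every model backend implements."""
    @abstractmethod
    def generate(self, prompt: str, **params) -> str: ...

class OllamaProvider(LLMProvider):
    def __init__(self, model: str, host: str = "http://localhost:11434"):
        self.model = model
        self.host = host

    def generate(self, prompt: str, **params) -> str:
        # A real implementation would POST to the Ollama HTTP API at
        # self.host; stubbed here so the sketch is self-contained.
        return f"(ollama/{self.model}) {prompt}"

# New backends only need to subclass LLMProvider and register here.
PROVIDERS = {"ollama": OllamaProvider}

def get_provider(name: str, **kwargs) -> LLMProvider:
    return PROVIDERS[name](**kwargs)

llm = get_provider("ollama", model="llama3")
out = llm.generate("ping")
```

The registry-plus-ABC pattern is one common way to get the "reserved extension points" the text mentions: services depend only on `LLMProvider`, never on a concrete backend.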

## Applicable Scenarios: Covering Full-Cycle Needs of Model Development

Second Brain is suitable for the following scenarios:
- Data preparation before fine-tuning: collect human preferences through blind tests and generate RLHF comparison pairs;
- Prompt engineering: compare different prompts in parallel to make data-driven decisions;
- RAG system tuning: test the impact of different retrieval strategies and re-ranking algorithms;
- Model capability benchmarking: build an internal evaluation system to track iteration progress.
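For the RLHF scenario above, turning a blind-test verdict into a training record could look like this sketch. The `prompt`/`chosen`/`rejected` field names follow a common preference-pair convention; the platform's actual export schema may differ.

```python
import json

def to_preference_pair(prompt: str, output_a: str,
                       output_b: str, winner: str) -> dict:
    # The blind-test verdict ("A" or "B") decides chosen vs. rejected.
    if winner not in ("A", "B"):
        raise ValueError("winner must be 'A' or 'B'")
    chosen, rejected = ((output_a, output_b) if winner == "A"
                        else (output_b, output_a))
    return {"prompt": prompt, "chosen": chosen, "rejected": rejected}

# One record per blind-test verdict, serialized as JSONL.
verdicts = [("Explain RAG.", "answer one", "answer two", "B")]
jsonl = "\n".join(json.dumps(to_preference_pair(*v), ensure_ascii=False)
                  for v in verdicts)
```

Because the annotator only ever saw "Model A/B", the mapping back to real model outputs happens here, after the verdict is recorded.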

## Conclusion: The Value of an Engineering Experiment Platform

Second Brain elevates LLM experimentation from ad-hoc scripts to an engineering platform. It is not just a collection of tools but a complete methodology: every step from experiment design to data output has been carefully polished, helping teams systematically improve model performance. It is an open-source project worth exploring.
