Zing Forum

LLM Wiki: A Persistent Knowledge Base Construction Workflow Inspired by Karpathy

A multi-agent compatible workflow based on Andrej Karpathy's ideas, transforming raw materials into an LLM-maintained Markdown knowledge base that supports knowledge compounding and audit trails.

Tags: Knowledge Base, Markdown, RAG, Andrej Karpathy, Agents, OpenClaw, Claude Code, Knowledge Management
Published 2026-04-29 03:45 | Recent activity 2026-04-29 03:52 | Estimated read 6 min

Section 01

LLM Wiki: Introduction to the Persistent Knowledge Base Construction Workflow Inspired by Karpathy

LLM Wiki is a multi-agent-compatible workflow based on Andrej Karpathy's ideas. It turns raw materials into an LLM-maintained Markdown knowledge base that supports knowledge compounding and audit trails. It complements traditional RAG rather than replacing it, and it suits in-depth research and long-term projects. Its core principles include Markdown-first storage and traceable sources.

Section 02

From RAG to Knowledge Compounding: Background of the Conceptual Shift

Traditional RAG workflows retrieve document fragments afresh for each query and keep no persistent structure, so the model has to re-interpret the same content again and again. The core idea of LLM Wiki comes from a gist by Karpathy: let LLMs incrementally build and maintain a persistent, interlinked Markdown wiki, so that knowledge compounds instead of being rebuilt from scratch on every query.

Section 03

Detailed Architecture of the LLM Wiki Workflow

LLM Wiki defines a clear two-layer structure:

  • Raw Data Layer (raw/): Stores immutable input materials (PDFs, web pages, etc.) as trusted sources.
  • Wiki Layer (wiki/): Core accumulation area, made up of:
      • Source pages: structured summaries with traceable links
      • Entity pages: key people, organizations, and concepts
      • Concept pages: abstractions integrated across sources
      • Comprehensive pages: in-depth, multi-source analysis
      • Question pages: open and resolved questions
      • Indexes: structured directories
      • Logs: change history
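
The two layers above can be sketched as a small scaffolding script. This is only an illustration: the post names the page types but not their directories, so the subfolder names in `PAGE_DIRS` and the helper `scaffold_vault` are assumptions. Only raw/, wiki/, wiki/index.md, and wiki/log.md come from the post itself.

```python
from pathlib import Path

# Hypothetical subfolder names, one per page type listed in the post.
PAGE_DIRS = ["sources", "entities", "concepts", "comprehensive", "questions"]


def scaffold_vault(root: Path) -> list[Path]:
    """Create raw/ and wiki/ (one folder per page type) plus index and log files."""
    created = []
    for directory in [root / "raw", *(root / "wiki" / name for name in PAGE_DIRS)]:
        directory.mkdir(parents=True, exist_ok=True)
        created.append(directory)
    for filename, heading in [("index.md", "# Index\n"), ("log.md", "# Log\n")]:
        path = root / "wiki" / filename
        if not path.exists():  # never overwrite an existing index or log
            path.write_text(heading, encoding="utf-8")
        created.append(path)
    return created
```

Calling `scaffold_vault(Path("my-vault"))` a second time leaves existing files alone, which matches the idea that the log is append-only history rather than something to regenerate.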

Section 04

Agent-Agnostic Design and Core Principles

Agent-Agnostic Design: Adapters convert the core Markdown workflow into the native formats of various agents (e.g., OpenClaw, Claude Code, Codex), which avoids tool lock-in. The knowledge base itself stays in Markdown, so it remains human-readable, version-controllable, and portable.

Core Principles:

  • Markdown-first: plain text, Git-friendly
  • Traceable sources: pages link back to raw materials
  • Knowledge accumulation: temporary outputs are converted into persistent knowledge
  • Auditability: the index and log track how the knowledge base evolves
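
One way to picture the adapter idea is a script that writes the same core workflow text into each agent's instruction file. This is a rough sketch, not the project's adapter mechanism, which the post does not describe. The names `CORE_WORKFLOW`, `ADAPTER_TARGETS`, and `render_adapters` are hypothetical. The file names CLAUDE.md and AGENTS.md follow conventions commonly used with Claude Code and Codex, and OpenClaw is left out because its native format is not specified here.

```python
from pathlib import Path

# One agent-neutral description of the workflow, kept in plain Markdown.
CORE_WORKFLOW = """\
# LLM Wiki workflow
1. Treat raw/ as immutable input.
2. Create or update pages in wiki/ with links back to raw/.
3. Update wiki/index.md and append to wiki/log.md.
"""

# Assumed mapping from agent to instruction file; not taken from the project.
ADAPTER_TARGETS = {
    "claude-code": "CLAUDE.md",
    "codex": "AGENTS.md",
}


def render_adapters(core: str, out_dir: Path) -> dict[str, Path]:
    """Write the same core workflow into each agent's instruction file."""
    written = {}
    for agent, filename in ADAPTER_TARGETS.items():
        target = out_dir / agent / filename
        target.parent.mkdir(parents=True, exist_ok=True)
        target.write_text(f"<!-- generated for {agent} -->\n{core}", encoding="utf-8")
        written[agent] = target
    return written
```

Because every adapter is generated from one source text, switching agents changes only the wrapper file, while the wiki itself stays untouched.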

Section 05

Practical Usage Flow of LLM Wiki

Typical usage flow:

  1. Place new materials into the raw/ directory
  2. Instruct the agent to "ingest" the source
  3. The agent creates/updates relevant pages in wiki/
  4. Update wiki/index.md and append to wiki/log.md
  5. Save valuable answers as comprehensive pages
  6. Regularly run structure health check scripts to verify integrity

The project includes a demo vault (demo-vault) showing the complete conversion process.
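
The health check in step 6 could look roughly like the sketch below. The project's own verification scripts are not shown in the post, so the two checks here (every page is listed in wiki/index.md, and every relative Markdown link resolves to an existing file) are assumptions about what "integrity" might cover. The function name `check_vault` is hypothetical.

```python
import re
from pathlib import Path

# Matches Markdown inline links and captures the link target.
LINK_RE = re.compile(r"\[[^\]]*\]\(([^)\s]+)\)")


def check_vault(root: Path) -> list[str]:
    """Return human-readable problems; an empty list means the checks passed."""
    wiki = root / "wiki"
    index_text = (wiki / "index.md").read_text(encoding="utf-8")
    problems = []
    for page in sorted(wiki.rglob("*.md")):
        rel = page.relative_to(wiki).as_posix()
        if rel in ("index.md", "log.md"):
            continue  # the index and log are bookkeeping, not content pages
        if rel not in index_text:
            problems.append(f"{rel}: not listed in index.md")
        for target in LINK_RE.findall(page.read_text(encoding="utf-8")):
            if "://" in target or target.startswith("#"):
                continue  # skip external URLs and in-page anchors
            if not (page.parent / target.split("#")[0]).exists():
                problems.append(f"{rel}: broken link {target}")
    return problems
```

A source page that links to its raw file with a relative path such as `../../raw/paper.pdf` passes the link check only while that file still exists, which is a cheap way to enforce the traceable-sources principle.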

Section 06

Complementary Relationship Between LLM Wiki and RAG

LLM Wiki does not replace RAG; instead, it complements it:

  • RAG is suitable for: Quick Q&A, temporary queries, instant access to latest information
  • LLM Wiki is suitable for: In-depth research, long-term projects, knowledge domains requiring continuous accumulation and repeated references

The two can be combined: RAG handles real-time retrieval, while LLM Wiki manages organized and verified core knowledge.

Section 07

Applicable Scenarios and Project Status of LLM Wiki

Applicable Scenarios:

  • Academic research: literature reviews, research question tracking
  • Investment decisions: industry insights, argument formation
  • Product development: competitor analysis, user research
  • Personal learning: cross-domain, lifelong learning notes

Project Status: The current version is a 0.3.0 draft. It provides multi-agent starter packs, shared templates, verification scripts, and demo vaults. The roadmap includes richer OpenClaw integration, structured metadata, automatic reports, CLI tools, and release automation, among other items.