Agentic Development Playbook: A Standardized Workflow for AI Programming

A standard-driven workflow for AI programming agents that addresses common issues like context loss and task drift through structured documents and decision logs

Tags: AI programming, workflow, standard-driven, Claude Code, Cursor, code review, decision logs, task management

Published 2026-06-11 04:45 | Recent activity 2026-06-11 04:50 | Estimated read 8 min

Section 01

Agentic Development Playbook: Introduction to the Standard-Driven AI Programming Workflow

This project is maintained by manjast on GitHub and is a standard-driven workflow framework for AI programming agents such as Claude Code and Cursor. Its core idea is that AI-assisted programming requires discipline: it addresses common issues like context loss, task drift, and code review difficulties through structured documents (e.g., decision logs, task lists), making the code repository the single source of truth.

Section 02

Core Problem Background

Common issues in using AI programming agents include:

Decision Loss: Early decisions get buried in long conversations, and the reasons behind technical choices are forgotten;
Task Disconnection: Traditional TODO lists go stale, and task status becomes unclear when AI agents work in parallel;
Difficult Code Review: AI generates large changes at once, producing diffs too big to review easily;
Lack of Evidence: Performance claims in ML projects lack reproducible evidence;
Template Sprawl: Outdated project templates accumulate, and new members don't know which one to choose.

Section 03

Solutions and Work Paths

Solutions to Core Problems

Decision Loss: Append each decision to DECISIONS.md as a record (problem, solution, result, follow-up, date);
Task Disconnection: TASKS.md uses a four-column structure (In Progress/Ready/Blocked/Completed), and STATUS.md tracks status;
Difficult Code Review: task-card.md defines each task's scope (in scope/out of scope), with one task per commit;
Lack of Evidence: GATES.ml-eval.md provides a structured evaluation framework, and run-manifest.json records runtime metadata;
Template Sprawl: Templates are limited to 15 (13 user + 2 evaluation) and maintained via consistency checks.
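
As a rough illustration of the append-only decision log, the sketch below writes one record with the five fields listed above. The Markdown layout and the names `Decision`, `format_decision`, and `append_decision` are assumptions made for this example; the playbook's actual DECISIONS.md template may look different.

```python
from dataclasses import dataclass
from datetime import date
from pathlib import Path


@dataclass
class Decision:
    """One decision record; the five fields mirror the list above."""
    problem: str
    solution: str
    result: str
    follow_up: str
    decided_on: date


def format_decision(d: Decision) -> str:
    # Hypothetical Markdown layout, not the playbook's real template.
    return (
        f"## {d.decided_on.isoformat()}: {d.problem}\n"
        f"- Solution: {d.solution}\n"
        f"- Result: {d.result}\n"
        f"- Follow-up: {d.follow_up}\n"
    )


def append_decision(path: Path, d: Decision) -> None:
    # Append mode: earlier entries are never rewritten, only added to.
    with path.open("a", encoding="utf-8") as f:
        f.write("\n" + format_decision(d))
```

Opening the file in append mode is the point of the sketch: the log grows at the end, so the history of earlier choices stays intact and shows up as a clean addition in a diff.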

Two Work Paths

Core Path: Suited to scenarios with clear standards. Core documents include AGENTS.md, TASKS.md, etc., and the path follows rules like "repository as single source of truth" and "task atomization";
PoC/Evaluation Path: Suited to decision verification scenarios. Dedicated templates like POC-BRIEF.md, REPORT.md, etc. help clarify the decision problem and keep summaries concise.

Section 04

Consistency Check Mechanism

The project runs automated consistency checks in CI (completing in under 5 seconds) that verify the following aspects of structural integrity:

Whether template fields are complete;
Whether run-manifest.json conforms to the reproducible format;
Whether GATES.ml-eval.md contains 7 required sub-checks.

These checks verify structure only, not content quality, which is reviewed by humans. Their job is to ensure that documents follow the standard's format.
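
A structure-only check of this kind can be sketched in a few lines. Everything specific here is an assumption for illustration: that template fields appear as Markdown headings, that sub-checks are written as checklist items, and the function names themselves. The project's real checker may work differently.

```python
import re

REQUIRED_SUBCHECKS = 7  # count stated in the text; the marker syntax is assumed


def missing_fields(text: str, required: list[str]) -> list[str]:
    """Return required field names that have no Markdown heading in text."""
    headings = {
        m.group(1).strip().lower()
        for m in re.finditer(r"^#+[ \t]+(.+)$", text, flags=re.MULTILINE)
    }
    return [name for name in required if name.lower() not in headings]


def count_subchecks(text: str) -> int:
    """Count checklist items ('- [ ]' or '- [x]'), an assumed sub-check marker."""
    return len(re.findall(r"^\s*[-*] \[[ xX]\]", text, flags=re.MULTILINE))


def check_gates(text: str) -> bool:
    # Structure only: verifies the items exist, not whether they are any good.
    return count_subchecks(text) >= REQUIRED_SUBCHECKS
```

Note that the sketch never reads what a field or sub-check says. That matches the division of labor described above: the machine confirms the shape, and a human judges the content.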

Section 05

Practical Application Scenarios

Individual Developers

Record decision history to avoid confusion like "why did I write it this way back then";
Clarify task boundaries to prevent AI from over-engineering;
Keep project status transparent so progress can be checked at any time.

Team Collaboration

Unify communication protocols to reduce context friction;
Atomic tasks support asynchronous code review;
Documented paths help new members get up to speed quickly.

ML Projects

Structured experiment records and reproducible runtime environments;
Clearly present decision evidence to ensure experiments are traceable.
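
To make "reproducible runtime environments" more concrete, here is a minimal sketch of what recording run metadata could look like. The field names (`created_at`, `seed`, `git_commit`, and so on) are illustrative guesses, not the playbook's actual run-manifest.json schema, and the caller is assumed to supply the commit hash.

```python
import json
import platform
from datetime import datetime, timezone


def build_manifest(seed: int, git_commit: str, params: dict) -> dict:
    # Field names are illustrative, not the playbook's actual schema.
    return {
        "created_at": datetime.now(timezone.utc).isoformat(),
        "python_version": platform.python_version(),
        "platform": platform.platform(),
        "seed": seed,
        "git_commit": git_commit,
        "params": params,
    }


def dump_manifest(manifest: dict) -> str:
    # sort_keys gives a stable key order, so diffs between runs stay small.
    return json.dumps(manifest, indent=2, sort_keys=True)
```

Capturing the seed, code version, and parameters next to each result is what lets a reviewer rerun an experiment instead of taking a performance claim on trust.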

Section 06

Relationship with Other Tools

This Playbook is NOT:

A requirement/specification generation system;
A CLI or automation framework;
A multi-agent orchestration product;
A complete methodology covering all stages.

It is a lightweight, tool-agnostic convention layer that can be used with any AI programming tool, such as Claude Code, Cursor, or GitHub Copilot.

Section 07

Summary and Reflections

The Agentic Development Playbook marks the evolution of AI-assisted programming from "letting AI write code" to "standardized pair programming". The core insight is that tools themselves cannot guarantee quality; discipline can.

Its value lies not only in providing templates but also in the way of thinking it advocates:

Explicit and traceable decisions;
Atomic and reviewable tasks;
Ensuring quality through structural constraints rather than bureaucratic processes.

For teams and developers exploring best practices for AI programming, it is a well-thought-out starting point. It is not a silver bullet, but it addresses real problems.
