Zing Forum

llm-doc-generator: A Complete Solution for Automatically Generating Code Documentation Using Large Language Models

A full-stack web application that automatically generates structured Markdown documentation from any Git repository, integrating multiple LLM providers, real-time progress streaming, intelligent deduplication, and broad language support.

Tags: LLM, documentation generation, code documentation, Angular, Spring Boot, Git, automated documentation, OpenAI, Claude, Ollama
Published 2026-04-11 20:06 · Recent activity 2026-04-11 20:18 · Estimated read: 6 min

Section 01

Introduction: llm-doc-generator—An AI-Driven Solution for Automatic Code Documentation Generation

llm-doc-generator is a full-stack web application that automatically generates structured Markdown documentation from any Git repository. It integrates core features such as multiple LLM providers (OpenAI, Claude, Ollama), real-time progress streaming, intelligent deduplication, and broad language support. It addresses two common pain points in software development: documentation is time-consuming to write and quickly drifts out of sync with the code. The result is an efficient documentation-generation solution for teams and open-source projects.

Section 02

Project Background and Core Design Philosophy

Traditional documentation generators can only extract code comments or function signatures, so they lack context and architectural explanation, while manually written documentation is time-consuming and easily falls out of step with the code. The core philosophy of llm-doc-generator is to let AI understand code rather than merely parse it. The project adopts a full-stack architecture with an Angular 21 frontend and a Spring Boot 4.0.3 backend, inspired by ReadMeReady but substantially extended in both functionality and architecture.

Section 03

Multi-LLM Support and Flexible Choices

The project natively supports the OpenAI GPT series, the Anthropic Claude series, and local Ollama models, so users can choose according to their needs (for example, running Ollama locally keeps code private and reduces API costs). The backend provides a unified abstraction through Spring AI 2.0-M2, making it straightforward to add new models. Ollama defaults to gemma3 and can be configured to use other models.
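The provider abstraction described above can be sketched as a small Java interface plus a registry that selects a provider by name. This is a minimal illustration of the pattern, not the project's actual API: the names `LlmProvider` and `ProviderRegistry` are hypothetical, and the real implementation delegates to Spring AI rather than hand-rolled classes.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of a pluggable LLM-provider abstraction.
// Each provider (OpenAI, Claude, Ollama, ...) implements one interface,
// so adding a new model is a matter of registering one more implementation.
interface LlmProvider {
    String name();                     // e.g. "openai", "claude", "ollama"
    String generate(String prompt);    // returns the model's completion
}

final class ProviderRegistry {
    private final Map<String, LlmProvider> providers = new HashMap<>();

    ProviderRegistry(LlmProvider... ps) {
        for (LlmProvider p : ps) providers.put(p.name(), p);
    }

    // Look up a provider by its configured name; fail fast if unknown.
    LlmProvider get(String name) {
        LlmProvider p = providers.get(name);
        if (p == null) throw new IllegalArgumentException("Unknown provider: " + name);
        return p;
    }
}
```

With this shape, the documentation pipeline only ever calls `generate(prompt)` and never needs to know which backend is answering.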

Section 04

Real-Time Progress Streaming and User Experience Optimization

Documentation generation can take a while for large repositories. The project streams progress in real time via Server-Sent Events (SSE), so users can see which file is currently being analyzed, the completion percentage, and the estimated time remaining. The frontend handles these reactive data streams with RxJS 7.8 and also provides a job history view for browsing past task statuses and results.
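To make the SSE mechanism concrete, the sketch below formats a progress update as a raw SSE frame. The wire format (an `event:` line, a `data:` line, and a blank line terminating the frame) is the standard one; the event name `progress` and the JSON fields `file` and `percent` are illustrative guesses, not the project's actual payload schema.

```java
// Illustrative sketch: serializing a progress update as a
// Server-Sent Events frame. Field names are assumptions.
final class ProgressEvent {
    final String currentFile;
    final int percent;

    ProgressEvent(String currentFile, int percent) {
        this.currentFile = currentFile;
        this.percent = percent;
    }

    // An SSE frame is plain text: "event:" names the event type,
    // "data:" carries the payload, and a blank line ends the frame.
    String toSseFrame() {
        return "event: progress\n"
             + "data: {\"file\":\"" + currentFile + "\",\"percent\":" + percent + "}\n\n";
    }
}
```

In a Spring Boot backend, such frames would typically be pushed through an `SseEmitter`; on the Angular side, an `EventSource` subscription feeds the RxJS stream.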

Section 05

Intelligent Deduplication and Caching Mechanism

To avoid repeated LLM calls, the system checks whether a result for the same repository URL and commit SHA already exists in the cache; if the entry has not expired, it is returned directly, reducing cost and shortening response time. The system also automatically cleans up jobs older than 24 hours to prevent storage bloat.
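The dedup logic above can be sketched as a cache keyed by repository URL plus commit SHA with a TTL-based expiry check. This is a minimal in-memory illustration under assumed names (`DocCache`); the project's real implementation likely persists entries in PostgreSQL rather than a `HashMap`.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.HashMap;
import java.util.Map;

// Sketch of the dedup cache: results keyed by repo URL + commit SHA,
// with entries treated as stale once they exceed the configured TTL.
final class DocCache {
    private record Entry(String markdown, Instant storedAt) {}

    private final Map<String, Entry> entries = new HashMap<>();
    private final Duration ttl;

    DocCache(Duration ttl) { this.ttl = ttl; }

    // The same repo at the same commit always maps to the same key,
    // so identical generation requests hit the cache instead of the LLM.
    private static String key(String repoUrl, String commitSha) {
        return repoUrl + "@" + commitSha;
    }

    void put(String repoUrl, String commitSha, String markdown) {
        entries.put(key(repoUrl, commitSha), new Entry(markdown, Instant.now()));
    }

    // Returns the cached documentation, or null if absent or expired.
    String get(String repoUrl, String commitSha) {
        Entry e = entries.get(key(repoUrl, commitSha));
        if (e == null || e.storedAt().plus(ttl).isBefore(Instant.now())) return null;
        return e.markdown();
    }
}
```

Keying on the commit SHA (not just the URL) is what makes the cache safe: any new push changes the SHA and forces a fresh generation.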

Section 06

Multi-Language Support, Custom Prompts, and Deployment Solutions

The tool supports multiple programming languages such as Java, Kotlin, TypeScript, and Python, identified via file extensions. Users can customize prompt templates to fit project requirements (e.g., security compliance, API examples). Deployment is supported via Docker Compose (PostgreSQL 17; Spring Boot on port 8080, Angular on port 4200) or local development (requires Java 21, Maven 3.9+, etc.).
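Extension-based language identification is simple enough to sketch directly. The mapping below covers the four languages the article names; the class name `LanguageDetector` and the exact extension table are assumptions, not the project's actual code.

```java
import java.util.Map;

// Sketch of extension-based language detection, as described above.
final class LanguageDetector {
    private static final Map<String, String> EXT_TO_LANG = Map.of(
        "java", "Java",
        "kt", "Kotlin",
        "ts", "TypeScript",
        "py", "Python");

    // Returns the language name, or "Unknown" when the file has no
    // extension or the extension is not in the table.
    static String detect(String fileName) {
        int dot = fileName.lastIndexOf('.');
        if (dot < 0 || dot == fileName.length() - 1) return "Unknown";
        return EXT_TO_LANG.getOrDefault(
            fileName.substring(dot + 1).toLowerCase(), "Unknown");
    }
}
```

The detected language can then be interpolated into the prompt template so the LLM is told what kind of source it is reading.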

Section 07

Security Considerations and Future Development Directions

The project is currently a school/demo project, and its API has no authentication; production deployments must add authentication and authorization. Future plans include introducing RAG (storing code embeddings in a vector database), model fine-tuning, CI/CD pipelines, and UI/UX improvements such as dark mode and PDF export.

Section 08

Summary and Application Scenarios

llm-doc-generator is well suited to quickly generating an overview of an unfamiliar codebase, producing open-source project documentation, summarizing changes for code review, or running as part of a CI pipeline to keep documentation up to date automatically. By combining LLM capabilities with sound software engineering practices, it meaningfully improves developer productivity.