Zing Forum

docstring-generator: An Automatic Python Docstring Generation Tool Based on AST and Local LLM

A Python docstring generation tool that leverages Abstract Syntax Tree (AST) parsing and local Large Language Model (LLM) inference, supporting three mainstream docstring styles: Google, NumPy, and Sphinx.

Tags: Python, docstrings, AST, local LLM, code documentation, Google style, NumPy style, Sphinx, AI-assisted development, open-source project
Published 2026-05-26 22:07 | Recent activity 2026-05-26 22:29 | Estimated read 6 min

Section 01

Introduction: Core Overview of the docstring-generator Tool

Core Points: docstring-generator combines Abstract Syntax Tree (AST) parsing with local Large Language Model (LLM) inference to generate Python docstrings. It supports the three mainstream docstring styles (Google, NumPy, and Sphinx) and targets two pain points of manual docstring writing: the time it takes and the inconsistency it produces. Because inference runs on a local LLM, source code stays private.

Section 02

Background and Motivation: Challenges in Python Documentation Writing and AI Opportunities

Current State and Challenges of Python Documentation

  • Diverse Styles: Three mainstream styles coexist (Google, NumPy, Sphinx), so teams must choose one and contributors must learn its conventions.
  • Pain Points of Manual Writing: Writing docstrings by hand is time-consuming, consistency is hard to enforce, docstrings must be kept in sync as code changes, and quality varies between authors.

AI-Assisted Opportunities

Advances in LLM technology have made automatic docstring generation practical. Cloud-hosted services, however, raise privacy concerns (source code is sent to a third party) and incur usage costs, which makes local LLM inference the better fit for this task.

Section 03

Technical Implementation: Integration of AST Parsing and Local LLM

AST Parsing Module

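The AST parsing module extracts metadata such as function signatures, parameters, and return type annotations. Because it works on the parsed syntax tree rather than on raw text, extraction is accurate, complete, and reliable for any syntactically valid source file, and the same traversal can be extended to other node types.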
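As an illustration of this kind of extraction, here is a minimal sketch built on Python's standard `ast` module. The function name `extract_signatures` and the shape of the returned dictionaries are assumptions made for this example, not the tool's actual API, and keyword-only parameters are left out for brevity.

```python
import ast

# Sample input; in practice this would be the contents of a .py file.
SOURCE = '''
def scale(values: list[float], factor: float = 2.0) -> list[float]:
    return [v * factor for v in values]
'''


def extract_signatures(source: str) -> list[dict]:
    """Collect name, parameters, and return annotation for each function.

    Hypothetical helper for illustration; only positional parameters
    are handled.
    """
    results = []
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            continue
        positional = node.args.posonlyargs + node.args.args
        # Defaults line up with the last N positional parameters,
        # so pad the front with None for parameters without a default.
        padding = [None] * (len(positional) - len(node.args.defaults))
        defaults = padding + list(node.args.defaults)
        params = [
            {
                "name": arg.arg,
                "type": ast.unparse(arg.annotation) if arg.annotation else None,
                "default": ast.unparse(default) if default is not None else None,
            }
            for arg, default in zip(positional, defaults)
        ]
        results.append({
            "name": node.name,
            "params": params,
            "returns": ast.unparse(node.returns) if node.returns else None,
            "has_docstring": ast.get_docstring(node) is not None,
        })
    return results


functions = extract_signatures(SOURCE)
```

Metadata in this form can be passed to the LLM alongside the function source, so the model does not have to guess parameter names or types. `ast.unparse` requires Python 3.9 or later.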

Local LLM Integration

  • Model Selection: Supports backends such as llama.cpp, Ollama, Transformers, and vLLM.
  • Prompt Engineering: Uses templates to guide the LLM toward style-compliant documentation.
  • Context Management: Collects class/module context and type definitions to improve accuracy.
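The template and context steps above could be sketched as follows. Everything here is hypothetical: the template wording, the `build_prompt` and `generate_docstring` names, and the idea of passing the backend in as a plain callable are assumptions for illustration, not the tool's real interface. A real backend (llama.cpp, Ollama, Transformers, or vLLM) would sit behind the `backend` argument.

```python
from typing import Callable

# Hypothetical prompt template; the tool's real templates are not shown in the article.
PROMPT_TEMPLATE = """You are documenting Python code.
Write a {style}-style docstring for the function below.
Return only the docstring text.

Surrounding context:
{context}

Function source:
{source}
"""

STYLE_NAMES = {"google": "Google", "numpy": "NumPy", "sphinx": "Sphinx"}


def build_prompt(source: str, style: str, context: str = "") -> str:
    """Fill the template with the target style, context, and function source."""
    if style not in STYLE_NAMES:
        raise ValueError(f"unsupported style: {style!r}")
    return PROMPT_TEMPLATE.format(
        style=STYLE_NAMES[style],
        context=context or "(none)",
        source=source,
    )


def generate_docstring(
    source: str,
    style: str,
    backend: Callable[[str], str],
    context: str = "",
) -> str:
    """Send the prompt to any text-in/text-out backend and tidy the reply."""
    raw = backend(build_prompt(source, style, context))
    # Models often wrap the answer in quotes; strip them and stray whitespace.
    return raw.strip().strip('"').strip()


def fake_backend(prompt: str) -> str:
    """Stand-in for a local model so the sketch runs without one installed."""
    return '"""Add two numbers."""'


doc = generate_docstring("def add(a, b):\n    return a + b", "google", fake_backend)
```

Keeping the backend behind a single callable is one way to make the four inference engines interchangeable, and it lets the prompt logic be tested without a model.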

Docstring Style Implementation

The tool provides example code for the Google, NumPy, and Sphinx styles, so a project can pick the one that matches its existing conventions.
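For readers unfamiliar with the three styles, the sketch below documents the same function in each of them. The function itself is invented for this comparison, and the docstrings follow the publicly documented conventions of each style rather than any output captured from the tool.

```python
def scale_google(values, factor=2.0):
    """Scale each value by a factor.

    Args:
        values (list[float]): Numbers to scale.
        factor (float): Multiplier applied to each value. Defaults to 2.0.

    Returns:
        list[float]: The scaled values.
    """
    return [v * factor for v in values]


def scale_numpy(values, factor=2.0):
    """Scale each value by a factor.

    Parameters
    ----------
    values : list[float]
        Numbers to scale.
    factor : float, optional
        Multiplier applied to each value. Default is 2.0.

    Returns
    -------
    list[float]
        The scaled values.
    """
    return [v * factor for v in values]


def scale_sphinx(values, factor=2.0):
    """Scale each value by a factor.

    :param values: Numbers to scale.
    :type values: list[float]
    :param factor: Multiplier applied to each value. Defaults to 2.0.
    :type factor: float
    :returns: The scaled values.
    :rtype: list[float]
    """
    return [v * factor for v in values]
```

The information is identical in all three. Only the section markers differ: indented `Args:`/`Returns:` blocks for Google, underlined section headers for NumPy, and `:param:`/`:rtype:` field lists for Sphinx.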

Section 04

Core Features and Application Scenarios

Core Features

  • Batch Generation: Recursively scans projects, with smart filtering and incremental updates.
  • Interactive Generation: Preview, edit, and selectively accept generated docstrings.
  • Quality Check: Checks completeness and consistency, detects outdated docstrings, and assigns a quality score.
  • Configuration Customization: Project-level configuration (pyproject.toml) and command-line options.
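The batch scan and the completeness check above might look roughly like the sketch below. The helper names, the list of skipped directories, and the rule that a parameter counts as documented when its name appears in the docstring are all assumptions for illustration. The article does not describe how the tool filters files or scores quality.

```python
import ast
import re
from pathlib import Path

# Assumed defaults; a real tool would likely read these from configuration.
SKIP_DIRS = frozenset({".git", ".venv", "__pycache__", "build"})


def iter_python_files(root: Path):
    """Yield .py files under root, skipping directories listed in SKIP_DIRS."""
    for path in sorted(root.rglob("*.py")):
        if not SKIP_DIRS.intersection(path.relative_to(root).parts):
            yield path


def find_undocumented(path: Path) -> list[str]:
    """List public functions and classes in a file that have no docstring."""
    tree = ast.parse(path.read_text(encoding="utf-8"), filename=str(path))
    missing = []
    for node in ast.walk(tree):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)):
            if not node.name.startswith("_") and ast.get_docstring(node) is None:
                missing.append(f"{node.name} (line {node.lineno})")
    return missing


def undocumented_params(func: ast.FunctionDef) -> list[str]:
    """Return parameters whose names never appear in the docstring.

    A crude completeness and staleness signal: a renamed or newly added
    parameter shows up here until the docstring is updated.
    """
    doc = ast.get_docstring(func) or ""
    args = func.args
    names = [a.arg for a in args.posonlyargs + args.args + args.kwonlyargs]
    return [
        name
        for name in names
        if name not in ("self", "cls")
        and not re.search(rf"\b{re.escape(name)}\b", doc)
    ]
```

A caller would loop over `iter_python_files(Path("."))`, send only the entries reported by `find_undocumented` to the model, and flag existing docstrings for review when `undocumented_params` returns a non-empty list.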

Application Scenarios

  • Legacy Code Documentation: Quickly generate baseline documentation for undocumented code.
  • New Project Initiation: Establish documentation norms early and reduce technical debt.
  • API Library Development: Generate standardized API documentation.
  • Team Collaboration: Unify docstring style and improve review efficiency.

Section 05

Technical Advantages and Usage Recommendations

Technical Advantages

  • Privacy Protection: Code is never uploaded to the cloud, which suits sensitive projects.
  • Cost-Effectiveness: There are no API fees; the model is downloaded once.
  • Customizability: Prompt templates can be customized, and models can be fine-tuned.

Usage Recommendations

  • Model Selection: A lightweight model (7B) for individual use, a medium model (13B) for a balance between quality and speed, and a large model (30B+) for demanding scenarios.
  • Best Practices: Adopt the tool gradually, review generated docstrings manually, and keep them maintained as the code changes.
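When choosing between these model sizes, a rough estimate of weight memory helps check whether a model fits on the available hardware. The sketch below uses only the arithmetic of parameter count times bits per weight. It ignores the KV cache, activations, and runtime overhead, so real usage will be higher, and the function name is invented for this example.

```python
def weight_memory_gib(params_billion: float, bits_per_weight: int) -> float:
    """Approximate memory needed for model weights alone, in GiB."""
    total_bytes = params_billion * 1e9 * bits_per_weight / 8
    return total_bytes / 2**30


# Compare 16-bit weights with 4-bit quantized weights for the sizes above.
for size in (7, 13, 30):
    fp16 = weight_memory_gib(size, 16)
    q4 = weight_memory_gib(size, 4)
    print(f"{size}B: 16-bit ~{fp16:.1f} GiB, 4-bit ~{q4:.1f} GiB")
```

By this estimate, 4-bit quantization cuts the weight footprint to a quarter of the 16-bit figure, which is often what makes the larger sizes usable on a single consumer machine.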

Section 06

Summary and Outlook

Summary

docstring-generator combines AST parsing with a local LLM to ease the pain points of docstring writing. It supports multiple docstring styles and keeps code private while remaining customizable.

Outlook

Future plans include supporting more programming languages, smarter context understanding, deep IDE integration, and fine-grained quality assessment to provide developers with a better docstring generation experience.