AST-Analyzer: A Precise Code Context Extraction Engine for LLMs

AST-Analyzer is a static analysis tool written in Go. It uses AST parsing and call graph analysis to extract the context around a given symbol from TypeScript codebases, replacing the common practice of "stuffing entire files into prompts".

Tags: static analysis, AST, code context extraction, LLM toolchain, TypeScript, call graph, Tree-sitter, code understanding

Published 2026-05-26 12:14 | Recent activity 2026-05-26 12:19 | Estimated read 6 min

Section 01

AST-Analyzer: Introduction to the Precise Code Context Extraction Engine for LLMs

AST-Analyzer is a static analysis tool written in Go that gives an LLM only the code relevant to a chosen symbol in a TypeScript codebase. It builds that context through AST parsing and call graph analysis rather than by "stuffing entire files into prompts". The project is maintained by jairo-litman, its source code is hosted on GitHub (https://github.com/jairo-litman/ast-analyzer), and it was released on May 26, 2026.

Section 02

Project Background and Motivation

When providing code context to LLMs, developers face a dilemma: stuffing entire files into prompts floods the context window with irrelevant content, while selecting snippets by hand is time-consuming and tends to miss key dependencies. AST-Analyzer was started by students at São Paulo State University (UNESP) in Brazil to address this pain point with "surgical precision" extraction. Given a target symbol, it returns the symbol's definition body, the header of its containing class declaration, the signatures of its callers and callees, the type declarations it references, and the file's import statements.

Section 03

Core Technical Architecture

Dual Graph-Driven Dependency Analysis

Call Graph: Traverses TypeScript/TSX projects via the Tree-sitter parser to identify call relationships between functions, methods, and classes, answering "who calls me" and "who do I call".
Type Reference Graph: Tracks dependency relationships of explicit and inferred types, revealing implicit type contracts.
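
To make the "who calls me" and "who do I call" questions concrete, here is a minimal sketch of a call graph that stores every edge in both directions, so either question is a single lookup. The CallGraph type, its methods, and the symbol names are hypothetical and chosen for illustration; they are not taken from AST-Analyzer's source, and a real index would hold symbol IDs produced by the parser rather than hand-written strings.

```go
package main

import (
	"fmt"
	"sort"
)

// CallGraph stores directed call edges in both directions.
// Hypothetical type for illustration, not AST-Analyzer's API.
type CallGraph struct {
	callees map[string]map[string]bool // caller -> set of callees
	callers map[string]map[string]bool // callee -> set of callers
}

func NewCallGraph() *CallGraph {
	return &CallGraph{
		callees: make(map[string]map[string]bool),
		callers: make(map[string]map[string]bool),
	}
}

// AddCall records that the symbol from calls the symbol to.
func (g *CallGraph) AddCall(from, to string) {
	if g.callees[from] == nil {
		g.callees[from] = make(map[string]bool)
	}
	if g.callers[to] == nil {
		g.callers[to] = make(map[string]bool)
	}
	g.callees[from][to] = true
	g.callers[to][from] = true
}

// sortedKeys returns the members of a set in sorted order for stable output.
func sortedKeys(set map[string]bool) []string {
	keys := make([]string, 0, len(set))
	for k := range set {
		keys = append(keys, k)
	}
	sort.Strings(keys)
	return keys
}

// CalleesOf answers "who do I call"; CallersOf answers "who calls me".
func (g *CallGraph) CalleesOf(sym string) []string { return sortedKeys(g.callees[sym]) }
func (g *CallGraph) CallersOf(sym string) []string { return sortedKeys(g.callers[sym]) }

func main() {
	g := NewCallGraph()
	g.AddCall("OrderService.checkout", "PaymentClient.charge")
	g.AddCall("OrderService.checkout", "formatReceipt")
	g.AddCall("CartController.submit", "OrderService.checkout")

	fmt.Println("callers:", g.CallersOf("OrderService.checkout"))
	fmt.Println("callees:", g.CalleesOf("OrderService.checkout"))
}
```

Keeping a reverse map doubles the storage per edge but avoids scanning every edge to find callers. A type reference graph could presumably reuse the same shape, with edges meaning "refers to type" instead of "calls".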

Incremental Indexing Mechanism

Parsing results are persisted in SQLite, and only files whose content hash has changed are re-parsed. This keeps re-indexing cheap enough for daily use on large codebases.
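
The hash-based skip can be sketched as follows. This is an illustration under stated assumptions rather than the project's implementation: an in-memory map stands in for the SQLite table so the example needs no database driver, SHA-256 is an assumed choice of hash, and Index, NewIndex, and Update are hypothetical names.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"sort"
)

// hashContent returns the hex-encoded SHA-256 digest of a file's bytes.
func hashContent(content []byte) string {
	sum := sha256.Sum256(content)
	return hex.EncodeToString(sum[:])
}

// Index remembers the last-seen content hash per file path.
// Assumption: a map replaces the persistent SQLite table described above.
type Index struct {
	hashes map[string]string
}

func NewIndex() *Index {
	return &Index{hashes: make(map[string]string)}
}

// Update calls reparse for new or changed files only and returns the
// paths that were re-parsed, in sorted order.
func (ix *Index) Update(files map[string][]byte, reparse func(path string, content []byte)) []string {
	paths := make([]string, 0, len(files))
	for p := range files {
		paths = append(paths, p)
	}
	sort.Strings(paths) // deterministic processing order

	var changed []string
	for _, p := range paths {
		h := hashContent(files[p])
		if ix.hashes[p] == h {
			continue // content unchanged since the last run: skip parsing
		}
		reparse(p, files[p])
		ix.hashes[p] = h
		changed = append(changed, p)
	}
	return changed
}

func main() {
	ix := NewIndex()
	parse := func(path string, _ []byte) { fmt.Println("parsing", path) }

	files := map[string][]byte{
		"src/a.ts": []byte("export const a = 1"),
		"src/b.ts": []byte("export const b = 2"),
	}
	ix.Update(files, parse) // first run parses both files

	files["src/b.ts"] = []byte("export const b = 3")
	ix.Update(files, parse) // second run parses only src/b.ts
}
```

The sketch does not handle deleted files; a persistent index would also need to drop rows for paths that no longer exist.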

Section 04

Features and Usage

Four-Step Workflow

Index: Scan the project to build call graphs and type graphs
List: View all symbols and their IDs
Extract: Output context in specified format based on symbol ID
Listen: Real-time synchronization of file changes in development mode

Output Formats

Supports three formats: JSON (structured), Redacted (multi-file source code view), and Markdown (directly usable for LLM prompts).

Slice Control Parameters

Precisely control the extraction scope via --caller-depth/--callee-depth, --caller-bodies-up-to/--callee-bodies-up-to, --type-depth, and --max-per-level.
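
The article does not spell out the exact semantics of these flags, so the following is only one plausible reading: a depth value limits how many hops are followed from the target symbol, and --max-per-level caps how many symbols are kept at each hop. The sketch implements that reading as a breadth-first walk; Limits, Slice, and the sample symbol names are hypothetical.

```go
package main

import "fmt"

// Limits are hypothetical stand-ins for a depth flag and --max-per-level.
type Limits struct {
	Depth       int // number of hops to follow from the target
	MaxPerLevel int // cap on symbols kept per hop; 0 means no cap
}

// Slice walks edges breadth-first from target and returns the symbols found
// at each hop: result[0] holds direct neighbours, result[1] the next hop, etc.
// Pass caller edges to walk upward or callee edges to walk downward.
func Slice(edges map[string][]string, target string, lim Limits) [][]string {
	seen := map[string]bool{target: true}
	frontier := []string{target}
	var levels [][]string

	for hop := 0; hop < lim.Depth && len(frontier) > 0; hop++ {
		var next []string
		for _, sym := range frontier {
			for _, n := range edges[sym] {
				if seen[n] {
					continue // skip cycles and symbols already collected
				}
				if lim.MaxPerLevel > 0 && len(next) >= lim.MaxPerLevel {
					break // this hop is full
				}
				seen[n] = true
				next = append(next, n)
			}
		}
		if len(next) == 0 {
			break
		}
		levels = append(levels, next)
		frontier = next
	}
	return levels
}

func main() {
	// callers[x] lists the symbols that call x.
	callers := map[string][]string{
		"checkout": {"submit", "retry", "adminReplay"},
		"submit":   {"router"},
		"retry":    {"scheduler"},
	}
	for i, level := range Slice(callers, "checkout", Limits{Depth: 2, MaxPerLevel: 2}) {
		fmt.Printf("hop %d: %v\n", i+1, level)
	}
}
```

A per-hop cap like this bounds the output size on heavily used symbols, at the cost of dropping some neighbours. Which neighbours get dropped depends on edge order, so a real tool would likely need a ranking rule.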

Section 05

Practical Application Scenarios

Code Review Assistance: Quickly obtain the complete context of a function (callers, callees, type definitions), which is more efficient than manual IDE navigation.
LLM Code Generation: Extract just the right context to help the model understand code intent while avoiding irrelevant details.
Legacy Code Analysis: Visualize dependency relationships via call graphs to quickly understand large, underdocumented codebases.

Section 06

Technical Implementation Highlights

Tree-sitter Parsing: Balances speed and fault tolerance; even if the code has syntax errors, it can still extract most valid information.
Complete Import Parsing: Supports complex scenarios such as tsconfig path aliases, default imports, namespace imports, and re-exports.
Class Inheritance Chain Handling: Automatically parses the inheritance chain of class methods, correctly identifying parent class definitions for this and super calls.
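
The inheritance-chain handling can be pictured with a small sketch: to resolve a method call made through this, start at the current class and walk up the parent links; for a call made through super, start one level higher. The Class record and the helper functions below are hypothetical simplifications (single inheritance, methods identified by name only) and are not taken from AST-Analyzer's source.

```go
package main

import "fmt"

// Class is a hypothetical, heavily simplified record of one indexed class.
type Class struct {
	Parent  string          // empty when the class extends nothing
	Methods map[string]bool // methods declared directly on this class
}

// resolveMethod walks the inheritance chain from start and returns the
// first class that declares method.
func resolveMethod(classes map[string]Class, start, method string) (string, bool) {
	visited := map[string]bool{}
	name := start
	for name != "" && !visited[name] { // visited guards against malformed cycles
		visited[name] = true
		c, ok := classes[name]
		if !ok {
			return "", false // parent is not indexed, e.g. it comes from a library
		}
		if c.Methods[method] {
			return name, true
		}
		name = c.Parent
	}
	return "", false
}

// resolveThis handles this.method() written inside class cls.
func resolveThis(classes map[string]Class, cls, method string) (string, bool) {
	return resolveMethod(classes, cls, method)
}

// resolveSuper handles super.method() written inside class cls.
func resolveSuper(classes map[string]Class, cls, method string) (string, bool) {
	return resolveMethod(classes, classes[cls].Parent, method)
}

func main() {
	classes := map[string]Class{
		"Base": {Methods: map[string]bool{"save": true, "validate": true}},
		"User": {Parent: "Base", Methods: map[string]bool{"validate": true}},
	}
	owner, _ := resolveThis(classes, "User", "save")
	fmt.Println("this.save() in User is declared on", owner)
	owner, _ = resolveSuper(classes, "User", "validate")
	fmt.Println("super.validate() in User is declared on", owner)
}
```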

Section 07

Project Significance and Insights

AST-Analyzer represents a smarter approach to code context management, providing a precise code information transfer solution for the LLM toolchain. Its open-source implementation offers an extensible foundation for the community; future improvements could explore refining extraction strategies, supporting more languages, or integrating into IDE plugins.