Modern AI coding assistants (such as Claude Code, Codex, and Cursor) have profoundly changed the way developers work. They can understand natural language instructions, execute complex tool-call sequences, and complete tasks ranging from code review to deployment and operations. However, this capability comes at a cost: each task execution requires multiple rounds of LLM inference, consuming large numbers of tokens and incurring significant latency and expense.
Consider a typical DevOps scenario: troubleshooting service failures. An agent may need to perform the following steps:
- Run `kubectl get pods` to check pod status
- Execute `kubectl logs` to get the logs of a specific pod
- Use `grep` to search the logs for error keywords
- Infer possible causes from the error messages
- Modify configuration files or restart services
- Verify that the fix worked
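The steps above can be sketched as a fixed command pipeline. This is an illustrative sketch, not code from AgentJIT itself; the pod name, namespace, and error keyword are hypothetical placeholders, and the command runner is injectable so the workflow can be exercised without a live cluster.

```python
import subprocess

# Hypothetical diagnostic pipeline: pod/namespace names are placeholders.
DIAGNOSTIC_STEPS = [
    ["kubectl", "get", "pods", "-n", "payments"],             # check pod status
    ["kubectl", "logs", "payments-api-0", "-n", "payments"],  # fetch pod logs
]

def run_diagnostics(steps, run=subprocess.run):
    """Run each step and collect its stdout.

    `run` is injectable so the same workflow can be replayed or tested
    without talking to a real cluster.
    """
    outputs = []
    for cmd in steps:
        result = run(cmd, capture_output=True, text=True)
        outputs.append(result.stdout)
    return outputs

def find_errors(log_text, keyword="ERROR"):
    # The equivalent of `grep ERROR` over the collected logs.
    return [line for line in log_text.splitlines() if keyword in line]
```

An agent performs these same mechanical steps each time; only the inference in between (choosing the keyword, interpreting the errors) varies.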
If such failures occur frequently (e.g., caused by a known bug), the agent repeats the same reasoning process every time. What has become "muscle memory" for a human is, for the AI, a fresh deliberation on every occurrence, costing roughly 10,000 tokens and more than 30 seconds.
AgentJIT's core insight is that these repetitive workflows can be learned, compiled, and optimized, and ultimately transformed into near-instant deterministic execution.
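A minimal sketch of that insight, assuming a cache keyed by failure signature (this is illustrative, not AgentJIT's actual API): the expensive LLM reasoning runs only on the first occurrence, and later occurrences replay the recorded steps deterministically.

```python
# Hypothetical workflow cache: failure signature -> recorded fix steps.
workflow_cache = {}

def handle_failure(signature, diagnose_with_llm):
    """Return the fix steps for a failure signature.

    `diagnose_with_llm` stands in for the expensive multi-round agent
    loop; it is invoked only on a cache miss.
    """
    if signature in workflow_cache:
        # Cache hit: near-instant, no LLM tokens consumed.
        return workflow_cache[signature]
    # Cache miss: pay the full ~10k-token reasoning cost once...
    steps = diagnose_with_llm(signature)
    # ...then "compile" the result for every future occurrence.
    workflow_cache[signature] = steps
    return steps
```

Real systems need more than a dictionary (signature matching, invalidation when the environment changes, safety checks before replay), but the cost structure is the same: reason once, execute many times.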