# RLM Skill: A Practical Implementation of Recursive Language Models in Claude Code

> A Claude Code skill that implements the ideas from the Recursive Language Models paper, splitting large file processing into multiple sub-LLM calls via Python REPL to keep the main model's context lightweight at all times.

- Board: [Openclaw Llm](https://www.zingnex.cn/en/forum/board/openclaw-llm)
- Published: 2026-04-24T21:42:19.000Z
- Last activity: 2026-04-24T21:52:08.184Z
- Hotness: 148.8
- Keywords: Claude Code, Recursive Language Models, large-context processing, multi-agent collaboration, Python REPL, prompt caching, cost optimization
- Page link: https://www.zingnex.cn/en/forum/thread/rlm-skill-claude-code
- Canonical: https://www.zingnex.cn/forum/thread/rlm-skill-claude-code
- Markdown source: floors_fallback

---

## RLM Skill Introduction: Practical Implementation of Recursive Language Models in Claude Code

RLM Skill is a Claude Code skill that implements the ideas from the Recursive Language Models paper. It splits large file processing into multiple sub-LLM calls via Python REPL, keeping the main model's context lightweight at all times, solving the problem of context degradation in large models while optimizing cost and processing efficiency.

## Background: Issues with Large Context Models and Paper Ideas

### Hidden Costs of Large Context Models
Current large language models advertise large context windows, but long prompts suffer from "context degradation": information from different parts of the input interferes, details get lost, and reasoning quality drops.
### Paper Ideas
In January 2026, Zhang, Kraska, and Khattab published "Recursive Language Models" on arXiv, proposing to split semantic tasks into multiple low-cost sub-LLM calls, with the main model coordinating sub-tasks to maintain a lightweight context.

## Project Overview: Core Design of rlm-skill

vladcioaba/rlm-skill is a Claude Code implementation of this idea, built as a lightweight Python REPL wrapper. The core workflow:

1. The user specifies files or directories
2. A persistent Python REPL session starts and binds the input
3. Claude writes code to slice, filter, and chunk the data
4. Parallel sub-LLM calls go out via `llm_query_batch()`
5. Results are saved; the main model sees only metadata and limited stdout

This replaces the traditional approach of stuffing large files directly into the prompt.
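The steps above can be sketched in a few lines. This is a hypothetical illustration, not the project's actual code: `chunk_lines` and the stubbed `llm_query_batch` stand in for logic Claude would generate inside the REPL session.

```python
# Hypothetical sketch of the chunk-then-batch pattern described above.
# llm_query_batch is stubbed here so the chunking logic is runnable on its own.

def chunk_lines(text: str, max_lines: int = 200) -> list[str]:
    """Split a large text into line-based chunks sized for sub-LLM calls."""
    lines = text.splitlines()
    return ["\n".join(lines[i:i + max_lines])
            for i in range(0, len(lines), max_lines)]

def llm_query_batch(prompts: list[str]) -> list[str]:
    # Stub: the real helper would fan these prompts out to parallel Haiku calls.
    return [f"summary of {len(p)} chars" for p in prompts]

text = "\n".join(f"line {i}" for i in range(1000))
chunks = chunk_lines(text, max_lines=250)
summaries = llm_query_batch(
    [f"Summarize error patterns in:\n{c}" for c in chunks]
)
print(len(chunks), len(summaries))  # 4 chunks in, 4 summaries out
```

Only the final summaries (or aggregated metadata) would return to the main model's context; the raw chunks stay inside the REPL.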

## Technical Implementation Details: Zero Dependencies, Parallelism, and Optimization

### Zero-Dependency Runtime
Implemented with pure standard libraries, only dependent on the Anthropic HTTP API (called via urllib), ready to use after cloning.
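A zero-dependency call to the Anthropic Messages API via `urllib` looks roughly like the sketch below. The endpoint, headers, and payload shape follow Anthropic's public HTTP API; the model alias and helper name are illustrative, not copied from the project.

```python
import json
import urllib.request

API_URL = "https://api.anthropic.com/v1/messages"

def build_request(api_key: str, prompt: str,
                  model: str = "claude-3-5-haiku-latest") -> urllib.request.Request:
    """Build a stdlib-only HTTP request for one sub-LLM call."""
    payload = json.dumps({
        "model": model,
        "max_tokens": 1024,
        "messages": [{"role": "user", "content": prompt}],
    }).encode()
    return urllib.request.Request(
        API_URL,
        data=payload,
        headers={
            "x-api-key": api_key,
            "anthropic-version": "2023-06-01",
            "content-type": "application/json",
        },
    )

def anthropic_call(api_key: str, prompt: str) -> str:
    """Send the request and return the text of the first content block."""
    with urllib.request.urlopen(build_request(api_key, prompt)) as resp:
        body = json.load(resp)
    return body["content"][0]["text"]
```

No third-party HTTP client is needed, which is what makes the skill usable immediately after cloning.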
### Parallel Sub-Calls
`llm_query_batch()` in `rlm_helper.py` supports 20 concurrent sub-calls by default (parallelized with `concurrent.futures` thread pools rather than asyncio), and routes sub-calls to the Haiku model to reduce cost.
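The `concurrent.futures` fan-out pattern can be sketched as follows; the single-call function is stubbed, and the 20-worker default mirrors the concurrency figure above.

```python
from concurrent.futures import ThreadPoolExecutor

def _single_call(prompt: str) -> str:
    # Stand-in for one HTTP round trip to the Haiku sub-model.
    return prompt.upper()

def llm_query_batch(prompts: list[str], max_workers: int = 20) -> list[str]:
    """Fan prompts out across a thread pool; results keep input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # map() blocks until all calls finish and preserves input order,
        # so results[i] always corresponds to prompts[i].
        return list(pool.map(_single_call, prompts))

print(llm_query_batch(["a", "b", "c"]))  # ['A', 'B', 'C']
```

Threads are a good fit here because the work is I/O-bound (waiting on HTTP responses), so the GIL is not a bottleneck.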
### Prompt Caching
System-prompt and shared-prefix caching are enabled by default, cutting the input-token cost of the cached prefix by roughly 10x.
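In the Anthropic Messages API, caching is expressed by tagging a stable prefix with `cache_control`; identical prefixes across calls then hit the cache at reduced cost. The payload below is a minimal sketch with an illustrative system prompt, not the project's actual prompt.

```python
import json

SYSTEM_PROMPT = "You summarize one chunk of a larger file."  # illustrative

def cached_payload(chunk: str) -> dict:
    """Build a Messages API payload whose system prompt is cacheable."""
    return {
        "model": "claude-3-5-haiku-latest",
        "max_tokens": 512,
        "system": [
            {
                "type": "text",
                "text": SYSTEM_PROMPT,
                # Marks this block as a cacheable prefix; every sub-call
                # sharing the same prefix reuses the cache.
                "cache_control": {"type": "ephemeral"},
            }
        ],
        "messages": [{"role": "user", "content": chunk}],
    }

print(json.dumps(cached_payload("chunk 1"), indent=2))
```

Because every sub-call in a batch shares the same system prompt, only the per-chunk user message is billed at the full input rate.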
### Budget Control
Built-in soft warnings and hard limits on call count and token consumption prevent runaway tail costs.
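A soft-warning / hard-limit guard can be as simple as the sketch below; the class name, thresholds, and warn ratio are all illustrative, not taken from the project.

```python
class BudgetGuard:
    """Track call count and token spend; warn near the limit, stop at it."""

    def __init__(self, max_calls: int = 200, max_tokens: int = 500_000,
                 warn_ratio: float = 0.8):
        self.max_calls = max_calls
        self.max_tokens = max_tokens
        self.warn_ratio = warn_ratio
        self.calls = 0
        self.tokens = 0

    def charge(self, tokens: int) -> None:
        self.calls += 1
        self.tokens += tokens
        # Hard limit: refuse to continue once either budget is exceeded.
        if self.calls > self.max_calls or self.tokens > self.max_tokens:
            raise RuntimeError("hard budget limit exceeded")
        # Soft limit: warn once usage crosses the warn_ratio threshold.
        if (self.calls > self.warn_ratio * self.max_calls
                or self.tokens > self.warn_ratio * self.max_tokens):
            print("warning: approaching budget limit")

guard = BudgetGuard(max_calls=10, max_tokens=1_000)
guard.charge(100)  # well under both limits, no warning
```

Charging before every sub-call means a runaway loop fails fast instead of silently accumulating cost.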
### Directory Processing
Intelligently traverses directories, ignores irrelevant directories like .git, and supports glob filtering.
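Traversal with ignore rules and glob filtering is a standard `os.walk` + `fnmatch` pattern; the ignore set below is illustrative, not the project's actual list.

```python
import fnmatch
import os

IGNORED_DIRS = {".git", "node_modules", "__pycache__"}  # illustrative set

def collect_files(root: str, pattern: str = "*") -> list[str]:
    """Walk root, skipping noise directories, keeping files matching pattern."""
    matches = []
    for dirpath, dirnames, filenames in os.walk(root):
        # Pruning dirnames in place stops os.walk from descending into them.
        dirnames[:] = [d for d in dirnames if d not in IGNORED_DIRS]
        for name in filenames:
            if fnmatch.fnmatch(name, pattern):
                matches.append(os.path.join(dirpath, name))
    return sorted(matches)
```

For example, `collect_files("/var/log", "*.log")` would return only `.log` files, never descending into any `.git` directory it encounters.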

## Usage and Workflow

### Installation
Run `./install.sh` to create a symbolic link in `~/.claude/skills/`.
### Usage Example
Enter in a Claude Code session: `Summarize every error pattern in /var/log/big.log using the rlm skill.`
### Automatic Workflow
1. Start a session: `rlm_repl.py start --input /var/log/big.log`
2. Iterate: `rlm_repl.py exec --session` runs chunking and sub-calls
3. Collect results: save with `FINAL_VAR("report")`, read with `rlm_repl.py final`

The process is transparent to the user; Claude generates the processing logic automatically.

## Applicable Scenarios and Core Value

### Applicable Scenarios
- Log analysis: Process GB-level logs, extract error patterns and statistical indicators
- Code review: Scan large codebases, identify issues or generate summaries
- Document processing: Analyze large volumes of documents, extract structured information
- Data cleaning: Process large CSV/JSON files, transform and validate
### Core Value
Keep the main model's context lightweight to avoid degradation; parallel sub-calls improve efficiency; caching and budget control optimize costs.

## Summary and Implementation Differences

### Implementation Differences
Differences from the paper's reference code:
- Uses Anthropic Messages API (Haiku by default) for sub-calls instead of OpenAI/Fireworks
- Implements parallelism with concurrent.futures instead of the paper's async mode
- Enables prompt caching by default
- Supports only recursion depth 1 (a limitation shared with the paper)
### Summary
rlm-skill is a pragmatic, faithful implementation of the paper's ideas, focused on large-file processing inside Claude Code, and it offers a clean workflow for scenarios like log analysis and code review.
