# RLM Skill
A plugin for Claude Code and OpenCode implementing the Recursive Language Model (RLM) pattern from MIT's paper (arXiv:2512.24601). Instead of stuffing massive data into the context window, RLM treats it as an external variable in a REPL: the model writes code to programmatically examine, decompose, and search the data, so only the results enter context.
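A minimal sketch of that loop in Python (the log contents and variable names here are invented stand-ins): the data lives in a REPL variable, code runs against it, and only the printed summary would reach the model.

```python
# Stand-in for a massive log file; in the real pattern this is loaded
# into the REPL, never pasted into the prompt.
log_text = "ERROR timeout\nINFO ok\nERROR disk full\n" * 1000

# Examine the data programmatically instead of reading it into context.
lines = log_text.splitlines()
error_lines = [line for line in lines if line.startswith("ERROR")]

# Only this one-line summary enters the model's context.
print(f"{len(error_lines)} error lines out of {len(lines)} total")
# → 2000 error lines out of 3000 total
```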
## Install

### Claude Code
```
/plugin marketplace add lets7512/rlm-skill
/plugin install rlm@rlm-skill
```
Restart Claude Code. Done.
Or load for a single session:
```sh
claude --plugin-dir /path/to/rlm-skill
```
### OpenCode
```sh
# Copy agents, commands, and plugins to your user config
cp -r .opencode/agents/ .opencode/commands/ .opencode/plugins/ ~/.config/opencode/
# Or use from project root (auto-detected from .opencode/)

# Install plugin dependencies
cd .opencode && npm install
```
The RLM interceptor plugin (`plugins/rlm-interceptor.ts`) automatically rewrites large file reads into metadata summaries and provides custom tools for sandbox execution and knowledge base search.
In OpenCode, press Ctrl+K, then select:

- `project:rlm` — invoke RLM for large-context tasks
- `project:rlm-stats` — show the token savings dashboard
- `@rlm` — invoke the RLM agent directly
### RLM CLI (optional, for massive datasets)

```sh
uv pip install -e .
```
## Usage
The skill activates automatically when large-context tasks are detected. The interceptor silently rewrites tool calls to prevent raw data from entering context.
Invoke directly in Claude Code:

```
/rlm:rlm analyze this 500MB log file for error patterns
```
Invoke in OpenCode:

```
@rlm analyze this 500MB log file for error patterns
```
Check token savings:

```
/rlm:stats
```
## Custom Tools
The plugin provides MCP tools (Claude Code) and custom tools (OpenCode):
| Tool | Purpose |
|---|---|
| `rlm_execute` | Run code in a sandboxed subprocess (python/js/shell). Only stdout enters context. |
| `rlm_execute_file` | Run code against a file. Content is loaded as the `FILE_CONTENT` variable and never enters context. |
| `rlm_index` | Index file content into an FTS5 knowledge base for later search. |
| `rlm_search` | Search indexed content with smart snippets, BM25 ranking, and a 3-layer fallback (porter/trigram/fuzzy). |
| `rlm_batch_execute` | Run multiple commands and search queries in one call, saving tool-call overhead. |
| `rlm_fetch_and_index` | Fetch a URL, convert HTML to text, then chunk and index it. The raw page never enters context. |
| `rlm_stats` | Show knowledge base statistics (indexed sources, chunk count, DB size). |
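For example, a single `rlm_execute` call might look like this (the `language`/`code` field names mirror the batch example further down; treat the exact schema and the `big.log` file name as assumptions):

```json
{
  "language": "python",
  "code": "print(sum(1 for _ in open('big.log')))"
}
```

Only the printed line count would enter context; the file itself never does.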
### Smart Snippets
Search results use smart snippet extraction — instead of returning full chunks, the search highlights windows around matching query terms, joined with `...` context bridges. This minimizes tokens while preserving relevance.
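The windowing idea can be illustrated with a toy extractor (a sketch of the concept only, not the plugin's actual code):

```python
def smart_snippet(text, terms, window=3):
    """Keep `window` words around each matching term, bridged with '...'."""
    words = text.split()
    keep = set()
    for i, w in enumerate(words):
        if w.lower().strip(".,") in terms:
            keep.update(range(max(0, i - window), min(len(words), i + window + 1)))
    out, prev = [], None
    for i in sorted(keep):
        if prev is not None and i > prev + 1:
            out.append("...")  # bridge over the skipped span
        out.append(words[i])
        prev = i
    return " ".join(out)

print(smart_snippet(
    "The server retries on timeout. More filler words sit here. "
    "A second timeout aborts the request.",
    {"timeout"},
))
```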
### 3-Layer Search Fallback
- **Porter stemming** (FTS5 default) — handles word variants (`running` matches `run`)
- **Trigram substring** — catches partial matches (`config` matches `configuration`)
- **Fuzzy Levenshtein** — tolerates typos (`reuslt` matches `result`)
Results are merged and deduplicated with BM25 ranking.
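Conceptually, layers 2 and 3 can be sketched as follows. Note this uses `difflib` similarity as a stand-in for true Levenshtein distance, and none of it is the plugin's actual code:

```python
import difflib

def trigrams(s):
    # All 3-character windows of the string.
    return {s[i:i + 3] for i in range(len(s) - 2)}

def trigram_match(query, term, threshold=0.5):
    """Layer 2: substring-style matching via shared character trigrams."""
    q, t = trigrams(query.lower()), trigrams(term.lower())
    return bool(q) and len(q & t) / len(q) >= threshold

def fuzzy_match(query, term, threshold=0.8):
    """Layer 3: typo tolerance via edit-similarity (difflib stand-in)."""
    return difflib.SequenceMatcher(None, query.lower(), term.lower()).ratio() >= threshold

print(trigram_match("config", "configuration"))  # → True (partial match)
print(fuzzy_match("reuslt", "result"))           # → True (typo)
```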
### Batch Execute

`rlm_batch_execute` accepts multiple shell commands and search queries in a single tool call:
```json
{
  "commands": [
    { "language": "python", "code": "import json; print(len(json.load(open('data.json'))))" },
    { "language": "shell", "code": "wc -l *.log" }
  ],
  "queries": ["error handling", "timeout config"]
}
```
This saves tool-call overhead when you need to run several operations at once.
### Fetch and Index

`rlm_fetch_and_index` downloads a URL, converts the HTML to clean text (stripping scripts, styles, and tags), chunks the content, and indexes it into the FTS5 knowledge base. The raw page never enters context:
```json
{
  "url": "https://docs.example.com/api-reference",
  "source": "API Docs"
}
```
Then use `rlm_search` with `source: "API Docs"` to query specific sections.
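A scoped search call might then look like this (the exact field names are an assumption based on the tool descriptions above, and the query is illustrative):

```json
{
  "query": "authentication endpoints",
  "source": "API Docs"
}
```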
## CLI Tool
For truly massive datasets (50MB+) that need recursive sub-LLM decomposition:
```sh
# Analyze a repo
rlm-cli query "Find all security issues" --repo /path/to/repo --stats

# Interactive REPL
rlm-cli repl --file /path/to/data.json

# With a local vLLM server
rlm-cli query "Find bugs" --repo . --backend openai --model Qwen/Qwen3-8B --base-url http://localhost:8000/v1
```
## Hooks & Interceptors
**Claude Code** — the PreToolUse hook (`hooks/pretooluse-rlm.mjs`) fires before Read, Bash, and WebFetch tool calls. It uses `updatedInput` for silent rewriting: the model sees the rewritten result without knowing the original was intercepted.
**OpenCode** — the plugin interceptor (`plugins/rlm-interceptor.ts`) uses `tool.execute.before` to silently rewrite large file reads into metadata scripts via `output.args` modification.
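For a sense of what such a rewrite produces, here is an illustrative Python sketch of a metadata summary (the real interceptor is TypeScript, and its threshold, wording, and helper names will differ; everything here is invented):

```python
import os

def metadata_summary(path, threshold=100_000):
    """Return a small summary for files too large to read raw (illustrative)."""
    size = os.path.getsize(path)
    if size <= threshold:
        return None  # small files pass through to the model unchanged
    with open(path, "rb") as f:
        head = f.read(200).decode("utf-8", errors="replace")
    return (
        f"[RLM] {path}: {size} bytes (too large for context).\n"
        f"First 200 bytes:\n{head}\n"
        "Examine it with rlm_execute_file or index it with rlm_index."
    )
```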