Distills uncached files into the Recursive Language Model(RLM) Summary cache Ledger. You (the agent) ARE the distillation engine. Read each file deeply, write a high-quality 1-sentence summary, inject it via inject_summary.py. The purpose is if you read the full file once and produce a great summary once it will avoid the need to read the file every time you need to know what the script does or what the details of the file are. most cases the RLM summary should be sufficient. Use when files are missing from the ledger and need to be summarized. <example> user: "Summarize these new plugin files into the RLM ledger" assistant: "I'll use rlm-distill-agent to read and summarize each file into the cache." </example> <example> user: "The RLM ledger is missing 40 files -- fill the gaps" assistant: "I'll use rlm-distill-agent to process the missing files." </example>
From rlm-factorynpx claudepluginhub richfrem/agent-plugins-skills --plugin rlm-factoryThis skill is limited to using the following tools:
assets/diagrams/rlm_late_binding_flow.mmdassets/diagrams/rlm_tool_enrichment_flow.mmdevals/evals.jsonevals/results.tsvreferences/BLUEPRINT.mdreferences/RLM_ARCHITECTURE.mdreferences/diagrams/distillation_process.mmdreferences/diagrams/logic.mmdreferences/diagrams/rlm-factory-architecture.mmdreferences/diagrams/rlm-factory-architecture.pngreferences/diagrams/rlm-factory-dual-path.mmdreferences/diagrams/rlm-factory-dual-path.pngreferences/diagrams/rlm-factory-workflow.mmdreferences/diagrams/rlm_late_binding_flow.mmdreferences/diagrams/rlm_mechanism_workflow.mmdreferences/diagrams/rlm_mechanism_workflow.pngreferences/diagrams/rlm_tool_enrichment_flow.mmdreferences/diagrams/search_process.mmdreferences/diagrams/unpacking.mmdreferences/diagrams/workflow.mmdGuides Next.js Cache Components and Partial Prerendering (PPR) with cacheComponents enabled. Implements 'use cache', cacheLife(), cacheTag(), revalidateTag(), static/dynamic optimization, and cache debugging.
Migrates code, prompts, and API calls from Claude Sonnet 4.0/4.5 or Opus 4.1 to Opus 4.5, updating model strings on Anthropic, AWS, GCP, Azure platforms.
Optimizes cloud costs on AWS, Azure, GCP via rightsizing, tagging strategies, reserved instances, spot usage, and spending analysis. Use for expense reduction and governance.
This skill requires Python 3.8+ and standard library only. No external packages needed.
To install this skill's dependencies:
pip-compile ./requirements.in
pip install -r ./requirements.txt
See ./requirements.txt for the dependency lockfile (currently empty — standard library only).
You ARE the distillation engine. You replace the local Ollama distiller.py script with your own
superior intelligence. Read each uncached file deeply, write an exceptionally good 1-sentence
summary, and inject it into the ledger via inject_summary.py.
Never run distiller.py. You are faster, smarter, and don't require Ollama to be running.
inventory.py)First-time setup or missing profile? Run the rlm-init skill first:
# See: ../SKILL.md
# Creates rlm_profiles.json, manifest, and empty cache
python3 ./scripts/inventory.py --profile project
python3 ./scripts/inventory.py --profile tools
Read the entire file with view_file. Do not skim.
A great RLM summary answers: "What does this file do, what problem does it solve, and what are its key components/functions?" in one dense sentence.
python3 ./scripts/inject_summary.py \
--profile project \
--file ../SKILL.md \
--summary "Provides atomic vault CRUD operations for Obsidian notes using POSIX rename and fcntl.flock."
The script uses fcntl.flock for safe concurrent writes. Never write to the JSON directly.
Do not attempt manual distillation for large batches. Choose an engine based on cost and throughput, then delegate to the agent swarm:
| Engine | Model | Cost | Workers | Best For |
|---|---|---|---|---|
--engine copilot | gpt-5-mini (nano tier) | $0 free | --workers 2 (rate-limit safe) | Bulk summarization, zero-cost default |
--engine gemini | gemini-3-pro-preview | $0 free | --workers 5 | Large-context batches, higher throughput |
--engine claude | Haiku / Sonnet | Low-Medium | --workers 3 | Higher quality summaries, not free |
| Local Ollama | granite3.2:8b | $0 (CPU) | 1 (serial) | Offline / air-gapped only |
Default recommendation: start with --engine copilot (free, no rate risk at workers=2).
Switch to --engine gemini --workers 5 if you need faster throughput.
Delegate to the agent-loops:agent-swarm skill with the appropriate engine and job:
| Engine | Model | Cost | Workers | Best For |
|---|---|---|---|---|
copilot | gpt-5-mini (nano tier) | $0 free | 2 (rate-limit safe) | Bulk summarization, zero-cost default |
gemini | gemini-3-pro-preview | $0 free | 5 | Large-context batches, higher throughput |
claude | Haiku / Sonnet | Low-Medium | 3 | Higher quality summaries, not free |
Provide a job file describing the summarization task and the gap file from inventory.py --missing.
See SKILL.md for full swarm configuration options.
| Good | Bad |
|---|---|
| "Atomic vault CRUD using POSIX rename + flock, preserving YAML frontmatter via ruamel.yaml." | "This file handles file operations." |
| "3-phase search skill: RLM ledger -> ChromaDB -> grep, escalating from O(1) to exact match." | "Searches for things in the codebase." |
distiller.py -- it calls Ollama, which is slow and may not be running.*_cache.json directly -- always use inject_summary.py (uses fcntl.flock).