Set up and operate the RLM (Recursive Language Models) Orchestrator for processing arbitrarily large contexts. Handles Rust builds, Ollama/LiteLLM provider configuration, WASM compilation targets, and query workflows. Use when: setting up RLM project, configuring LLM providers for RLM, running queries against large files (10MB+), troubleshooting WASM compilation errors, or analyzing conversation exports.
Status: Beta | Last Updated: 2026-01-25
Dependencies: Rust (via rustup), MSVC Build Tools (Windows), Ollama or DeepSeek API
Latest Versions: rlm-orchestrator@0.2.0, rustup@1.28.2, wasmtime@27.0.0
git clone https://github.com/softwarewrighter/rlm-project.git D:/rlm-project
cd D:/rlm-project/rlm-orchestrator
# Install rustup (NOT scoop rust - need rustup for targets)
winget install Rustlang.Rustup --silent --accept-package-agreements
# Refresh PATH (or restart terminal)
$env:PATH = "$env:USERPROFILE\.cargo\bin;$env:PATH"
# Add WASM target
rustup target add wasm32-unknown-unknown
# Verify
rustc --version
rustup target list --installed | Select-String wasm
CRITICAL: Do NOT use `scoop install rust` - it lacks rustup, which manages the compilation targets required by `rust_wasm_mapreduce` commands.
# Download and install with C++ workload
winget install Microsoft.VisualStudio.2022.BuildTools
# Then run installer with required components
C:\temp\vs_buildtools.exe --add Microsoft.VisualStudio.Workload.VCTools `
--add Microsoft.VisualStudio.Component.VC.Tools.x86.x64 `
--add Microsoft.VisualStudio.Component.Windows11SDK.22621 `
--quiet --wait
Why this matters: without the MSVC Build Tools, builds fail with `linking with link.exe failed` errors, because Git's `link.exe` is NOT the same as MSVC's linker.
cd D:\rlm-project\rlm-orchestrator
cargo build --release
Build takes ~2-3 minutes the first time. Outputs:
- target/release/rlm-server.exe - HTTP server with visualizer
- target/release/rlm.exe - CLI tool
Create config-local.toml for your Ollama setup:
max_iterations = 20
max_sub_calls = 50
output_limit = 10000
bypass_enabled = true
bypass_threshold = 4000
level_priority = ["dsl", "wasm"]
[dsl]
enabled = true
max_regex_matches = 10000
[wasm]
enabled = true
rust_wasm_enabled = true
fuel_limit = 1000000      # WASM execution fuel budget (prevents runaway loops)
memory_limit = 67108864   # 64 MiB
# Code generation via Ollama
codegen_provider = "ollama"
codegen_url = "http://192.168.1.120:11434"
codegen_model = "qwen2.5:14b-instruct-q4_K_M"
# Root LLM (needs 32B+ for reliable JSON)
[[providers]]
provider_type = "ollama"
base_url = "http://192.168.1.120:11434"
model = "qwen2.5:14b-instruct-q4_K_M"
role = "root"
weight = 1
# Sub LLM (can be smaller, handles simple tasks)
[[providers]]
provider_type = "ollama"
base_url = "http://192.168.1.120:11434"
model = "qwen3:1.7b-q4_K_M"
role = "sub"
weight = 1
# Start server
.\target\release\rlm-server.exe config-local.toml
# In another terminal, test health
curl http://localhost:4539/health
# Open visualizer
start http://localhost:4539/visualize
RLM runs on Windows, WSL, Linux, and macOS. Detect your environment:
if [[ "$OS" == "Windows_NT" || -n "$MSYSTEM" ]]; then
echo "Windows (Git Bash)"
CARGO_PATH="$HOME/.cargo/bin"
elif grep -qi microsoft /proc/version 2>/dev/null; then
echo "WSL"
CARGO_PATH="$HOME/.cargo/bin"
else
echo "Linux/macOS"
CARGO_PATH="$HOME/.cargo/bin"
fi
| Platform | Installation Method | Notes |
|---|---|---|
| Windows | `winget install Rustlang.Rustup` | Requires MSVC Build Tools |
| WSL/Linux | `curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs \| sh` | Standard |
| macOS | `curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs \| sh` | Xcode CLT required |
rustup target add wasm32-unknown-unknown
Enables:
Enables:
- rust_wasm_intent - LLM generates Rust code compiled to WASM
- rust_wasm_mapreduce - Parallel processing of large contexts
See references/LOCAL_LLM_GUIDE.md for detailed model recommendations.
| Provider | JSON Reliability | Best For | Cost |
|---|---|---|---|
| OpenAI GPT-4o | ✅ Excellent | Production, large files | ~$0.01/query |
| OpenRouter | ✅ Excellent | Multi-model access | Varies |
| DeepSeek API | ✅ Excellent | Cheap + reliable | ~$0.001/query |
| Ollama 70B+ | ⚠️ Good | Privacy, air-gapped | Electricity |
| Ollama 24B-32B | ❌ Unreliable | Sub-calls only | Electricity |
| Ollama 14B | ❌ Very Unreliable | Not recommended for root | Electricity |
Key insight: Local models (14B-24B) struggle with RLM's JSON protocol. Use API providers for root LLM.
Key config sections:
# Limits
max_iterations = 20 # Max RLM loop iterations
max_sub_calls = 50 # Max llm_query sub-calls
output_limit = 10000 # Max chars in command output
# Smart bypass (skip RLM for small contexts)
bypass_enabled = true
bypass_threshold = 4000 # chars (~1000 tokens)
# Feature levels
level_priority = ["dsl", "wasm", "cli", "llm_delegation"]
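The smart-bypass rule above can be sketched as a hypothetical Python helper (the real check lives in the Rust server; this just illustrates the threshold semantics):

```python
def should_bypass(context: str, threshold: int = 4000) -> bool:
    """Skip the full RLM loop for small contexts.

    4000 chars is roughly 1000 tokens at ~4 chars/token.
    """
    return len(context) <= threshold

print(should_bypass("Line 1\nLine 2\nLine 3"))  # True: small context, no RLM overhead
```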
# Health check
curl http://localhost:4539/health
# Expected: {"status":"healthy","version":"0.2.0","wasm_enabled":true,"rust_wasm_enabled":true}
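A script can assert on that expected payload directly; the field names below are taken from the documented response:

```python
import json

# Sample /health body as documented above
body = '{"status":"healthy","version":"0.2.0","wasm_enabled":true,"rust_wasm_enabled":true}'
health = json.loads(body)
assert health["status"] == "healthy" and health["wasm_enabled"]
print(health["version"])  # 0.2.0
```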
# Simple query
curl -X POST http://localhost:4539/query \
-H "Content-Type: application/json" \
-d '{"query": "How many lines?", "context": "Line 1\nLine 2\nLine 3"}'
- Use rustup (not scoop/brew rust) for target management
- Add the wasm32-unknown-unknown target before building
- Check the /health endpoint before running queries
- See references/LOCAL_LLM_GUIDE.md for model recommendations

This skill prevents 6 documented issues:
Error: linking with link.exe failed: exit code: 1
Source: Rust on Windows requires MSVC linker
Why It Happens: Git's link.exe found instead of MSVC's
Prevention: Install VS Build Tools with VCTools workload
Error: error[E0463]: can't find crate for std with note about wasm32-unknown-unknown
Source: WASM compilation requires explicit target
Why It Happens: rustup doesn't include WASM target by default
Prevention: Run rustup target add wasm32-unknown-unknown
Error: Failed to parse JSON command or malformed output
Source: Model too small for RLM protocol
Why It Happens: 7B-14B models can't follow JSON protocol reliably
Prevention: Use 32B+ model for root LLM (or DeepSeek API)
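The failure mode is that small models wrap their JSON command in chatter, which a strict parser rejects. A defensive extraction sketch (illustrative only; RLM's actual parser is Rust-side):

```python
import json
import re

def extract_json(text: str):
    """Try to salvage a JSON object from noisy model output."""
    match = re.search(r"\{.*\}", text, re.DOTALL)
    if match is None:
        return None
    try:
        return json.loads(match.group(0))
    except json.JSONDecodeError:
        return None

# Small models often prepend prose like this, breaking strict parsers
noisy = 'Sure! Here is the command: {"cmd": "grep", "pattern": "ERROR"}'
print(extract_json(noisy))
```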
Error: Address already in use on port 4539/8080
Source: Previous server still running
Why It Happens: Didn't stop previous instance
Prevention: pkill -f rlm-server or check netstat -an | grep 4539
Error: rustup: command not found after installing via scoop
Source: Scoop rust package doesn't include rustup
Why It Happens: Scoop provides standalone rustc, not full toolchain
Prevention: Use winget install Rustlang.Rustup instead
Error: thread 'tokio-runtime-worker' panicked... panic in a function that cannot unwind
Source: WASM runtime memory limits exceeded during execution
Why It Happens: WASM fuel/memory limits aren't sufficient for iterating over 70MB+ files
Prevention: Disable WASM for large files (enabled = false), use hybrid Python+RLM workflow
# RLM Orchestrator Configuration - LAN Ollama
max_iterations = 20
max_sub_calls = 50
output_limit = 10000
# Smart bypass for small contexts
bypass_enabled = true
bypass_threshold = 4000
# Feature levels
level_priority = ["dsl", "wasm"]
# DSL Configuration
[dsl]
enabled = true
max_regex_matches = 10000
max_slice_size = 1048576
max_variables = 100
# WASM Configuration
[wasm]
enabled = true
rust_wasm_enabled = true
fuel_limit = 1000000      # WASM execution fuel budget (prevents runaway loops)
memory_limit = 67108864   # 64 MiB
cache_size = 100
codegen_provider = "ollama"
codegen_url = "http://192.168.1.120:11434"
codegen_model = "qwen2.5:14b-instruct-q4_K_M"
# Root LLM - handles RLM orchestration
[[providers]]
provider_type = "ollama"
base_url = "http://192.168.1.120:11434"
model = "qwen2.5:14b-instruct-q4_K_M"
role = "root"
weight = 1
# Sub LLM - handles llm_query calls
[[providers]]
provider_type = "ollama"
base_url = "http://192.168.1.120:11434"
model = "qwen3:1.7b-q4_K_M"
role = "sub"
weight = 1
Why these settings:
- bypass_threshold = 4000 - Skip RLM overhead for small contexts
- fuel_limit = 1000000 - Prevent infinite loops in WASM
# Load file content and query
CONTEXT=$(cat /path/to/large.log)
curl -X POST http://localhost:4539/query \
-H "Content-Type: application/json" \
-d "{\"query\": \"Count ERROR lines\", \"context\": $(echo "$CONTEXT" | jq -Rs .)}"
When to use: Log analysis, error counting, pattern finding
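The `jq -Rs` step above exists only to JSON-escape the raw file content before embedding it in the request body. An equivalent payload builder in Python (the log lines are illustrative):

```python
import json

# Equivalent of the shell pipeline above: take raw text with newlines
# and quotes, and embed it safely in the /query payload
context = "2024-01-01 ERROR disk full\n2024-01-01 INFO heartbeat\n"
payload = json.dumps({"query": "Count ERROR lines", "context": context})
print(payload)
```

json.dumps handles the escaping that would otherwise break the curl -d string (newlines, quotes, backslashes).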
curl -X POST http://localhost:4539/debug \
-H "Content-Type: application/json" \
-d '{"query": "...", "context": "..."}'
When to use: Understanding RLM's reasoning, troubleshooting queries
# config-openai.toml
# Set LITELLM_API_KEY=your-openai-key
[[providers]]
provider_type = "litellm"
base_url = "https://api.openai.com/v1"
model = "gpt-4o"
role = "root"
weight = 1
[[providers]]
provider_type = "ollama"
base_url = "http://192.168.1.120:11434"
model = "qwen3:1.7b-q4_K_M"
role = "sub"
weight = 1
When to use: Production workloads, large files, best JSON reliability
# config-openrouter.toml
# Set LITELLM_API_KEY=your-openrouter-key
[[providers]]
provider_type = "litellm"
base_url = "https://openrouter.ai/api/v1"
model = "deepseek/deepseek-chat"
role = "root"
weight = 1
[[providers]]
provider_type = "ollama"
base_url = "http://192.168.1.120:11434"
model = "qwen3:1.7b-q4_K_M"
role = "sub"
weight = 1
When to use: Access to multiple models via single API, cost optimization
# Set DEEPSEEK_API_KEY env var
[[providers]]
provider_type = "deepseek"
model = "deepseek-chat"
role = "root"
[[providers]]
provider_type = "ollama"
base_url = "http://localhost:11434"
model = "qwen2.5-coder:14b"
role = "sub"
When to use: Cheapest reliable option (~$0.001/query)
- LOCAL_LLM_GUIDE.md - Comprehensive guide for model selection, hardware configs, performance expectations
- LOCAL_OLLAMA_INSTALLED_MODELS.md - Current installed models on the LAN Ollama server
When Claude should load these:
⚠️ KNOWN ISSUE: WASM crashes on files >70MB due to memory limits during execution.
For files like Claude conversation exports (72MB+), use the hybrid approach:
import json
with open('conversations.json') as f:
data = json.load(f)
print(f"Total: {len(data)} conversations")
print(f"Date range: {min(c['created_at'][:10] for c in data)} to {max(c['created_at'][:10] for c in data)}")
# Extract one conversation, then analyze with RLM
curl -X POST http://localhost:4539/query \
-H "Content-Type: application/json" \
-d '{"query": "Summarize the key decisions", "context": "..."}'
[wasm]
enabled = false
rust_wasm_enabled = false
max_iterations = 50
max_sub_calls = 100
output_limit = 50000
[dsl]
max_slice_size = 10485760 # 10MB
max_variable_size = 10485760 # 10MB
- Use an API provider (OpenAI/OpenRouter) for reliable JSON
- Prefer DSL-only queries
For usage tracking and multi-provider fallback:
[[providers]]
provider_type = "litellm"
base_url = "http://localhost:4000"
model = "deepseek/deepseek-chat"
role = "root"
Set LITELLM_MASTER_KEY environment variable for authentication.
Required:
Optional:
{
"rust": "1.93.0",
"rustup": "1.28.2",
"wasmtime": "27.0.0",
"rlm-orchestrator": "0.2.0",
"dependencies": {
"axum": "0.7.9",
"tokio": "1.x",
"reqwest": "0.12.x"
}
}
This skill is based on actual RLM setup on Windows 11:
| Provider | JSON Reliability | Query Success |
|---|---|---|
| Qwen 14B (Ollama) | ❌ Frequent parse errors | Partial |
| Mistral 24B (Ollama) | ❌ Parse errors | Partial |
| GPT-4o (OpenAI) | ✅ Excellent | Yes |
Total conversations: 1,375
Date range: 2023-08-10 to 2025-09-09
HIGH VALUE (50+ msgs): 1
MEDIUM (11-50 msgs): 183
LOW (1-10 msgs): 1,158
Processing: Python instant, RLM+GPT-4o ~3 seconds per query
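The HIGH/MEDIUM/LOW buckets above can be reproduced with a small helper. The thresholds come from the results; since "50+ msgs" and "11-50 msgs" overlap at 50, this sketch resolves the boundary in favor of the MEDIUM (11-50) range:

```python
def bucket(n_msgs: int) -> str:
    """Bucket a conversation by message count (thresholds from the results above)."""
    if n_msgs > 50:
        return "HIGH"
    if n_msgs > 10:
        return "MEDIUM"
    return "LOW"

counts = [3, 27, 112]
print([bucket(n) for n in counts])  # ['LOW', 'MEDIUM', 'HIGH']
```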
Solution: Install VS Build Tools with VCTools workload. Ensure MSVC link.exe is in PATH before Git's.
Solution: Run rustup target add wasm32-unknown-unknown
Solution: Use larger model (32B+) or switch to DeepSeek API for root LLM
Solution: Kill existing process: pkill -f rlm-server or use different port in config
Solution: Check Ollama server connectivity. Increase timeout in config. Use smaller model for faster responses.
Use this checklist to verify your setup:
- `rustup target list --installed | grep wasm` shows the WASM target
- `cargo build --release` completes without errors
- /health returns healthy
Questions? Issues?
See references/LOCAL_LLM_GUIDE.md for model selection.