By emasoft
MCP server that offloads bounded LLM tasks from Claude Code to cheaper local (LM Studio, Ollama, vLLM, llama.cpp) or remote (OpenRouter) models. Profile-based configuration with ensemble mode.
npx claudepluginhub emasoft/emasoft-plugins --plugin llm-externalizer

This plugin requires configuration values that are prompted for when the plugin is enabled. Sensitive values are stored in your system keychain.
openrouter_api_key
OpenRouter API key for remote and ensemble modes (https://openrouter.ai/keys). Stored in the system keychain. If left blank, the plugin falls back to $OPENROUTER_API_KEY from your shell environment, so leave it blank to keep an existing shell-based setup.
${user_config.openrouter_api_key}

Benchmark OpenRouter programming-category models against a TypeScript classification task. Filters candidates by cost and capability, scores each against 71 fixture functions plus 3 literal keywords, and writes a markdown comparison report. Use this to pick the cheapest model that still passes the real workload.
Interactively pick a new 3-model OpenRouter ensemble for the active profile. Runs the benchmark (or reuses a cached one), presents a menu per slot (first / second / third), shows the new ensemble's cost vs the last accepted snapshot, and on confirmation atomically updates ~/.llm-externalizer/settings.yaml. Never touches the profile's mode (local/remote/remote-ensemble) — that stays as the user configured it.
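The atomic update of ~/.llm-externalizer/settings.yaml can be sketched with the standard write-temp-then-rename pattern (a minimal illustration, not the plugin's actual code):

```python
import os
import tempfile

def atomic_write(path: str, text: str) -> None:
    """Write `text` to `path` atomically: write a temp file in the same
    directory, fsync it, then rename it over the target. os.replace is
    atomic on POSIX, so readers never observe a half-written settings file."""
    directory = os.path.dirname(path) or "."
    fd, tmp = tempfile.mkstemp(dir=directory, prefix=".settings-")
    try:
        with os.fdopen(fd, "w") as f:
            f.write(text)
            f.flush()
            os.fsync(f.fileno())
        os.replace(tmp, path)  # atomic rename over the old file
    except BaseException:
        os.unlink(tmp)
        raise
```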
Inspect LLM Externalizer profile configuration. Read-only: model and profile changes are user-only, made via manual YAML editing.
Check LLM Externalizer health, active profile, model, auth status, and context window
Aggregate unfixed findings across every report in `./reports/llm-externalizer/` and fix each via a fresh serial-fixer subagent (sonnet/opus menu). Optional `@merged-report.md` scopes the loop to one report.
Fix findings in ONE existing per-file scan report. Picks sonnet or opus via a menu, dispatches a single parallel-fixer subagent, and returns its `.fixer.`-tagged summary path. For whole-folder audits use `/llm-externalizer:llm-externalizer-scan-and-fix`.
Install the LLM Externalizer multi-tier Claude Code statusline (model + context bar + MCP tokens/cost + OpenRouter credits + 5h/7d limits, with width-aware tiering and per-section error isolation).
Predict cost, time, and cap-skipped counts for a fieldset against the registered files. Honors --budget-usd as a hard gate. Phase 3 of the pipeline.
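The --budget-usd hard gate can be illustrated with a minimal greedy sketch (the pricing model and skip policy here are assumptions for illustration, not the plugin's real estimator):

```python
def plan_under_budget(file_tokens, price_per_mtok, budget_usd):
    """Sketch of a hard budget gate: walk files in registration order,
    accumulate per-file cost (tokens / 1M * price), and cap-skip every
    file that would push the running total past budget_usd.
    Returns (planned_files, skipped_files, total_cost_usd)."""
    planned, skipped, total = [], [], 0.0
    for name, tokens in file_tokens:
        cost = tokens / 1_000_000 * price_per_mtok
        if total + cost <= budget_usd:
            planned.append(name)
            total += cost
        else:
            skipped.append(name)  # cap-skipped: would exceed the hard gate
    return planned, skipped, round(total, 6)
```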
Dump every result row of a mass-scouting job to JSONL or CSV under reports/mass_scouting/. Useful for follow-up analysis in pandas, jq, etc.
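A follow-up analysis over an exported JSONL file might look like this stdlib-only sketch (the `bucket` field name is assumed from the classifier's bucket list; real export rows may differ):

```python
import json
from collections import Counter

def bucket_counts(jsonl_text: str) -> Counter:
    """Tally result rows per classifier bucket from a JSONL export.
    Blank lines are skipped; rows without a 'bucket' field count as
    'unknown', mirroring the classifier's fallback bucket."""
    rows = (json.loads(line) for line in jsonl_text.splitlines() if line.strip())
    return Counter(row.get("bucket", "unknown") for row in rows)
```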
Print one file row from the mass-scouting registry by short_id. Optionally include the result row for a specific job_id.
Run the cheap script-only file classifier across registered files. Assigns each file a bucket (binary / sourcecode / documentation / config / log / rules_to_eval / has_frontmatter / unknown). Phase 2 of the pipeline.
Register a folder (or explicit file list) into the mass-scouting SQLite registry. Phase 1 of the mass-scouting pipeline.
Cross-job federated search across multiple mass-scouting jobs. Same query semantics as mass-scout-search; results are tagged with the originating job_id and merged by bm25 rank.
Per-job search across mass-scouting results. Three modes auto-routed: regex (for trivial queries — emails, urls, ipv4, etc.), FTS5 keyword search, and structured JSON1 path filters. Phase 5 of the pipeline.
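The auto-routing between the three modes can be sketched as follows (the detection heuristics are illustrative assumptions, not the plugin's actual routing rules):

```python
import re

# Detectors for the trivial patterns named above; illustrative only.
TRIVIAL = {
    "email": re.compile(r"^[\w.+-]+@[\w-]+\.[\w.-]+$"),
    "ipv4": re.compile(r"^(\d{1,3}\.){3}\d{1,3}$"),
    "url": re.compile(r"^https?://\S+$"),
}

def route_query(query: str) -> str:
    """Pick a search mode for a query: JSON1 path filters for structured
    '$.'-prefixed queries, the regex fast path for trivial literal
    patterns, and FTS5 keyword search for everything else."""
    if query.startswith("$."):
        return "json1"
    if any(pat.match(query) for pat in TRIVIAL.values()):
        return "regex"
    return "fts5"
```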
Run the LLM scout end-to-end on every eligible file. Compiles the fieldset to a JSON Schema, fans calls out via the worker pool, repairs + validates each response, persists to SQLite, writes a markdown report. Phase 4 of the pipeline.
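The repair step before validation can be illustrated with a minimal JSON repair pass (a sketch of two common LLM-output fixes, not the plugin's actual repair logic):

```python
import json
import re

def repair_llm_json(raw: str):
    """Repair common defects in LLM-produced JSON before schema
    validation: strip surrounding markdown code fences and remove
    trailing commas before closing brackets, then parse."""
    text = raw.strip()
    text = re.sub(r"^```(?:json)?\s*|\s*```$", "", text)
    text = re.sub(r",\s*([}\]])", r"\1", text)  # drop trailing commas
    return json.loads(text)
```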
Scan a codebase, aggregate findings into one canonical bug list, then fix each bug serially with a sonnet- or opus-model serial-fixer subagent. Use when fixes mutate shared state or bug order matters.
Two-stage codebase audit. LLM Externalizer scan produces one report per file; parallel sonnet- or opus-model fixer subagents (≤15 concurrent) verify and fix each finding. Orchestrator never reads scan or fixer content — only report paths.
Scan a codebase (same language as the input files) for an existing implementation of a described feature. FFD-batched ensemble calls, exhaustive per-file output. Works for PR duplicate-check and greenfield audits.
Opus-model variant. Verify and fix ONE LLM Externalizer per-file bug report. Input is a single absolute path to a report `.md`. Validates findings, applies minimal fixes only to REAL bugs, runs linters, writes a `.fixer.`-tagged summary, returns the summary path. Dispatched in parallel by `llm-externalizer-scan-and-fix` when the user picks "opus" on the model-menu prompt.
Sonnet-model variant. Verify and fix ONE LLM Externalizer per-file bug report. Input is a single absolute path to a report `.md`. Validates findings, applies minimal fixes only to REAL bugs, runs linters, writes a `.fixer.`-tagged summary, returns the summary path. Dispatched in parallel by `llm-externalizer-scan-and-fix` when the user picks "sonnet" on the model-menu prompt.
Use for a fast code review from the LLM Externalizer ensemble without loading scan output into the main context. Accepts a file/folder/glob and returns only report paths. Trigger with "review this file", "llm-ext review", "audit these files", "scan for bugs".
Opus-model variant. Fix exactly ONE bug from a markdown bug list produced by llm-externalizer-fix-found-bugs. Reads the bug-file absolute path, picks the highest-severity unfixed entry, applies a minimal surgical fix, updates the bug file with a ` — FIXED` marker plus a short post-mortem, returns a single-line summary. Dispatched per-bug when the user picks "opus" on the model-menu prompt.
Sonnet-model variant. Fix exactly ONE bug from a markdown bug list produced by llm-externalizer-fix-found-bugs. Reads the bug-file absolute path, picks the highest-severity unfixed entry, applies a minimal surgical fix, updates the bug file with a ` — FIXED` marker plus a short post-mortem, returns a single-line summary. Dispatched per-bug when the user picks "sonnet" on the model-menu prompt.
Use when inspecting LLM Externalizer profile config or explaining the manual-edit policy. Trigger with "show LLM profile", "which model is active", "edit settings.yaml".
Use when scanning a project for free using the Nemotron model (no cost, lower quality). Trigger with "free scan", "free-scan", "scan for free", "quick scan", "cheap scan", "scan without cost", "nemotron scan".
Use when extracting the SAME structured metadata from many files with a cheap LLM. Trigger with "mass scout", "scan many files for X", "extract structured data from a folder", "classify all my files", "audit thousands of files", "run a fieldset over a codebase", "audit my plugin", "PR review all changed files", "security-scan this repo".
Use when asking for OpenRouter model details — supported params, pricing, latency, uptime, quantization. Trigger with "openrouter model info", "or-model-info", "what params does X support", "show pricing for", "check model support".
Use when scanning an entire project or codebase for bugs, security issues, or code quality problems. Trigger with "scan project", "audit codebase", "scan codebase", "full scan", "run project scan", "check whole project", "scan all files".
Use when offloading file analysis to external LLMs. Trigger with "analyze files", "scan folder", "check imports", "compare files", "batch check".
When calling LLM APIs from Python code. When connecting to llamafile or local LLM servers. When switching between OpenAI/Anthropic/local providers. When implementing retry/fallback logic for LLM calls. When code imports litellm or uses completion() patterns.
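The retry/fallback pattern this skill targets can be sketched provider-agnostically (the provider callables are placeholders, not a real litellm or OpenAI client):

```python
import time

def complete_with_fallback(prompt, providers, max_retries=2, base_sleep=0.0):
    """Try each provider callable in order; retry transient failures with
    exponential backoff, then fall through to the next provider. Raises
    only after every provider has exhausted its retries."""
    last_err = None
    for call in providers:
        for attempt in range(max_retries + 1):
            try:
                return call(prompt)
            except Exception as err:
                last_err = err
                time.sleep(base_sleep * (2 ** attempt))  # backoff between retries
    raise RuntimeError("all providers failed") from last_err
```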