Plugin

llm-router

Name: llm-router
Author: ypollak2

Route AI tasks (code generation, research, analysis, writing) to the cheapest capable LLM across 20+ providers. Tasks are auto-classified by type (research, generate, code) and by complexity, using heuristics or cheap API calls. The plugin also tracks savings on Claude/OpenAI costs with alerts and dashboards, decomposes complex tasks via an agent, and automates llm-router releases with tests and PyPI/GitHub publishing.

npx claudepluginhub ypollak2/llm-router --plugin llm-router

Component Details

Agents (1)

LLM Orchestrator Agent

/llm-orchestrator

You are an autonomous multi-LLM orchestration agent. Your job is to analyze complex tasks, decompose them into subtasks, and route each subtask to the optimal LLM using the llm-router MCP tools.

Skills (4)

/release — llm-router Release Skill

/release

Automates the full release pipeline for llm-router. Run this skill whenever

route

/route

Route a task to the best LLM based on task type and complexity

LLM Router — Smart Routing Skill

/routing

Route tasks to the cheapest capable model automatically using llm-router MCP tools.

LLM Router — Savings Tracking Skill

/savings

Track and report how much you've saved by routing tasks to cheaper models.

MCP Servers (1)

Connects to external services

llm-router

README

LLM Router

Route every AI call to the cheapest model that can do the job well.

48 MCP tools · 20+ LLM providers · intelligent routing · personal memory · budget tracking · decision analytics.

Result: 60–80% cost reduction vs running everything on Claude Opus.

The Problem & Solution
Real-World Savings
Quick Start
Key Features
How It Works
Supported Tools
Configuration
Monitoring & Optimization
Combined Download Statistics
MCP Tools Reference
Architecture & Development

The Problem & Solution

The Problem

Traditional AI-assisted development routes every task to your most capable (and most expensive) model. A simple file lookup is billed at the same rate as a complex architecture redesign, so low-value work burns through your quota and budget.

The Solution

llm-router analyzes each task and routes it to the cheapest model that can handle it well:

Simple lookups → Ollama/Haiku (free/cheap)
Moderate coding → Gemini Pro/GPT-4o (budget-friendly)
Complex reasoning → Claude Opus/o3 (premium, only when needed)

The magic: You keep the same conversational experience. No manual routing, no model picking. It just works behind the scenes.
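The three tiers above can be sketched as a small lookup keyed by a complexity label. This is only an illustration: the model names come from the list above, while the keyword hints, the word-count threshold, and the function names are hypothetical and are not llm-router's actual classifier.

```python
# Tier table mirroring the three tiers listed above (names as in the README).
TIERS = {
    "simple": ["Ollama", "Haiku"],
    "moderate": ["Gemini Pro", "GPT-4o"],
    "complex": ["Claude Opus", "o3"],
}

# Hypothetical keyword hints; a real classifier would be more careful.
COMPLEX_HINTS = ("architecture", "redesign", "refactor", "trade-off")
MODERATE_HINTS = ("implement", "write a function", "fix", "debug")


def classify_complexity(prompt: str) -> str:
    """Return 'simple', 'moderate', or 'complex' using crude heuristics."""
    text = prompt.lower()
    # Long prompts or architecture-style wording go to the premium tier.
    if any(hint in text for hint in COMPLEX_HINTS) or len(text.split()) > 200:
        return "complex"
    if any(hint in text for hint in MODERATE_HINTS):
        return "moderate"
    return "simple"


def candidate_models(prompt: str) -> list[str]:
    """Return the models of the tier that matches the prompt."""
    return TIERS[classify_complexity(prompt)]


if __name__ == "__main__":
    for example in (
        "Where is the config file?",
        "Fix the failing unit test",
        "Redesign the service architecture",
    ):
        print(example, "->", candidate_models(example))
```

Keyword matching like this is cheap and runs locally, which fits the heuristics-first classification the plugin description mentions; anything it gets wrong would need a smarter (and costlier) classifier.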

Real-World Savings

Proven Results from 14-Day Sprint

Real numbers: 51 releases, 22.6M tokens, $6.95 spent (vs $50–60 with traditional Opus-everywhere approach).

Cost Breakdown

Actual spend: $6.95 (22.6M development tokens)
Opus baseline: $50–60 (traditional approach)
Savings: $43–53 per 2 weeks (87% reduction)
Annualized: ~$180/year vs $1,200–1,500 baseline
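The savings range and the reduction percentage follow from the actual spend and the Opus baseline. A short check of that arithmetic, using only the figures stated above (the variable names are illustrative):

```python
# Figures taken from the cost breakdown above (USD, per 14-day sprint).
actual_spend = 6.95
baseline_low, baseline_high = 50.0, 60.0

# Absolute savings against each end of the baseline range.
savings_low = baseline_low - actual_spend
savings_high = baseline_high - actual_spend

# Relative reduction against each end of the baseline range.
reduction_low = savings_low / baseline_low
reduction_high = savings_high / baseline_high

print(f"Savings: ${savings_low:.2f} to ${savings_high:.2f}")
print(f"Reduction: {reduction_low:.1%} to {reduction_high:.1%}")
```

The reduction works out to roughly 86–88% depending on which end of the baseline is used, so the quoted 87% sits in the middle of that range.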

Token Distribution (Free-First Routing)

31% from free models    → 7.0M tokens, $0 cost (Ollama + Codex)
38% from budget models  → 8.6M tokens, $2.82 cost (Gemini Flash + GPT-4o-mini)
31% from premium models → 7.0M tokens, $4.13 cost (GPT-4o, Pro, Claude)
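As a consistency check, the three tiers can be summed and turned into a per-tier cost per million tokens. The token counts and costs are the ones listed above; the dictionary layout and names are just for illustration.

```python
# (tokens in millions, cost in USD) per tier, as listed above.
tiers = {
    "free": (7.0, 0.00),
    "budget": (8.6, 2.82),
    "premium": (7.0, 4.13),
}

total_tokens = sum(tokens for tokens, _ in tiers.values())
total_cost = sum(cost for _, cost in tiers.values())

for name, (tokens, cost) in tiers.items():
    share = tokens / total_tokens   # fraction of all tokens
    per_million = cost / tokens     # USD per million tokens in this tier
    print(f"{name:8s} {share:5.0%}  ${per_million:.2f} per 1M tokens")

print(f"total: {total_tokens:.1f}M tokens, ${total_cost:.2f}")
print(f"blended: ${total_cost / total_tokens:.2f} per 1M tokens")
```

The totals match the headline figures (22.6M tokens, $6.95), and the shares round to 31%, 38%, and 31%.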

Quota Pressure Elimination

Free-first routing eliminated budget pressure over the sprint, so feature work continued at a steady pace without cost concerns.

[Figures: Quota Pressure Trajectory · Cost Breakdown · Token Distribution Tiers]

Quick Start

1. Install

# One command to install and configure
pipx install llm-routing && llm-router install

2. (Optional) Add Provider Keys

# Available in .env or ~/.llm-router/config.yaml
export OPENAI_API_KEY="sk-..."    # For GPT-4o, o3 (optional)
export GEMINI_API_KEY="AIza..."   # For Gemini models (free tier available)
export OLLAMA_BASE_URL="..."      # For local Ollama (optional, auto-starts)
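To see which of these optional variables are visible to a process, a small standalone check like the following may help. It is not part of llm-router; the helper name and the provider labels are hypothetical, and only the three variable names come from the snippet above.

```python
import os
from typing import Mapping

# Environment variable names as shown in the snippet above.
PROVIDER_ENV_VARS = {
    "OpenAI": "OPENAI_API_KEY",
    "Gemini": "GEMINI_API_KEY",
    "Ollama": "OLLAMA_BASE_URL",
}


def configured_providers(env: Mapping[str, str] = os.environ) -> list[str]:
    """Return providers whose variable is set to a non-empty value."""
    return [name for name, var in PROVIDER_ENV_VARS.items() if env.get(var)]


if __name__ == "__main__":
    found = configured_providers()
    print("Configured providers:", ", ".join(found) if found else "none")
```

This only inspects the process environment, so values placed in .env or ~/.llm-router/config.yaml will not show up unless something has loaded them first.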

3. Done

Start using Claude Code, Gemini CLI, Codex, VS Code, Cursor, or any MCP-compatible editor. llm-router handles everything automatically.

Key Features

🎯 Intelligent Routing

Complexity Classification — Analyzes prompts to determine if task is simple/moderate/complex
Provider Fallback Chains — Ollama → Codex → GPT-4o → Claude (free-first, always)
Budget Pressure Awareness — Automatically downgrades model when quota is limited
Quality Monitoring — Demotes models with degraded performance via judge scoring
Decision Logging — Track which model was selected, why, and the cost impact
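A minimal sketch of how a free-first fallback chain with budget awareness and decision logging might fit together. The provider order is the one listed above; everything else (the cost estimates, `ProviderError`, the `route` function, and the caller-supplied `call` callback) is hypothetical and does not reflect llm-router's real API. Budget pressure is simplified here to skipping any provider whose estimated cost exceeds the remaining budget.

```python
from typing import Callable

# Order taken from the fallback chain listed above (free-first).
FALLBACK_CHAIN = ["Ollama", "Codex", "GPT-4o", "Claude"]

# Hypothetical per-call cost estimates in USD, for illustration only.
EST_COST = {"Ollama": 0.0, "Codex": 0.0, "GPT-4o": 0.01, "Claude": 0.05}


class ProviderError(Exception):
    """Raised when a provider cannot serve the request."""


def route(prompt: str,
          call: Callable[[str, str], str],
          remaining_budget: float):
    """Try providers in order; return (provider, result, decision log)."""
    decisions = []
    for provider in FALLBACK_CHAIN:
        # Skip providers that would exceed the remaining budget.
        if EST_COST[provider] > remaining_budget:
            decisions.append((provider, "skipped: over budget"))
            continue
        try:
            result = call(provider, prompt)
        except ProviderError as exc:
            # Record the failure and fall through to the next provider.
            decisions.append((provider, f"failed: {exc}"))
            continue
        decisions.append((provider, "selected"))
        return provider, result, decisions
    raise ProviderError("no provider available within budget")


if __name__ == "__main__":
    def fake_call(provider: str, prompt: str) -> str:
        # Stand-in for a real client: pretend local Ollama is down.
        if provider == "Ollama":
            raise ProviderError("not running")
        return f"{provider} answered: {prompt}"

    print(route("summarize this file", fake_call, remaining_budget=1.0))
```

Returning the decision log alongside the result keeps the "which model, and why" trail close to the call site; a fuller version would presumably also record the cost of the selected call.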

Stats

Version: 7.6.2
Stars: 20
Forks: 3
Maintenance: Excellent
License: MIT
Last Commit: Apr 28, 2026
Added: Apr 16, 2026

Available In

ypollak224

Safety Signals

Caution

Uses power tools

Uses Bash, Write, or Edit tools
