Skill

model-routing

Routes Claude Code tasks to optimal models (Haiku, Sonnet, Opus) using decision matrices, cost tables, complexity signals, and subagent assignments for cost/quality tradeoffs.

Anthropic

ai-ml

developer-tools

npx claudepluginhub markus41/claude --plugin claude-code-expert

Invocation

How this skill is triggered — by the user, by Claude, or both

Slash command

/claude-code-expert:model-routing

User invocable

Model invocable

Inline context

Default effort

Tool Access

This skill is limited to the following tools:

Read, Grep, Glob, Bash

Context Preview

The summary Claude sees in its skill listing — used to decide when to auto-load this skill

Select the right Claude model for each task to optimize the cost/quality tradeoff.

SKILL.md

191 lines · ~1.6k tokens

Stats

Language: TypeScript

Parent stars: 12

Parent forks: 1

Maintenance: Excellent

Last Commit: Mar 31, 2026

Model Routing Intelligence

Select the right Claude model for each task to optimize the cost/quality tradeoff.

Goal

Eliminate wasted spend by routing tasks to the cheapest model that produces acceptable quality, while ensuring complex tasks get the reasoning depth they need.

Decision Matrix

Task → Model mapping

Task Type	Recommended Model	Reasoning
Architecture decisions	Opus 4.6	Needs deep multi-step reasoning, hidden coupling detection
Complex debugging	Opus 4.6	Root cause analysis requires holding many hypotheses
Security review	Opus 4.6	Must not miss subtle vulnerabilities
Standard implementation	Sonnet 4.6	Best balance of speed, quality, and cost for code generation
Code review	Sonnet 4.6	Good pattern recognition at reasonable cost
Refactoring	Sonnet 4.6	Mechanical transformations with quality checks
Test writing	Sonnet 4.6	Formulaic but needs understanding of code under test
File search / grep	Haiku 4.5	Simple lookup, no deep reasoning needed
Documentation lookup	Haiku 4.5	Reading and summarizing existing content
Commit message generation	Haiku 4.5	Short, formulaic output
Simple Q&A	Haiku 4.5	Direct answers, no complex analysis
Research subagents	Haiku 4.5	Exploration tasks that return summaries
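
The mapping above can be expressed as a plain lookup. This is a minimal sketch, not part of the skill itself: the function name `recommend_model` and the choice to fall back to Sonnet for unlisted task types are assumptions made for illustration.

```python
# Task types and tiers copied from the mapping table above.
TASK_TO_MODEL = {
    "architecture decisions": "opus",
    "complex debugging": "opus",
    "security review": "opus",
    "standard implementation": "sonnet",
    "code review": "sonnet",
    "refactoring": "sonnet",
    "test writing": "sonnet",
    "file search / grep": "haiku",
    "documentation lookup": "haiku",
    "commit message generation": "haiku",
    "simple q&a": "haiku",
    "research subagents": "haiku",
}


def recommend_model(task_type: str, default: str = "sonnet") -> str:
    """Return the model tier for a task type.

    Unlisted task types fall back to `default`; that fallback is an
    assumption of this sketch, not a rule stated in the table.
    """
    return TASK_TO_MODEL.get(task_type.strip().lower(), default)


if __name__ == "__main__":
    print(recommend_model("Security review"))
    print(recommend_model("Commit message generation"))
```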

Complexity signals

Use these signals to decide when to escalate from Sonnet to Opus:

Multiple interacting systems or modules
Non-obvious failure modes
"Why does this work?" questions
Tasks where a wrong answer is expensive to fix
Cross-cutting concerns (auth, caching, observability)
Migration or backward-compatibility requirements

Use these signals to downgrade from Sonnet to Haiku:

Single-file changes
Mechanical transformations (rename, reformat)
Reading and summarizing (no generation)
Answering factual questions about code
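
One way to apply these signals is sketched below. The signal identifiers are hypothetical labels for the bullets above, and two rules are assumptions of this sketch: any escalation signal wins over downgrade signals, and a task is downgraded only when every signal present is a downgrade signal.

```python
from typing import Iterable

# Hypothetical short names for the escalation signals listed above.
ESCALATE_SIGNALS = {
    "multiple_systems",
    "non_obvious_failure",
    "why_does_this_work",
    "expensive_if_wrong",
    "cross_cutting",
    "migration",
}

# Hypothetical short names for the downgrade signals listed above.
DOWNGRADE_SIGNALS = {
    "single_file",
    "mechanical",
    "read_and_summarize",
    "factual_question",
}


def route(signals: Iterable[str]) -> str:
    """Pick a tier from observed signals, starting from Sonnet."""
    observed = set(signals)
    if observed & ESCALATE_SIGNALS:
        # Assumption: one escalation signal is enough to justify Opus.
        return "opus"
    if observed and observed <= DOWNGRADE_SIGNALS:
        # Assumption: downgrade only when nothing else is in play.
        return "haiku"
    return "sonnet"


if __name__ == "__main__":
    print(route(["single_file", "mechanical"]))
    print(route(["single_file", "migration"]))
```

Letting escalation take precedence reflects the stated concern that a wrong answer can be expensive to fix; a stricter or looser policy would be equally easy to encode.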

Cost Tables

Per-token pricing (USD per million tokens)

Model	Input	Output	Cache Write	Cache Read
Opus 4.6	$15.00	$75.00	$18.75	$1.50
Sonnet 4.6	$3.00	$15.00	$3.75	$0.30
Haiku 4.5	$0.80	$4.00	$1.00	$0.08

Cost multipliers

Comparison	Input	Output
Opus vs Sonnet	5x	5x
Sonnet vs Haiku	3.75x	3.75x
Opus vs Haiku	18.75x	18.75x
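
The multipliers follow directly from dividing the rates in the pricing table. A short sketch that derives them, assuming the listed prices are the ones in effect (the helper name `multiplier` is illustrative):

```python
# (input, output) prices in USD per million tokens, from the pricing table.
PRICES = {
    "opus": (15.00, 75.00),
    "sonnet": (3.00, 15.00),
    "haiku": (0.80, 4.00),
}


def multiplier(expensive: str, cheap: str) -> tuple:
    """Return (input_ratio, output_ratio) of one tier over another."""
    exp_in, exp_out = PRICES[expensive]
    cheap_in, cheap_out = PRICES[cheap]
    # Round to two decimals to hide floating-point noise.
    return (round(exp_in / cheap_in, 2), round(exp_out / cheap_out, 2))


if __name__ == "__main__":
    for pair in [("opus", "sonnet"), ("sonnet", "haiku"), ("opus", "haiku")]:
        print(pair, multiplier(*pair))
```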

Typical session costs

Task	Model	Est. Tokens (in/out)	Est. Cost
Simple bug fix	Sonnet	50k/10k	~$0.30
Feature implementation	Sonnet	200k/50k	~$1.35
Architecture review	Opus	200k/30k	~$5.25
Quick lookup	Haiku	20k/2k	~$0.02
Research subagent	Haiku	80k/10k	~$0.10
Full code review (council)	Mixed	500k/100k	~$3-8
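
The single-model rows above can be reproduced from the pricing table. The sketch below ignores cache writes and cache reads, which is a simplifying assumption; the function name `session_cost` is illustrative.

```python
# (input, output) prices in USD per million tokens, from the pricing table.
PRICES = {
    "opus": (15.00, 75.00),
    "sonnet": (3.00, 15.00),
    "haiku": (0.80, 4.00),
}


def session_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Estimate session cost in USD, ignoring prompt caching."""
    in_price, out_price = PRICES[model]
    return (input_tokens * in_price + output_tokens * out_price) / 1_000_000


if __name__ == "__main__":
    # Simple bug fix: 50k in, 10k out on Sonnet (about $0.30).
    print(round(session_cost("sonnet", 50_000, 10_000), 2))
    # Architecture review: 200k in, 30k out on Opus (about $5.25).
    print(round(session_cost("opus", 200_000, 30_000), 2))
```

For example, the feature-implementation row works out as 200k × $3.00/M + 50k × $15.00/M = $0.60 + $0.75 = $1.35.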

Subagent Model Assignment

Orchestration patterns

When using cc-orchestrate or spawning subagents, assign models by role:

Research agents       → Haiku (cheap exploration, summary return)
Implementation agents → Sonnet (code generation quality)
Review/audit agents   → Sonnet or Opus (depends on risk)
Architecture agents   → Opus (deep reasoning required)

Example: builder-validator template

builder agent   → Sonnet 4.6 (writes code)
validator agent → Sonnet 4.6 (reviews code)

Example: research-council template

researcher agents (3x) → Haiku 4.5 (parallel exploration)
synthesizer agent      → Sonnet 4.6 (combines findings)
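
The role assignments and the two templates above could be captured as data. This is a hedged sketch: the `high_risk` flag is a hypothetical stand-in for "depends on risk", and the dictionary layout is not the format used by cc-orchestrate.

```python
# Fixed role-to-tier assignments from the orchestration patterns above.
ROLE_TO_MODEL = {
    "research": "haiku",
    "implementation": "sonnet",
    "architecture": "opus",
}

# The two example templates, as (agent name -> (tier, count)).
TEMPLATES = {
    "builder-validator": {"builder": ("sonnet", 1), "validator": ("sonnet", 1)},
    "research-council": {"researcher": ("haiku", 3), "synthesizer": ("sonnet", 1)},
}


def model_for_role(role: str, high_risk: bool = False) -> str:
    """Return the tier for an agent role.

    Review/audit agents use Opus only when `high_risk` is set; that flag
    is an assumption used here to model the risk-dependent choice.
    """
    if role == "review":
        return "opus" if high_risk else "sonnet"
    return ROLE_TO_MODEL[role]


if __name__ == "__main__":
    print(model_for_role("review", high_risk=True))
    print(TEMPLATES["research-council"])
```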

Budget Planning

Setting a session budget

Before starting a task, estimate cost:

1. Classify the task using the decision matrix above
2. Estimate token volume based on file count and task scope
3. Calculate cost using the pricing table
4. Set the model with /model or claude --model

Token estimation rules of thumb

Content Type	Tokens per Line
TypeScript/JavaScript	~10
Python	~8
JSON/YAML	~6
Markdown	~5
Minified code	~15
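
These per-line figures can be turned into a rough input-token estimate. A minimal sketch, assuming line counts per content type are already known; the key names and `estimate_tokens` are illustrative.

```python
# Approximate tokens per line, from the rules of thumb above.
TOKENS_PER_LINE = {
    "typescript": 10,
    "javascript": 10,
    "python": 8,
    "json": 6,
    "yaml": 6,
    "markdown": 5,
    "minified": 15,
}


def estimate_tokens(line_counts: dict) -> int:
    """Sum lines * tokens-per-line over each content type."""
    return sum(
        lines * TOKENS_PER_LINE[content_type]
        for content_type, lines in line_counts.items()
    )


if __name__ == "__main__":
    # 1,000 lines of TypeScript plus 500 lines of Python.
    print(estimate_tokens({"typescript": 1000, "python": 500}))  # 14000
```

The result is only the cost of reading the files once; repeated reads and model output add to it.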

Cost control techniques

Start with Haiku for research, switch to Sonnet for implementation
Use subagents to isolate expensive research from main context
Compact early at 60-70% context to avoid expensive re-reads
Limit tool output — avoid cat-ing entire large files; use Grep with limits
Batch related tasks to benefit from prompt caching (cache read = 10% of input cost)
Use --max-turns in headless mode to cap automated sessions
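
To see what the caching point is worth, the sketch below compares input cost with and without cache hits using the Sonnet rates from the pricing table. It counts cache reads only and leaves out the one-time cache write charge, which is a simplifying assumption.

```python
# Sonnet rates from the pricing table (USD per million tokens).
INPUT_PRICE = 3.00
CACHE_READ_PRICE = 0.30  # 10% of the input price


def input_cost(total_tokens: int, cached_fraction: float) -> float:
    """Input cost in USD when `cached_fraction` of tokens are cache reads."""
    if not 0.0 <= cached_fraction <= 1.0:
        raise ValueError("cached_fraction must be between 0 and 1")
    cached = total_tokens * cached_fraction
    fresh = total_tokens - cached
    return (fresh * INPUT_PRICE + cached * CACHE_READ_PRICE) / 1_000_000


if __name__ == "__main__":
    print(round(input_cost(100_000, 0.0), 3))  # no cache hits
    print(round(input_cost(100_000, 0.8), 3))  # 80% served from cache
```

With 100k input tokens, an 80% hit rate brings the input cost from about $0.30 down to about $0.084 under these assumptions.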

Model switching workflow

# Start with research on Haiku
/model claude-haiku-4-5-20251001
# "Find all files related to auth, summarize the architecture"

# Switch to Sonnet for implementation
/model claude-sonnet-4-6
# "Implement the new auth middleware based on the research above"

# Switch to Opus for the tricky part
/model claude-opus-4-6
# "Review the session handling for race conditions and edge cases"

Environment Variables

CLAUDE_MODEL=claude-sonnet-4-6          # Default model for sessions
ANTHROPIC_MODEL=claude-sonnet-4-6       # Alternative env var

Settings Configuration

{
  "model": "claude-sonnet-4-6",
  "smallFastModel": "claude-haiku-4-5-20251001"
}

The smallFastModel is used for internal operations like skill matching and context compression. Keep it on Haiku for cost efficiency.

Anti-patterns

Using Opus for everything — 5x the cost of Sonnet with marginal quality improvement on simple tasks
Using Haiku for complex implementation — saves money but produces lower-quality code that needs more iterations
Not using subagents — research in main context inflates token count for every subsequent turn
Re-reading large files — each read costs tokens; anchor important content instead
Ignoring cache hits — restructure prompts to maximize cache read tokens (10% of input cost)
