Help us improve
Share bugs, ideas, or general feedback.
From claude-code-expert
Routes Claude Code tasks to optimal models (Haiku, Sonnet, Opus) using decision matrices, cost tables, complexity signals, and subagent assignments for cost/quality tradeoffs.
npx claudepluginhub markus41/claude --plugin claude-code-expertHow this skill is triggered — by the user, by Claude, or both
Slash command
/claude-code-expert:model-routingThis skill is limited to the following tools:
The summary Claude sees in its skill listing — used to decide when to auto-load this skill
Select the right Claude model for each task to optimize the cost/quality tradeoff.
Optimizes Claude Code costs: track tokens and USD with /cost, route models (Haiku/Sonnet/Opus), reduce via /compact/grep/sub-agents, maximize prompt caching.
Routes coding tasks to optimal AI model tier by complexity: no LLM for mechanical edits, Haiku for simple refactors, Sonnet for multi-file bugs, Opus for architecture/security. Saves 50-65% API costs.
Recommends Claude models (Haiku for exploration, Sonnet for implementation, Opus for decisions) via routing matrix for task types, subagents, and cost-quality tradeoffs.
Share bugs, ideas, or general feedback.
Select the right Claude model for each task to optimize the cost/quality tradeoff.
Eliminate wasted spend by routing tasks to the cheapest model that produces acceptable quality, while ensuring complex tasks get the reasoning depth they need.
| Task Type | Recommended Model | Reasoning |
|---|---|---|
| Architecture decisions | Opus 4.6 | Needs deep multi-step reasoning, hidden coupling detection |
| Complex debugging | Opus 4.6 | Root cause analysis requires holding many hypotheses |
| Security review | Opus 4.6 | Must not miss subtle vulnerabilities |
| Standard implementation | Sonnet 4.6 | Best balance of speed, quality, and cost for code generation |
| Code review | Sonnet 4.6 | Good pattern recognition at reasonable cost |
| Refactoring | Sonnet 4.6 | Mechanical transformations with quality checks |
| Test writing | Sonnet 4.6 | Formulaic but needs understanding of code under test |
| File search / grep | Haiku 4.5 | Simple lookup, no deep reasoning needed |
| Documentation lookup | Haiku 4.5 | Reading and summarizing existing content |
| Commit message generation | Haiku 4.5 | Short, formulaic output |
| Simple Q&A | Haiku 4.5 | Direct answers, no complex analysis |
| Research subagents | Haiku 4.5 | Exploration tasks that return summaries |
Use these signals to decide when to escalate from Sonnet to Opus:
Use these signals to downgrade from Sonnet to Haiku:
| Model | Input | Output | Cache Write | Cache Read |
|---|---|---|---|---|
| Opus 4.6 | $15.00 | $75.00 | $18.75 | $1.50 |
| Sonnet 4.6 | $3.00 | $15.00 | $3.75 | $0.30 |
| Haiku 4.5 | $0.80 | $4.00 | $1.00 | $0.08 |
| Comparison | Input | Output |
|---|---|---|
| Opus vs Sonnet | 5x | 5x |
| Sonnet vs Haiku | 3.75x | 3.75x |
| Opus vs Haiku | 18.75x | 18.75x |
| Task | Model | Est. Tokens (in/out) | Est. Cost |
|---|---|---|---|
| Simple bug fix | Sonnet | 50k/10k | ~$0.30 |
| Feature implementation | Sonnet | 200k/50k | ~$1.35 |
| Architecture review | Opus | 200k/30k | ~$5.25 |
| Quick lookup | Haiku | 20k/2k | ~$0.02 |
| Research subagent | Haiku | 80k/10k | ~$0.10 |
| Full code review (council) | Mixed | 500k/100k | ~$3-8 |
When using cc-orchestrate or spawning subagents, assign models by role:
Research agents → Haiku (cheap exploration, summary return)
Implementation agents → Sonnet (code generation quality)
Review/audit agents → Sonnet or Opus (depends on risk)
Architecture agents → Opus (deep reasoning required)
builder agent → Sonnet 4.6 (writes code)
validator agent → Sonnet 4.6 (reviews code)
researcher agents (3x) → Haiku 4.5 (parallel exploration)
synthesizer agent → Sonnet 4.6 (combines findings)
Before starting a task, estimate cost:
/model or claude -m| Content Type | Tokens per Line |
|---|---|
| TypeScript/JavaScript | ~10 |
| Python | ~8 |
| JSON/YAML | ~6 |
| Markdown | ~5 |
| Minified code | ~15 |
cat-ing entire large files; use Grep with limits--max-turns in headless mode to cap automated sessions# Start with research on Haiku
/model claude-haiku-4-5-20251001
# "Find all files related to auth, summarize the architecture"
# Switch to Sonnet for implementation
/model claude-sonnet-4-6
# "Implement the new auth middleware based on the research above"
# Switch to Opus for the tricky part
/model claude-opus-4-6
# "Review the session handling for race conditions and edge cases"
CLAUDE_MODEL=claude-sonnet-4-6 # Default model for sessions
ANTHROPIC_MODEL=claude-sonnet-4-6 # Alternative env var
{
"model": "claude-sonnet-4-6",
"smallFastModel": "claude-haiku-4-5-20251001"
}
The smallFastModel is used for internal operations like skill matching and context compression. Keep it on Haiku for cost efficiency.