Intelligent model selection for Claude Code — decision matrices, cost tables, budget planning, and subagent model assignment for optimal cost/quality tradeoffs
From claude-code-expertnpx claudepluginhub markus41/claude --plugin claude-code-expertThis skill is limited to using the following tools:
Searches, retrieves, and installs Agent Skills from prompts.chat registry using MCP tools like search_skills and get_skill. Activates for finding skills, browsing catalogs, or extending Claude.
Searches prompts.chat for AI prompt templates by keyword or category, retrieves by ID with variable handling, and improves prompts via AI. Use for discovering or enhancing prompts.
Implements structured self-debugging workflow for AI agent failures: capture errors, diagnose patterns like loops or context overflow, apply contained recoveries, and generate introspection reports.
Select the right Claude model for each task to optimize the cost/quality tradeoff.
Eliminate wasted spend by routing tasks to the cheapest model that produces acceptable quality, while ensuring complex tasks get the reasoning depth they need.
| Task Type | Recommended Model | Reasoning |
|---|---|---|
| Architecture decisions | Opus 4.6 | Needs deep multi-step reasoning, hidden coupling detection |
| Complex debugging | Opus 4.6 | Root cause analysis requires holding many hypotheses |
| Security review | Opus 4.6 | Must not miss subtle vulnerabilities |
| Standard implementation | Sonnet 4.6 | Best balance of speed, quality, and cost for code generation |
| Code review | Sonnet 4.6 | Good pattern recognition at reasonable cost |
| Refactoring | Sonnet 4.6 | Mechanical transformations with quality checks |
| Test writing | Sonnet 4.6 | Formulaic but needs understanding of code under test |
| File search / grep | Haiku 4.5 | Simple lookup, no deep reasoning needed |
| Documentation lookup | Haiku 4.5 | Reading and summarizing existing content |
| Commit message generation | Haiku 4.5 | Short, formulaic output |
| Simple Q&A | Haiku 4.5 | Direct answers, no complex analysis |
| Research subagents | Haiku 4.5 | Exploration tasks that return summaries |
Use these signals to decide when to escalate from Sonnet to Opus:
Use these signals to downgrade from Sonnet to Haiku:
| Model | Input | Output | Cache Write | Cache Read |
|---|---|---|---|---|
| Opus 4.6 | $15.00 | $75.00 | $18.75 | $1.50 |
| Sonnet 4.6 | $3.00 | $15.00 | $3.75 | $0.30 |
| Haiku 4.5 | $0.80 | $4.00 | $1.00 | $0.08 |
| Comparison | Input | Output |
|---|---|---|
| Opus vs Sonnet | 5x | 5x |
| Sonnet vs Haiku | 3.75x | 3.75x |
| Opus vs Haiku | 18.75x | 18.75x |
| Task | Model | Est. Tokens (in/out) | Est. Cost |
|---|---|---|---|
| Simple bug fix | Sonnet | 50k/10k | ~$0.30 |
| Feature implementation | Sonnet | 200k/50k | ~$1.35 |
| Architecture review | Opus | 200k/30k | ~$5.25 |
| Quick lookup | Haiku | 20k/2k | ~$0.02 |
| Research subagent | Haiku | 80k/10k | ~$0.10 |
| Full code review (council) | Mixed | 500k/100k | ~$3-8 |
When using cc-orchestrate or spawning subagents, assign models by role:
Research agents → Haiku (cheap exploration, summary return)
Implementation agents → Sonnet (code generation quality)
Review/audit agents → Sonnet or Opus (depends on risk)
Architecture agents → Opus (deep reasoning required)
builder agent → Sonnet 4.6 (writes code)
validator agent → Sonnet 4.6 (reviews code)
researcher agents (3x) → Haiku 4.5 (parallel exploration)
synthesizer agent → Sonnet 4.6 (combines findings)
Before starting a task, estimate cost:
/model or claude -m| Content Type | Tokens per Line |
|---|---|
| TypeScript/JavaScript | ~10 |
| Python | ~8 |
| JSON/YAML | ~6 |
| Markdown | ~5 |
| Minified code | ~15 |
cat-ing entire large files; use Grep with limits--max-turns in headless mode to cap automated sessions# Start with research on Haiku
/model claude-haiku-4-5-20251001
# "Find all files related to auth, summarize the architecture"
# Switch to Sonnet for implementation
/model claude-sonnet-4-6
# "Implement the new auth middleware based on the research above"
# Switch to Opus for the tricky part
/model claude-opus-4-6
# "Review the session handling for race conditions and edge cases"
CLAUDE_MODEL=claude-sonnet-4-6 # Default model for sessions
ANTHROPIC_MODEL=claude-sonnet-4-6 # Alternative env var
{
"model": "claude-sonnet-4-6",
"smallFastModel": "claude-haiku-4-5-20251001"
}
The smallFastModel is used for internal operations like skill matching and context compression. Keep it on Haiku for cost efficiency.