Manage Claude Code API costs - token strategies, model selection, monitoring. Use when concerned about API spend, optimizing token usage, choosing models for tasks, or setting up cost monitoring. Covers /cost command, batching strategies, and budget management.
/plugin marketplace add hgeldenhuys/claude-code-sdk
/plugin install hgeldenhuys-session-naming@hgeldenhuys/claude-code-sdk

This skill inherits all available tools. When active, it can use any tool Claude has access to.
MODEL-SELECTION.md, MONITORING.md, TOKEN-STRATEGIES.md

Reduce Claude Code API costs while maintaining quality through smart token management, model selection, and monitoring.
| Strategy | Impact | Effort |
|---|---|---|
| Use Haiku for simple tasks | High | Low |
| Batch related operations | Medium | Low |
| Use /compact strategically | Medium | Low |
| Reduce context size | High | Medium |
| Efficient prompting | Medium | Medium |
Claude Code costs are based on tokens:
| Model | Input Cost | Output Cost | Best For |
|---|---|---|---|
| Haiku | $ | $ | Simple tasks, exploration |
| Sonnet | $$ | $$ | General development (default) |
| Opus | $$$$$ | $$$$$ | Complex reasoning, architecture |
Rule of thumb: Opus is roughly 15x more expensive than Haiku for the same tokens. The exact ratio depends on the model versions in use, so check current pricing.
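To make the rule of thumb concrete, here is a minimal sketch of the arithmetic. The rates are placeholder relative units, not real prices: they are chosen only so that Opus comes out 15x Haiku as stated above. The Sonnet figure and the 5x output-to-input ratio are assumptions for illustration; substitute current published prices before relying on the numbers.

```python
# Relative cost units per million tokens: (input, output).
# Illustrative placeholders only, NOT real prices. The Sonnet row and the
# 5x output/input ratio are assumptions, not taken from this guide.
RATES = {
    "haiku": (1.0, 5.0),
    "sonnet": (3.0, 15.0),
    "opus": (15.0, 75.0),
}


def estimate_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Return the cost of one session in the same relative units as RATES."""
    in_rate, out_rate = RATES[model]
    return (input_tokens * in_rate + output_tokens * out_rate) / 1_000_000


# Same workload on two models: 200k input tokens, 20k output tokens.
haiku_cost = estimate_cost("haiku", 200_000, 20_000)
opus_cost = estimate_cost("opus", 200_000, 20_000)
print(f"haiku={haiku_cost:.2f} opus={opus_cost:.2f}")
```

Because both rates scale by the same factor in this sketch, the Opus total is 15 times the Haiku total for any token mix.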
| Activity | Token Impact | Optimization |
|---|---|---|
| Reading files | High | Read selectively, use grep |
| Long conversations | Cumulative | Use /compact regularly |
| Tool outputs | Variable | Request summaries |
| Code generation | Medium | Be specific in requests |
| Error messages | Low | N/A |
> /cost
Shows:
| Metric | Good | Concern | Action |
|---|---|---|---|
| Context usage | <50% | >70% | Consider /compact |
| Session cost | Varies | Unexpected spike | Review recent operations |
| Output ratio | Balanced | Output >> Input | Responses too verbose |
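The context-usage thresholds in the table can be expressed as a small check. This is a sketch only: the function name, the 200,000-token window used in the example, and the "watch" label for the band between 50% and 70% (which the table does not name) are all assumptions.

```python
def context_action(used_tokens: int, window_tokens: int) -> str:
    """Map context usage to the guidance in the table above."""
    usage = used_tokens / window_tokens
    if usage > 0.70:
        return "concern: consider /compact"
    if usage < 0.50:
        return "good"
    # The table leaves 50-70% unlabelled; "watch" is this sketch's own label.
    return "watch"


# Hypothetical 200,000-token context window.
print(context_action(80_000, 200_000))
print(context_action(150_000, 200_000))
```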
View usage statistics over time:
> /stats
Date Range Filtering (2.1.6+): Press r to cycle between:
Shows:
When you have many MCP tools configured, their descriptions can consume significant context space. Version 2.1.7 introduces automatic MCP tool deferral:
- MCP tools are discovered through MCPSearch instead of loaded upfront

| MCP Tools | Without Auto Mode | With Auto Mode | Savings |
|---|---|---|---|
| 10-20 tools | ~2-5% context | ~1% context | 50-80% |
| 50+ tools | ~10-20% context | ~1% context | 90%+ |
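The savings column follows directly from the two context columns. A minimal sketch of that calculation, using the table's own endpoints (the function name is hypothetical):

```python
def context_savings(without_pct: float, with_pct: float) -> float:
    """Percentage of MCP tool context saved by deferring tool loading."""
    return (without_pct - with_pct) / without_pct * 100


# 10-20 tools: 2-5% of context without auto mode, about 1% with it.
low_end = context_savings(2, 1)
high_end = context_savings(5, 1)
# 50+ tools: 10-20% of context without auto mode, about 1% with it.
many_tools = context_savings(10, 1)
print(low_end, high_end, many_tools)
```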
If you need all MCP tools loaded upfront (e.g., for specific workflows):
```json
// settings.json
{
  "disallowedTools": ["MCPSearch"]
}
```
Note: Only disable if you have few MCP tools or specifically need immediate tool availability.
Expensive:
> Read the entire src/ directory to understand the codebase
Efficient:
> @src/api/users.ts @src/types/user.ts - I need to modify the user API
Expensive:
> Find all files that use the AuthService class
[Claude reads many files to find them]
Efficient:
> grep for "AuthService" in src/, then I'll look at the most relevant ones
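If you prefer to do the search yourself before involving Claude at all, the same "search first, read selectively" idea can be sketched locally. The function name, the `*.ts` default, and the ranking by occurrence count are assumptions made for this illustration.

```python
from pathlib import Path


def files_mentioning(symbol: str, root: str, pattern: str = "*.ts") -> list[tuple[str, int]]:
    """Return (path, occurrence count) pairs for files containing symbol,
    most occurrences first, so only the top few need to be @-mentioned."""
    hits = []
    for path in Path(root).rglob(pattern):
        if not path.is_file():
            continue
        count = path.read_text(encoding="utf-8", errors="ignore").count(symbol)
        if count:
            hits.append((str(path), count))
    return sorted(hits, key=lambda item: item[1], reverse=True)
```

Calling `files_mentioning("AuthService", "src")` would return a ranked list from which you can pick the two or three most relevant files to mention explicitly.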
| Pattern | Token Cost | Use Case |
|---|---|---|
| @src/ | Very High | Avoid unless necessary |
| @src/api/ | High | When exploring a module |
| @src/api/users.ts | Low | Specific file work |
| @src/api/users.ts:50-100 | Very Low | Specific section |
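To get a feel for why the scope of a mention matters, the sketch below compares rough token estimates for a directory, a file, and a line range. It assumes about 4 characters per token, which is a coarse heuristic that varies with content and tokenizer; the function names are hypothetical.

```python
from pathlib import Path

CHARS_PER_TOKEN = 4  # rough heuristic (assumption), not an exact tokenizer


def estimate_tokens(text: str) -> int:
    return len(text) // CHARS_PER_TOKEN


def estimate_path_tokens(path: str) -> int:
    """Estimate tokens for one file, or for every file under a directory."""
    p = Path(path)
    files = [p] if p.is_file() else [f for f in p.rglob("*") if f.is_file()]
    return sum(
        estimate_tokens(f.read_text(encoding="utf-8", errors="ignore"))
        for f in files
    )


def estimate_range_tokens(path: str, start: int, end: int) -> int:
    """Estimate tokens for lines start..end (1-indexed, inclusive) of a file."""
    lines = Path(path).read_text(encoding="utf-8", errors="ignore").splitlines()
    return estimate_tokens("\n".join(lines[start - 1:end]))
```

Running these against your own tree shows how quickly a whole-directory mention outgrows a single file or a 50-line section.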
> Analyze this file and give me a brief summary of the key functions
vs
> Explain every line of this file
Expensive (multiple turns):
> Read file A
> Now modify line 10
> Now read file B
> Modify line 20
Efficient (single turn):
> In file A, update the getUserById function to handle null.
> In file B, add the new UserNotFound error type.
> Run the tests after both changes.
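The reason fewer turns cost less is that the conversation history is sent again as input on every turn. The sketch below models that effect under simplifying assumptions that are not from this guide: it ignores prompt caching and treats each turn as a single request. All token counts are made up for illustration.

```python
def cumulative_input_tokens(base_context: int, turns: list[tuple[int, int]]) -> int:
    """Total input tokens billed across turns.

    turns holds (prompt_tokens, response_tokens) per turn. Simplified model:
    the whole history is re-sent each turn, with no caching.
    """
    history = base_context
    total = 0
    for prompt, response in turns:
        history += prompt
        total += history      # the full history is the input for this turn
        history += response   # the response joins the history for later turns
    return total


# Hypothetical numbers: 5,000 tokens of existing context in both cases.
multi_turn = cumulative_input_tokens(5_000, [(20, 500), (20, 300), (20, 500), (20, 300)])
single_turn = cumulative_input_tokens(5_000, [(80, 1_600)])
print(multi_turn, single_turn)
```

In this toy model the four-turn version bills roughly four times the input tokens of the single batched request, even though the work requested is the same.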
Use /compact when:
Cost impact: Reduces ongoing costs by 50-80%
Use /clear when:
Cost impact: Resets to zero (but loses all context)
| Situation | Command | Reasoning |
|---|---|---|
| Same task, full context | /compact | Preserve progress |
| Different project | /clear | Irrelevant context |
| Stuck on approach | /clear | Fresh perspective |
| After major milestone | /compact | Keep decisions |
| Testing something new | /clear | Clean state |
| Task Type | Recommended Model | Why |
|---|---|---|
| File exploration | Haiku | Fast, cheap, sufficient |
| Simple edits | Haiku | Straightforward |
| General coding | Sonnet | Balanced (default) |
| Bug fixing | Sonnet | Needs reasoning |
| Architecture design | Opus | Deep analysis |
| Security review | Opus | Critical thinking |
| Complex refactoring | Opus | Multi-file reasoning |
Set model in skill frontmatter:
```yaml
---
model: haiku
---
```
Or request model in prompt:
> Using Haiku, list all TypeScript files in src/
Task: Review 10 files for security issues
| Approach | Estimated Cost |
|---|---|
| Opus reviews all | $$$$$ |
| Haiku scans, Opus reviews flagged | $$ |
| Sonnet reviews all | $$$ |
Best strategy: Use Haiku for initial scan, escalate to Opus for detailed review of potential issues.
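A rough comparison of the three approaches, as a sketch under stated assumptions: relative per-token rates of 1, 3, and 15 for Haiku, Sonnet, and Opus (only the 15x Opus-to-Haiku ratio comes from this guide), 2,000 tokens per file, and one file flagged by the Haiku scan. Output tokens are ignored for simplicity.

```python
# Relative cost per token. Only the Opus/Haiku ratio of 15 is from the guide;
# the Sonnet value is an assumption.
HAIKU, SONNET, OPUS = 1, 3, 15

FILES = 10
TOKENS_PER_FILE = 2_000   # hypothetical average file size
FLAGGED = 1               # hypothetical number of files Haiku escalates


def review_cost(files: int, tokens_per_file: int, rate: int) -> int:
    return files * tokens_per_file * rate


opus_all = review_cost(FILES, TOKENS_PER_FILE, OPUS)
sonnet_all = review_cost(FILES, TOKENS_PER_FILE, SONNET)
# Haiku reads everything, then Opus re-reads only the flagged files.
tiered = review_cost(FILES, TOKENS_PER_FILE, HAIKU) + review_cost(FLAGGED, TOKENS_PER_FILE, OPUS)
print(opus_all, sonnet_all, tiered)
```

Under these assumed rates the tiered approach is cheapest only while few files are escalated; once more than about 2 in 15 files go to Opus, reviewing everything with Sonnet costs less, so the quality of the initial scan matters.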
| Verbose | Concise | Savings |
|---|---|---|
| "Could you please" | [Just ask] | 3-4 tokens |
| "I want you to" | [State task] | 4-5 tokens |
| Long explanations | Bullet points | 20-50% |
| Repeated context | @ mentions | Significant |
Token-heavy:
> I have this function that gets users from the database and I want
> to add some caching because it's being called too often and making
> the app slow. Can you help me figure out a good caching strategy?
Efficient:
> Add Redis caching to getUserById in @src/api/users.ts.
> TTL: 5 minutes. Invalidate on user update.
> Implement user search:
> - [ ] Add search endpoint
> - [ ] Add debounced input
> - [ ] Handle empty results
> Run tests when done.
A checklist like this is clearer than a long paragraph description.
Instead of multiple turns:
> Add logging to function A
[response]
> Add logging to function B
[response]
> Add logging to function C
Single turn:
> Add consistent logging to functions A, B, and C in @src/utils.ts
> Use format: logger.info("[FunctionName] action", { params })
> Review @src/api/*.ts for missing error handling.
> Add try-catch with proper logging to any functions that need it.
> Summarize changes made.
| Session Type | Typical Cost Range |
|---|---|
| Quick fix | $ |
| Feature implementation | $$-$$$ |
| Large refactor | $$$-$$$$ |
| Architecture session (Opus) | $$$$$ |
> /cost
[Note the total]
Track across sessions to understand your patterns.
Subagents have isolated context:
```markdown
---
name: explorer
model: haiku
tools: Read, Glob, Grep
---
Explore and summarize. Return only key findings.
```
| Task | Agent Model | Return |
|---|---|---|
| Find all API routes | Haiku | Route list |
| Analyze dependencies | Haiku | Summary |
| Review for patterns | Sonnet | Findings |
| Deep security review | Opus | Detailed report |
| Pattern | Why Wasteful | Better Approach |
|---|---|---|
| Reading entire directories | Massive token cost | Grep first, read specific |
| Verbose explanations | Unnecessary output | Request concise |
| Repeating context | Already in history | Use @ mentions |
| Not using /compact | Growing costs | Compact at 70% |
| Opus for everything | Expensive overkill | Match model to task |
| Long debugging sessions | Cumulative cost | Clear and restart |
| File | Contents |
|---|---|
| TOKEN-STRATEGIES.md | Detailed token reduction techniques |
| MODEL-SELECTION.md | Model comparison and selection guide |
| MONITORING.md | Cost tracking and budget management |
| Situation | Action |
|---|---|
| Context at 70% | /compact |
| Simple file exploration | Use Haiku |
| Need deep analysis | Use Opus (worth the cost) |
| Unexpected high cost | Check recent operations |
| Switching tasks | /clear to save costs |
| Debugging loop | Clear and try fresh approach |