From claude-code-expert
Tracks tokens, analyzes caching behavior, identifies bottlenecks in tool usage, and estimates costs to optimize Claude Code session performance and efficiency.
npx claudepluginhub markus41/claude --plugin claude-code-expert
How this skill is triggered (by the user, by Claude, or both)
Slash command
/claude-code-expert:session-analytics
This skill is limited to the following tools:
The summary Claude sees in its skill listing — used to decide when to auto-load this skill
Understand where tokens go, identify waste, and optimize Claude Code sessions for cost and speed.
Give users the tools to measure, understand, and improve the efficiency of their Claude Code sessions.
Run /cost in any session to see:
Input tokens: 145,230
Output tokens: 28,450
Cache read tokens: 89,100 (cheaper — 10% of input price)
Cache write tokens: 12,400 (25% more than input price)
Total estimated cost: $0.87
| Category | What it represents | Cost relative to input |
|---|---|---|
| Input tokens | New content sent to the model each turn | 1.0x |
| Output tokens | Content the model generates | 5.0x (Opus/Sonnet) |
| Cache read | Content matched from prompt cache | 0.1x |
| Cache write | Content added to prompt cache | 1.25x |
Cache reads are your best friend — they're 10x cheaper than fresh input tokens.
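As a rough sketch of how the multipliers in the table combine, the snippet below converts a usage breakdown into "input-token equivalents" and then into dollars. The function names and the base price of $3 per million input tokens are assumptions for illustration, not values reported by /cost, so the result will not necessarily match the sample total above.

```python
# Relative cost multipliers from the table above (input tokens = 1.0x).
MULTIPLIERS = {
    "input": 1.0,
    "output": 5.0,
    "cache_read": 0.1,
    "cache_write": 1.25,
}


def input_equivalent_tokens(usage: dict[str, int]) -> float:
    """Weight each token category by its cost relative to input tokens."""
    return sum(MULTIPLIERS[kind] * count for kind, count in usage.items())


def estimate_cost(usage: dict[str, int], input_price_per_mtok: float) -> float:
    """Estimate USD cost; input_price_per_mtok is an assumed base price."""
    return input_equivalent_tokens(usage) * input_price_per_mtok / 1_000_000


# Token counts from the sample /cost output above.
usage = {
    "input": 145_230,
    "output": 28_450,
    "cache_read": 89_100,
    "cache_write": 12_400,
}

equivalent = input_equivalent_tokens(usage)
cost = estimate_cost(usage, input_price_per_mtok=3.0)  # hypothetical price
print(f"{equivalent:,.0f} input-equivalent tokens, about ${cost:.2f}")
```

Note how the 89,100 cache-read tokens weigh less than the 12,400 cache-write tokens once the multipliers are applied.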
Each turn resends the context accumulated so far, so the conversation history is the main cost driver. It grows monotonically until you run /compact.
| Pattern | Symptom | Fix |
|---|---|---|
| Repeated file reads | Same file in tool calls 3+ times | Read once, reference from memory |
| Over-broad Bash output | ls -R or cat on large files | Use Glob/Grep with limits |
| Unnecessary subagent spawning | Subagent for trivial lookup | Direct tool call instead |
| Large tool output | Bash command returns 500+ lines | Pipe through head or tail |
| Context thrashing | /compact then immediately re-read same files | Better anchor planning |
| Wrong model tier | Opus for file search | Switch to Haiku for lookups |
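The first pattern in the table can be checked mechanically. The sketch below assumes a hypothetical log of tool calls represented as dictionaries with `tool` and `path` keys; Claude Code does not necessarily expose its tool calls in this shape.

```python
from collections import Counter


def repeated_reads(tool_calls: list[dict], threshold: int = 3) -> dict[str, int]:
    """Return files read at least `threshold` times in a session.

    tool_calls uses a hypothetical format: {"tool": "Read", "path": "..."}.
    """
    counts = Counter(
        call["path"]
        for call in tool_calls
        if call.get("tool") == "Read" and "path" in call
    )
    return {path: n for path, n in counts.items() if n >= threshold}


calls = [
    {"tool": "Read", "path": "src/app.ts"},
    {"tool": "Grep", "pattern": "TODO"},
    {"tool": "Read", "path": "src/app.ts"},
    {"tool": "Read", "path": "README.md"},
    {"tool": "Read", "path": "src/app.ts"},
]
flagged = repeated_reads(calls)
print(flagged)
```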
From most to least expensive per call (typical):
Good efficiency indicators:
Claude Code automatically caches the following between turns:
Cache hits occur when the same content prefix appears in consecutive turns. This means:
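A simplified way to picture prefix matching: treat each request as an ordered list of content blocks and count how many leading blocks are identical to the previous request. The block labels below are illustrative placeholders, and real caching works at a finer granularity than this sketch models.

```python
def shared_prefix_len(previous: list[str], current: list[str]) -> int:
    """Count leading blocks that are identical in both requests."""
    n = 0
    for old, new in zip(previous, current):
        if old != new:
            break
        n += 1
    return n


turn_1 = ["system", "tools", "msg-1", "msg-2"]

# Appending a message keeps the whole previous request as a reusable prefix.
turn_2 = turn_1 + ["msg-3"]
appended = shared_prefix_len(turn_1, turn_2)

# Changing an early block means nothing after it can match.
turn_2_edited = ["system-v2", "tools", "msg-1", "msg-2", "msg-3"]
edited = shared_prefix_len(turn_1, turn_2_edited)

print(appended, edited)
```

In this model, appending to the end preserves the prefix, while any change near the start discards it.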
These actions invalidate the cache:
/compact: rewrites conversation history
Estimate cost using these heuristics:
| Task Type | Model | Typical Turns | Typical Cost |
|---|---|---|---|
| Quick bug fix | Sonnet | 5-10 | $0.10-0.30 |
| Feature implementation | Sonnet | 15-30 | $0.50-2.00 |
| Large refactor | Sonnet | 30-60 | $2.00-5.00 |
| Architecture analysis | Opus | 10-20 | $3.00-8.00 |
| Code review (council) | Mixed | 20-40 | $3.00-10.00 |
| Research task | Haiku | 5-15 | $0.02-0.10 |
| File Type | Avg Tokens/Line | 100-Line File |
|---|---|---|
| TypeScript | ~10 | ~1,000 |
| Python | ~8 | ~800 |
| JSON | ~6 | ~600 |
| Markdown | ~5 | ~500 |
| YAML | ~5 | ~500 |
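The per-line averages above can be turned into a quick estimator. This is only a sketch: the function name is hypothetical, and the averages are the approximate figures from the table, so real token counts will vary.

```python
# Approximate average tokens per line, taken from the table above.
TOKENS_PER_LINE = {
    "typescript": 10,
    "python": 8,
    "json": 6,
    "markdown": 5,
    "yaml": 5,
}


def estimate_file_tokens(file_type: str, line_count: int) -> int:
    """Rough token estimate for reading a file of the given type and length."""
    return TOKENS_PER_LINE[file_type.lower()] * line_count


estimate = estimate_file_tokens("Python", 250)
print(f"A 250-line Python file is roughly {estimate:,} tokens")
```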
Ordered by impact:
head, tail, --limit on commands
For teams and repeat workflows:
| Metric | Formula | Target |
|---|---|---|
| Cost per commit | total session cost / commits produced | < $1.00 |
| Context efficiency | useful output tokens / total input tokens | > 15% |
| Cache hit rate | cache read tokens / total input tokens | > 50% |
| Tokens per task | total tokens / tasks completed | decreasing over time |
Run /cost periodically.