Use when context is growing large, responses are slow, or you need to optimize token budget across system prompt, background processes, and model selection
From superpowers-ecc
npx claudepluginhub aman-2709/superpowers-ecc --plugin superpowers-ecc
This skill uses the workspace's default tool permissions.
Strategies for managing and reducing token consumption across Claude Code sessions. Complements the strategic-compact skill (which covers compaction mechanics) — this skill covers everything else in token budget management.
The system prompt (CLAUDE.md, settings, skills) is injected on every turn. Every byte here is multiplied by the number of turns in your session.
# BEFORE (bloated — 2000 tokens)
## Project Overview
This is a Next.js application that uses TypeScript and Tailwind CSS.
We follow the Airbnb style guide with some modifications. The project
was started in January 2024 and uses pnpm as the package manager.
Our testing framework is Vitest with React Testing Library...
[3 more paragraphs of prose]
## Coding Standards
- Always use TypeScript strict mode
- Always use functional components
- Always use named exports
- Always add JSDoc comments to public functions
- Always handle errors with try/catch
- Always use semantic HTML
[20 more "always" rules]
# AFTER (lean — 400 tokens)
## Stack
Next.js 14, TypeScript (strict), Tailwind, Vitest, pnpm
## Standards
Functional components, named exports, semantic HTML.
Error handling: try/catch with typed errors.
See eslint config for style rules — don't duplicate here.
Rules of thumb:
docs/architecture.md" instead of copying architecture docs
# BEFORE: 500 tokens of API design rules inlined in CLAUDE.md
## API Design
[entire api-design skill content pasted here]
# AFTER: 10 tokens
Follow the `api-design` skill for API conventions.
Skills are loaded on-demand when relevant. CLAUDE.md is loaded on every turn.
Common sources of redundancy:
# Check total CLAUDE.md size
wc -c .claude/CLAUDE.md ~/.claude/CLAUDE.md 2>/dev/null
# Rough token estimate (1 token ~ 4 chars for English)
wc -c .claude/CLAUDE.md | awk '{printf "~%d tokens\n", $1/4}'
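To audit several files at once, the same heuristic can be wrapped in a small function. This is a minimal sketch: `estimate_tokens` is a hypothetical helper name, and the 4-characters-per-token ratio is the rough English-text estimate used above, not an exact tokenizer count.

```shell
# Print a rough token estimate per file and a running total.
# Assumes ~4 characters per token; missing files are skipped silently.
estimate_tokens() {
  local file chars total=0
  for file in "$@"; do
    [ -f "$file" ] || continue
    chars=$(wc -c < "$file")
    printf '%-40s ~%d tokens\n' "$file" $((chars / 4))
    total=$((total + chars / 4))
  done
  printf 'total ~%d tokens\n' "$total"
}

# Example invocation:
# estimate_tokens .claude/CLAUDE.md ~/.claude/CLAUDE.md
```

Because the system prompt is resent on every turn, the total is a per-turn cost rather than a one-time cost.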
Hooks (PreToolUse, PostToolUse) and background processes consume tokens on every tool call. A verbose hook that runs on every Write or Edit can silently burn thousands of tokens per session.
# List all configured hooks
jq '.hooks' .claude/settings.json
# Check hook script sizes and output verbosity
# Run a hook manually and measure output size
echo "test" | wc -c # baseline
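One way to measure a hook's footprint is to run its command by hand and count what it prints. The following is a minimal sketch, assuming the hook can be run as an ordinary command and that both stdout and stderr reach the model; `measure_output_tokens` is a hypothetical helper, and the 4-characters-per-token ratio is only a rough heuristic.

```shell
# Run a command and report a rough token estimate for everything it prints.
measure_output_tokens() {
  local chars
  # Merge stderr into stdout so both streams are counted.
  chars=$("$@" 2>&1 | wc -c)
  echo "~$((chars / 4)) tokens"
}

# Example invocation:
# measure_output_tokens npm run lint
```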
# BEFORE: Hook dumps full lint output (500+ lines)
npm run lint
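# AFTER: Hook reports only the summary line
# "--" forwards --quiet to the lint script (assuming it accepts the flag, as ESLint does);
# without it, npm treats --quiet as its own log-level option.
npm run lint -- --quiet 2>&1 | tail -1
# Or even better: exit code only
npm run lint > /dev/null 2>&1 && echo "OK" || echo "LINT FAIL"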
| Strategy | Token Impact | Example |
|---|---|---|
| Reduce output verbosity | High | --quiet, --silent, pipe to /dev/null |
| Run conditionally | High | Only lint changed files, not entire project |
| Batch hook execution | Medium | Combine multiple checks into one hook |
| Remove redundant hooks | High | Don't lint on every edit if you lint pre-commit |
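The "run conditionally" row can be sketched as a hook script that lints only files changed in the working tree. This is an illustration under assumptions: the project uses ESLint via `npx`, the extension list is a guess at what the linter covers, and `filter_lintable` and `lint_changed` are hypothetical helper names.

```shell
# Keep only paths the linter cares about (the extension list is an assumption).
filter_lintable() {
  grep -E '\.(ts|tsx|js|jsx)$' || true
}

# Lint only added, copied, or modified files; print a one-word result.
lint_changed() {
  local files
  files=$(git diff --name-only --diff-filter=ACM | filter_lintable)
  if [ -z "$files" ]; then
    echo "OK (nothing to lint)"
    return 0
  fi
  # Word splitting on $files is intentional; paths containing spaces are not handled.
  npx eslint --quiet $files > /dev/null 2>&1 && echo "OK" || echo "LINT FAIL"
}
```

This combines two rows of the table: the hook does less work, and it emits a handful of tokens regardless of how many problems the linter finds.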
For a session with N tool calls and a hook that produces M tokens of output:
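The natural estimate is N × M, since every tool call pays the hook's output cost again. A small arithmetic sketch follows; the numbers are illustrative, not measurements, and `hook_overhead` is a hypothetical helper name.

```shell
# Total tokens a hook adds to a session: tool calls (N) times tokens per run (M).
hook_overhead() {
  echo $(( $1 * $2 ))
}

hook_overhead 200 500   # verbose hook over 200 tool calls: prints 100000
hook_overhead 200 5     # terse OK/FAIL hook over 200 tool calls: prints 1000
```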
Not every task needs the most capable (and most expensive) model. Route tasks to the right model for the best cost/quality tradeoff.
| Model | Best For | Cost | Quality |
|---|---|---|---|
| Opus | Complex architecture, multi-file refactoring, nuanced code review, novel problem-solving | Highest | Highest |
| Sonnet | Standard feature implementation, bug fixes, test writing, most day-to-day coding | Medium | High |
| Haiku | Simple edits, formatting, boilerplate generation, quick lookups, file renaming | Lowest | Good for simple tasks |
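The table can be read as a simple lookup from task type to model tier. The sketch below encodes it that way; the task labels are hypothetical categories chosen for illustration, and falling back to Sonnet is an assumption based on the table describing it as suitable for most day-to-day coding.

```shell
# Map a coarse task label to a model tier, following the table above.
pick_model() {
  case "$1" in
    architecture|refactor|review)     echo "opus" ;;
    feature|bugfix|tests)             echo "sonnet" ;;
    format|boilerplate|rename|lookup) echo "haiku" ;;
    *)                                echo "sonnet" ;;  # assumed default
  esac
}

pick_model refactor   # prints opus
pick_model rename     # prints haiku
```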
Use Opus when:
Use Sonnet when:
Use Haiku when:
Switch models mid-session when the task complexity changes:
Signs your context is getting large:
| Situation | Action |
|---|---|
| Completed a major milestone, starting new work | /compact (see strategic-compact skill) |
| Context is large but you're mid-task | Continue — compacting mid-task loses critical context |
| Exploring many files but haven't started implementing | /compact with a summary of findings, then implement |
| Session has drifted far from original task | Start fresh — compaction can't recover lost focus |
| Debugging with many failed attempts in context | Start fresh — old failures add noise, not signal |
Read selectively:
# BEFORE: Read entire file (2000 lines)
Read file.ts
# AFTER: Read only what you need
Read file.ts lines 45-80
Search before reading:
# BEFORE: Read 5 files looking for a function
# AFTER: Grep to find it, then read only that file
rg "function handleAuth" --type ts -l
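Searching and selective reading can be combined: find the first match, then print only a small window around it. A minimal sketch using standard `grep` and `sed`; `show_match` is a hypothetical helper, and the default of 5 context lines is arbitrary.

```shell
# Print a window of lines around the first match of a pattern in a file.
# Usage: show_match PATTERN FILE [CONTEXT_LINES]
show_match() {
  local pattern=$1 file=$2 context=${3:-5} line start
  # Line number of the first match, or empty if there is none.
  line=$(grep -n -m1 -- "$pattern" "$file" | cut -d: -f1)
  [ -n "$line" ] || return 1
  # Clamp the window start to the first line of the file.
  start=$((line > context ? line - context : 1))
  sed -n "${start},$((line + context))p" "$file"
}

# Example invocation:
# show_match "function handleAuth" src/auth.ts 10
```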
Limit tool output:
# BEFORE: Full test output (500 lines)
npm test
# AFTER: Summary only
npm test 2>&1 | tail -20
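One caveat with piping into `tail` is that, by default, the pipeline's exit status is `tail`'s rather than the test runner's. A sketch of a wrapper that trims output while preserving the command's own exit status; `run_quiet` is a hypothetical helper, and the 20-line limit mirrors the example above.

```shell
# Run a command, show only the last 20 lines of its output,
# and return the command's own exit status.
run_quiet() {
  local log status
  log=$(mktemp)
  "$@" > "$log" 2>&1
  status=$?
  tail -n 20 "$log"
  rm -f "$log"
  return "$status"
}

# Example invocation:
# run_quiet npm test && echo "TESTS OK" || echo "TESTS FAIL"
```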
For multi-phase tasks, plan your token budget:
Phase 1: Research (explore files, understand codebase)
→ Target: 30% of budget
→ Compact after research, keep findings summary
Phase 2: Plan (design solution, get approval)
→ Target: 10% of budget
→ Compact after plan approval, keep the plan
Phase 3: Implement (write code)
→ Target: 40% of budget
→ Compact between major components if needed
Phase 4: Verify (tests, review)
→ Target: 20% of budget
→ No compaction needed — session is ending
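The percentages above translate into concrete targets once a total is chosen. A small sketch, where the 200000-token total is purely an illustrative figure and `phase_budget` is a hypothetical helper name:

```shell
# Split a total token budget across the four phases (30/10/40/20).
phase_budget() {
  local total=$1
  printf 'research   %d\n' $((total * 30 / 100))
  printf 'plan       %d\n' $((total * 10 / 100))
  printf 'implement  %d\n' $((total * 40 / 100))
  printf 'verify     %d\n' $((total * 20 / 100))
}

phase_budget 200000
```

Overrunning the research target is a signal to compact and carry forward a findings summary before moving on to implementation.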
Rough cost awareness for planning: