Claude API cost awareness — token estimation, cost drivers, and efficiency strategies for Claude Code sessions
From clarc. Install: `npx claudepluginhub marvinrichter/clarc --plugin clarc`. This skill uses the workspace's default tool permissions.
Real-world feedback from Claude Code users:
Token and cost visibility is among the most requested features across major AI coding tools.
clarc tracks this automatically via the session-end hook.
| Model | Input | Output |
|---|---|---|
| Claude Haiku 4.5 | ~$0.25 / M tokens | ~$1.25 / M tokens |
| Claude Sonnet 4.6 | ~$3.00 / M tokens | ~$15.00 / M tokens |
| Claude Opus 4.6 | ~$15.00 / M tokens | ~$75.00 / M tokens |
Prices subject to change. Always verify at console.anthropic.com.
Rule of thumb: Output tokens cost 5× more than input tokens. Minimize verbose output.
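At these rates, a rough per-session estimate is simple arithmetic. A minimal shell + awk sketch using the Sonnet rates from the table above; the token counts are hypothetical, and rates should be verified before relying on the numbers:

```shell
# Estimate session cost at Sonnet rates: ~$3 / M input, ~$15 / M output.
# Rates are illustrative -- always verify current pricing.
est_cost() {
  awk -v in_tok="$1" -v out_tok="$2" \
    'BEGIN { printf "%.4f\n", in_tok / 1e6 * 3.00 + out_tok / 1e6 * 15.00 }'
}

est_cost 42000 8500   # → 0.2535 (42k input + 8.5k output tokens)
```

Note how the 8.5k output tokens cost more than the 42k input tokens — the 5× rule in action.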
EXPENSIVE: Read entire 2000-line file to find one function
CHEAP: Grep for the function, then Read only the relevant 20 lines
Each agent call = new context window = new cost. 5 sequential agents on a large codebase can easily cost $1–2.
Vision inputs are input-token-heavy. One screenshot ≈ 1,000–5,000 tokens.
Context accumulates. A 4-hour session without /compact may carry 200k+ tokens
of prior context into every new tool call. (/compact is a built-in Claude Code command, not a clarc command.)
Glob **/* on a large repo returns thousands of paths — all as input tokens.
Instead of loading an entire file, locate the relevant lines first:

```
// EXPENSIVE — loads all 500 lines into context:
Read { file_path: "src/api/users.ts" }

// CHEAP — finds the line number first, then reads only 30 lines:
Grep { pattern: "getUserById", path: "src/api/", output_mode: "content", -n: true }
// → src/api/users.ts:47:export async function getUserById(id: string) {
Read { file_path: "src/api/users.ts", offset: 47, limit: 30 }
```
On a 500-line file this saves ~470 lines of input tokens per lookup. Across a 50-file session the savings compound to tens of thousands of tokens.
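The same locate-then-read pattern works with plain shell tools. A runnable sketch — the file path and symbol from the example above are assumptions, so a 500-line stand-in file is generated in /tmp:

```shell
# Build a 500-line stand-in for src/api/users.ts, with the target on line 47.
mkdir -p /tmp/clarc-demo
awk 'BEGIN { for (i = 1; i <= 500; i++)
  print (i == 47 ? "export async function getUserById(id: string) {" : "// line " i) }' \
  > /tmp/clarc-demo/users.ts

# 1) Find the line number first (the cheap step).
line=$(grep -n 'getUserById' /tmp/clarc-demo/users.ts | head -1 | cut -d: -f1)
echo "$line"   # → 47

# 2) Read only a 30-line window instead of all 500 lines.
sed -n "${line},$((line + 29))p" /tmp/clarc-demo/users.ts | wc -l   # → 30
```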
Haiku is ~12× cheaper than Sonnet at the rates above. Use it for lightweight analysis, summaries, and high-volume worker agents (see the model selection table below).
Run /compact (built-in Claude Code command) when context > 60% full. The summary costs ~$0.01 and saves
much more in subsequent calls.
Clear task boundaries prevent scope creep. "Fix the null check in getUserById" is 10× cheaper than "review the whole auth module."
Agents protect the main context. A Haiku sub-agent doing file analysis costs far less than loading all those files into the Sonnet main context.
clarc automatically logs estimated session costs to ~/.clarc/cost-log.jsonl.
```shell
# View recent cost log entries
tail -5 ~/.clarc/cost-log.jsonl | jq .
```

For a formatted summary, run the /session-cost command inside Claude Code.
Example ~/.clarc/cost-log.jsonl entries:

```json
{"date":"2026-03-12","session_id":"ses_abc123","model":"claude-sonnet-4-6","tool_calls":{"Read":14,"Grep":8,"Edit":6,"Bash":4,"Agent":1},"est_input_tokens":42000,"est_output_tokens":8500,"est_cost_usd":0.25,"duration_min":22}
{"date":"2026-03-12","session_id":"ses_def456","model":"claude-sonnet-4-6","tool_calls":{"Read":31,"Grep":5,"Edit":12,"Bash":9,"Agent":3},"est_input_tokens":98000,"est_output_tokens":21000,"est_cost_usd":0.61,"duration_min":51}
{"date":"2026-03-11","session_id":"ses_ghi789","model":"claude-opus-4-6","tool_calls":{"Read":8,"Grep":3,"Edit":2,"Bash":2,"Agent":0},"est_input_tokens":18000,"est_output_tokens":4200,"est_cost_usd":0.59,"duration_min":14}
```
Key fields: `tool_calls` shows which tools drove cost; `est_cost_usd` is the session estimate; `Agent` calls are the most expensive per call (each spawns a new context window).
Important: These are estimates based on tool-call count heuristics. For exact costs, check console.anthropic.com → Billing.
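Because the log is JSONL, jq can aggregate it directly. A sketch, assuming the entry format shown above; the filter groups sessions by model and totals their estimated cost:

```shell
# Per-model session count and total estimated cost from the clarc log.
summarize='group_by(.model)
  | map({model: .[0].model,
         sessions: length,
         total_cost_usd: (map(.est_cost_usd) | add)})'

# Skip gracefully if no log has been written yet.
if [ -f ~/.clarc/cost-log.jsonl ]; then
  jq -s "$summarize" ~/.clarc/cost-log.jsonl
fi
```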
| Task | Model | Why |
|---|---|---|
| Code review, TDD, standard dev | Sonnet | Best quality/cost ratio |
| Lightweight analysis, summaries | Haiku | ~12× cost savings |
| Architecture decisions, complex debugging | Opus | Deep reasoning needed |
| Worker agents in pipelines | Haiku | High volume, lower stakes |
| Orchestrator in multi-agent | Sonnet | Coordination complexity |
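To make the trade-offs concrete, the same hypothetical workload (100k input / 20k output tokens) can be priced at each tier using the rates from the pricing table above (illustrative; verify current pricing):

```shell
# Same workload, three models -- per-M-token rates from the pricing table.
awk 'BEGIN {
  in_tok = 100000; out_tok = 20000
  rate["Haiku"]  = "0.25 1.25"
  rate["Sonnet"] = "3.00 15.00"
  rate["Opus"]   = "15.00 75.00"
  n = split("Haiku Sonnet Opus", order, " ")
  for (i = 1; i <= n; i++) {
    split(rate[order[i]], r, " ")
    printf "%-7s $%.2f\n", order[i], in_tok / 1e6 * r[1] + out_tok / 1e6 * r[2]
  }
}'
# → Haiku   $0.05
# → Sonnet  $0.60
# → Opus    $3.00
```

The same session costs 12× more on Sonnet than on Haiku, and 60× more on Opus — which is why worker agents in high-volume pipelines default to Haiku.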
- /session-cost — view session cost summary
- scripts/hooks/auto-checkpoint.js — checkpoint before expensive operations
- skills/cost-aware-llm-pipeline — designing cost-efficient multi-agent pipelines