By ww-w-ai
Reduce Claude Code token costs by restoring context without LLM calls, disabling verbose git instructions, tracking real-time token usage in the CLI, and generating an interactive cost dashboard. Also includes a footnote mode to minimize token consumption per prompt.
Cheaper and faster than /compact. Restores previous session context by reading transcripts directly — no LLM calls, no token cost
Max plan hit the wall? 💀 Report your 5h window data — we're mapping the rate limit formula Anthropic won't publish
Disable Claude Code built-in git instructions and inject a curated 280-tok minimum via SessionStart hook. Saves ~2,200 tok/session + ~1,700 tok/call.
Live token counter in your CLI. Shows real-time input/output/cache token counts in the Claude Code status bar
Know exactly what you spent. Interactive HTML dashboard with cost breakdown, token usage, and 5-hour window timeline across all sessions
The only Claude Code plugin that actually reads CC's source code to find where your tokens go — and fixes it automatically. Spend less, code longer.
Measured result: 45% cost reduction on a real $326/day workload → $180/day. Cache expiry prevention, automatic SubTask delegation, zero-cost context restoration, and a full analytics dashboard — in one install, zero config.
Works with Max Plan ($200/mo) and API pay-per-use. Same plugin, same features. Stronger for every user — especially when every token is real money.

| Feature | What happens | Impact |
|---|---|---|
| 🛡️ Token Guardian | Detects cache expiry, blocks $9 re-sends before they happen | Prevents the #1 silent cost spike |
| 🧠 Session Architect | Auto-delegates to SubTasks (37.5% cheaper) + parallelizes tool calls to cut round-trips | Context stays small, round-trips drop, costs compound down |
| 🪶 Concise Mode | Decision-focused responses: essential context + choices, nothing else | Fewer output tokens, faster user decisions |
| 🔄 /continue | Replaces /compact — zero LLM calls, zero cost, zero info loss | Free context restoration |
| 📊 Status Line | Real-time cost, context size, rate limit — under 50ms | See problems before they cost you |
| 📈 /usage-view | Interactive HTML dashboard with AI-powered analysis | Full cost forensics in one click |
| ✂️ /setup-git-lite | Removes 2,200 hidden tokens CC injects every session | ~$48/mo saved on git instructions alone |

Cache expiry. You come back from lunch. Cache is gone. One prompt re-sends 900K tokens at full price. $9 in a single shot.
Invisible costs. No real-time visibility. No "your context is at 800K" warning. No "cache expired 3 minutes ago" alert. You find out after the damage is done.
Context bloat. The same prompt at 200K vs 800K context costs 4x more. Every Read, Grep, Edit re-sends the full context. One complex prompt triggers 15+ API calls, each multiplied by your context size.
All manual. Context management, cache expiry timing, SubTask delegation, session cleanup. Nobody can track all this while actually coding.
Max Plan ($200/mo)? All of the above, plus a 5-hour rate limit that kills your flow with no timer and no ETA.
API pay-per-use? All of the above, except there's no ceiling. One cache miss = $9 of real money. Ten times a week = $360/mo on accidents alone. A bad Tuesday with bloated context can cost more than a Max Plan subscriber pays in a month.
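The dollar figures above can be reproduced with a few lines of arithmetic. This is a minimal sketch, not the plugin's own calculation: the rate of $10 per million tokens is only what the "$9 for 900K tokens" figure implies, not a published price, and the function names are hypothetical.

```python
# Rate implied by the text: $9 for a 900K-token re-send = $10 per million tokens.
# This is an assumption derived from the document's figures, not official pricing.
IMPLIED_USD_PER_MTOK = 10.0


def resend_cost(context_tokens: int, usd_per_mtok: float = IMPLIED_USD_PER_MTOK) -> float:
    """Cost of sending the full context once with no cache hit."""
    return context_tokens / 1_000_000 * usd_per_mtok


def monthly_miss_cost(misses_per_week: int, context_tokens: int, weeks_per_month: int = 4) -> float:
    """Monthly cost of accidental cache misses at a given context size."""
    return misses_per_week * weeks_per_month * resend_cost(context_tokens)


if __name__ == "__main__":
    # One cache miss at 900K tokens of context
    print(f"${resend_cost(900_000):.2f}")
    # The same prompt at 800K vs 200K context
    print(f"{resend_cost(800_000) / resend_cost(200_000):.1f}x")
    # Ten misses a week, four weeks a month
    print(f"${monthly_miss_cost(10, 900_000):.2f}")
```

Because cost scales linearly with context size under this model, keeping the context small reduces the price of every call, not only the ones that miss the cache.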
claude-code-token-saver handles all of it automatically. Install once. Done.
/plugin marketplace add ww-w-ai/marketplace
/plugin install claude-code-token-saver@ww-w-ai
Works automatically after install. Zero config. Requires Claude Code v2.1.71+.
For live monitoring:
/setup-statusline install
To trim 2,200 hidden tokens from CC's built-in git instructions:
/setup-git-lite install
Detects cache expiry and automatically blocks expensive re-sends.
Claude Code's prompt cache TTL is 1 hour. Step away for more than an hour and the cache expires. Your next message re-sends the entire context at full price. At 900K tokens, that's $9 in one shot.
Token Guardian tracks when the last response was received. If more than 3,590 seconds have passed (TTL minus 10-second buffer), it blocks the prompt and shows a warning.
🚨 Cache expired (68m 23s idle)
The prompt cache has expired. Continuing will resend the full context.
Cost may increase significantly.
👉 /context — Check current context usage before deciding
👉 /clear → /continue — Reset, then restore previous context (recommended, cheapest)
👉 Re-send — Continue as-is (full re-cache cost incurred)
Just re-send the same prompt after the warning and it goes through. The warning fires only once per idle period, so it never nags. Warning messages display in 23 languages based on your OS locale.
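The check described above (idle time against the 3,590-second threshold, with one warning per idle period) could be sketched as follows. This is an illustration under stated assumptions, not the plugin's actual code: `GuardState`, `check_prompt`, and the way the "already warned" flag is stored are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional

CACHE_TTL_SECONDS = 3600       # 1-hour prompt cache TTL, as stated in the text
SAFETY_BUFFER_SECONDS = 10     # buffer stated in the text
IDLE_THRESHOLD = CACHE_TTL_SECONDS - SAFETY_BUFFER_SECONDS  # 3,590 seconds


@dataclass
class GuardState:
    """Hypothetical state kept between prompts."""
    last_response_at: float              # Unix time of the last response
    warned_for: Optional[float] = None   # last_response_at we already warned about


def check_prompt(state: GuardState, now: float) -> Optional[str]:
    """Return a warning string to block the prompt, or None to let it through."""
    idle = now - state.last_response_at
    if idle <= IDLE_THRESHOLD:
        return None  # cache is presumed still warm
    if state.warned_for == state.last_response_at:
        return None  # already warned for this idle period, so the re-send passes
    state.warned_for = state.last_response_at
    minutes, seconds = divmod(int(idle), 60)
    return f"Cache expired ({minutes}m {seconds}s idle)"


if __name__ == "__main__":
    state = GuardState(last_response_at=0.0)
    print(check_prompt(state, now=4103.0))  # first attempt is blocked with a warning
    print(check_prompt(state, now=4110.0))  # re-send goes through (None)
```

Keying the "already warned" flag to the timestamp of the last response means a new response automatically re-arms the guard for the next idle period, with no separate reset step.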
Result: Every cache expiry caught at 900K tokens of context = $9 saved. At one catch per day, that's $270/mo of pure waste eliminated.
If you're on API pay-per-use, this hits harder. Max Plan subscribers lose $9 inside a $200 buffer. You lose $9 of real money — silently, repeatedly, every time you step away. Token Guardian catches it every time.
Install it and cost-optimized work patterns kick in automatically.
Most users do everything in the Main session. File reads, code generation, test runs. Every output piles into context and is re-sent with every message. The session bloats. Costs snowball.
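The snowball effect can be made concrete with a small model. This is a rough sketch under loud assumptions: every message re-sends the whole context, the context grows by a fixed number of tokens per turn, and caching and the delegated task's own cost are ignored. The token counts are illustrative only, not measurements from the plugin.

```python
def cumulative_input_tokens(base_context: int, tokens_added_per_turn: int, turns: int) -> int:
    """Total input tokens sent over `turns` messages when the full context
    is re-sent each time and grows by a fixed amount after every turn."""
    total = 0
    context = base_context
    for _ in range(turns):
        total += context                   # the full context goes out with this message
        context += tokens_added_per_turn   # this turn's output stays in the context
    return total


if __name__ == "__main__":
    # Hypothetical numbers: 50K starting context, 30 turns.
    # Main session: each turn leaves 20K tokens of tool output in context.
    main_session = cumulative_input_tokens(50_000, 20_000, 30)
    # Delegated: only a 2K-token summary returns to the main context per turn.
    delegated = cumulative_input_tokens(50_000, 2_000, 30)
    print(main_session)  # 10200000
    print(delegated)     # 2370000
```

The total grows quadratically with the number of turns, which is why shrinking what each turn leaves behind has a larger effect than the per-turn saving alone suggests.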
Session Architect automatically injects two cost strategies at session start: