Install:

```
npx claudepluginhub cwinvestments/memstack --plugin memstack
```

This skill uses the workspace's default tool permissions.
*Monitor and manage Headroom context compression for CC sessions.*
When this skill activates, output:
```
⚙️ Compress — Checking Headroom status...
```
Then execute the protocol below.
| Context | Status |
|---|---|
| User says "headroom", "compression stats", "check proxy" | ACTIVE — run status check |
| User asks about token savings or context window | ACTIVE — run session report |
| Proxy errors or API connection failures appear | ACTIVE — run health diagnostics |
| General discussion about CC features | DORMANT — do not activate |
| User is actively coding (no proxy issues) | DORMANT — do not activate |
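As a hypothetical illustration of the table above (Claude Code's actual skill matching is model-driven, not keyword-based), the ACTIVE rows reduce to a simple substring check:

```python
# Hypothetical sketch of the activation table; Claude Code's real skill
# matching is model-driven, not a literal keyword scan.
ACTIVE_TRIGGERS = (
    "headroom", "compression stats", "check proxy",
    "token savings", "context window", "proxy error",
)

def should_activate(message: str) -> bool:
    """Return True if the message matches an ACTIVE row of the table."""
    text = message.lower()
    return any(trigger in text for trigger in ACTIVE_TRIGGERS)

print(should_activate("check proxy please"))        # ACTIVE
print(should_activate("let's refactor this code"))  # DORMANT
```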
Headroom is a transparent proxy between Claude Code and the Anthropic API that compresses tool outputs by removing redundant boilerplate. It extends effective context window by 30–40%.
This skill checks proxy health, reports compression stats, and troubleshoots connection issues.
```
pip install "headroom-ai[code]"
```

(Quoting the package spec keeps zsh from treating the brackets as a glob.)

The `[code]` extra installs tree-sitter for AST-based code compression. Without it, Code-Aware compression is disabled and CC sessions get 0% compression.

Start the proxy (defaults to localhost:8787):

```
headroom proxy --llmlingua-device cpu
```

The `--llmlingua-device cpu` flag forces LLMLingua to use the CPU, avoiding silent CUDA failures on machines without a GPU.

Point Claude Code at the proxy:

```
ANTHROPIC_BASE_URL=http://127.0.0.1:8787
```

Run:

```
curl -s http://127.0.0.1:8787/stats | python -m json.tool
```
Report: proxy up/down, requests processed, compression ratio, tokens saved, estimated cost savings.
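A minimal sketch of that status check in Python, assuming `/stats` returns JSON; the field names used below (`requests`, `compression_ratio`, `tokens_saved`) are assumptions based on this document, not a confirmed Headroom schema:

```python
# Query Headroom's /stats endpoint and summarize it. Field names here
# ("requests", "compression_ratio", "tokens_saved") are assumptions from
# this document, not a confirmed schema.
import json
import urllib.request

def headroom_stats(url: str = "http://127.0.0.1:8787/stats"):
    """Return the parsed stats JSON, or None if the proxy is unreachable."""
    try:
        with urllib.request.urlopen(url, timeout=3) as resp:
            return json.load(resp)
    except (OSError, ValueError):
        return None

stats = headroom_stats()
if stats is None:
    print("Proxy: DOWN (see troubleshooting)")
else:
    print("Proxy: UP")
    for key in ("requests", "compression_ratio", "tokens_saved"):
        print(f"  {key}: {stats.get(key, 'n/a')}")
```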
If the proxy is unreachable:

Check that the process is running:

```
# Windows
tasklist | findstr headroom

# Linux/macOS
ps aux | grep headroom
```

Check that the port is bound:

```
netstat -ano | findstr 8787
```

Verify `ANTHROPIC_BASE_URL` is set:

```
echo $ANTHROPIC_BASE_URL
```
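As a cross-platform alternative to the `tasklist`/`ps`/`netstat` checks, a short Python probe can test whether anything is listening on the proxy port:

```python
# Cross-platform check: is anything listening on the Headroom port?
import socket

def port_open(host: str = "127.0.0.1", port: int = 8787,
              timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

if not port_open():
    print("Nothing listening on :8787; start `headroom proxy` and retry")
```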
If it is not running, start `headroom proxy` in a separate terminal.

When triggered at session end or on request, report:
| Setting | Value | Notes |
|---|---|---|
| Proxy URL | http://127.0.0.1:8787 | Default port |
| Dashboard | AdminStack Infrastructure tab | Headroom monitoring panel |
| Repo | github.com/chopratejas/headroom | Apache 2.0 |
| Python | 3.14 compatible | Tested Feb 2026 |
| Symptom | Fix |
|---|---|
| 0% compression / 0.00x ratio | `headroom-ai[code]` is not installed. Run `pip install "headroom-ai[code]"`, then restart the proxy. |
| "Code-Aware: NOT INSTALLED" in startup banner | Same fix: install the `[code]` extra and restart. |
| Cost figures don't match Anthropic Console | Headroom estimates costs at list token prices without accounting for Anthropic's server-side prompt caching discounts. For actual costs, check console.anthropic.com. |
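The cost discrepancy in the last row can be sketched numerically. The prices below are placeholders, not current Anthropic rates: Headroom's naive estimate bills every saved token at list price, while a cache-aware estimate discounts tokens that would have been cheap cache reads anyway.

```python
# Placeholder prices, not current Anthropic rates. Headroom's estimate
# bills every saved input token at list price; with server-side prompt
# caching, many of those tokens would have been cheap cache reads anyway.
LIST_PRICE_PER_MTOK = 3.00     # assumed input list price, $ per million tokens
CACHE_READ_MULTIPLIER = 0.10   # assumed cache-read price as a fraction of list

def naive_savings(tokens_saved: int) -> float:
    """Headroom-style estimate: all saved tokens priced at list rate."""
    return tokens_saved / 1_000_000 * LIST_PRICE_PER_MTOK

def cache_aware_savings(tokens_saved: int, cached_fraction: float) -> float:
    """Discount the share of saved tokens that were cache reads anyway."""
    full = tokens_saved * (1 - cached_fraction)
    cached = tokens_saved * cached_fraction * CACHE_READ_MULTIPLIER
    return (full + cached) / 1_000_000 * LIST_PRICE_PER_MTOK

print(f"naive:       ${naive_savings(18_500):.4f}")
print(f"cache-aware: ${cache_aware_savings(18_500, 0.8):.4f}")
```

With most of a session's prompt already cached, the real saving is a fraction of the naive figure, which is why console.anthropic.com is the source of truth.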
```
⚙️ Headroom Status
├── Proxy: ✅ Running on :8787
├── Requests: 47 processed
├── Compression: 46.2% reduction
├── Tokens saved: ~18,500 tokens
└── Cost savings: ~$0.28 this session
```
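For illustration, the report above can be rendered from a stats dictionary; the field names are assumptions matching this example, not Headroom's actual schema:

```python
# Render the session report from a stats dict; field names are assumed
# to match this document's example, not Headroom's actual schema.
def render_report(stats: dict) -> str:
    proxy = "✅ Running on :8787" if stats["up"] else "❌ Down"
    return "\n".join([
        "⚙️ Headroom Status",
        f"├── Proxy: {proxy}",
        f"├── Requests: {stats['requests']} processed",
        f"├── Compression: {stats['reduction_pct']:.1f}% reduction",
        f"├── Tokens saved: ~{stats['tokens_saved']:,} tokens",
        f"└── Cost savings: ~${stats['cost_saved']:.2f} this session",
    ])

print(render_report({"up": True, "requests": 47, "reduction_pct": 46.2,
                     "tokens_saved": 18_500, "cost_saved": 0.28}))
```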
Changelog: added the `[code]` extra for tree-sitter AST compression, updated startup flags (`--llmlingua-device cpu`), added troubleshooting. Compression 0% → 46%. (Feb 24, 2026)