From memstack
Monitors Headroom proxy health, reports compression stats (30-40% token savings), and troubleshoots connection issues for Claude Code sessions.
Slash command: /memstack:compress
*Monitor and manage Headroom context compression for CC sessions.*
When this skill activates, output:
⚙️ Compress — Checking Headroom status...
Then execute the protocol below.
| Context | Status |
|---|---|
| User says "headroom", "compression stats", "check proxy" | ACTIVE — run status check |
| User asks about token savings or context window | ACTIVE — run session report |
| Proxy errors or API connection failures appear | ACTIVE — run health diagnostics |
| General discussion about CC features | DORMANT — do not activate |
| User is actively coding (no proxy issues) | DORMANT — do not activate |
Headroom is a transparent proxy between Claude Code and the Anthropic API that compresses tool outputs by removing redundant boilerplate. It extends the effective context window by 30–40%.
This skill checks proxy health, reports compression stats, and troubleshoots connection issues.
Install Headroom with the [code] extra:
pip install headroom-ai[code]
The [code] extra installs tree-sitter for AST-based code compression. Without it, Code-Aware compression is disabled and CC sessions get 0% compression.
Start the proxy with headroom proxy --llmlingua-device cpu (defaults to localhost:8787), and set ANTHROPIC_BASE_URL=http://127.0.0.1:8787 so that Claude Code routes through it.
headroom proxy --llmlingua-device cpu
--llmlingua-device cpu: forces LLMLingua to use the CPU (avoids silent CUDA failures on machines without a GPU).
Run:
curl -s http://127.0.0.1:8787/stats | python -m json.tool
Report: proxy up/down, requests processed, compression ratio, tokens saved, estimated cost savings.
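As a rough illustration, the status check could also be scripted instead of being read off the raw curl output. The sketch below only assumes the /stats URL shown above. The JSON key names in FIELDS are hypothetical placeholders, since the actual response schema is not documented here, so adjust them to whatever the proxy returns.

```python
import json
import urllib.error
import urllib.request

STATS_URL = "http://127.0.0.1:8787/stats"  # same endpoint as the curl command

# (report label, JSON key) pairs. The key names are HYPOTHETICAL:
# replace them with the keys the proxy actually returns.
FIELDS = [
    ("Requests", "requests"),
    ("Compression ratio", "compression_ratio"),
    ("Tokens saved", "tokens_saved"),
    ("Estimated cost savings", "cost_saved_usd"),
]


def fetch_stats(url=STATS_URL, timeout=3.0):
    """Return the parsed /stats JSON, or None if the proxy cannot be reached."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return json.load(resp)
    except (urllib.error.URLError, OSError, ValueError):
        # Connection refused, timeout, or a body that is not valid JSON.
        return None


def format_report(stats):
    """Turn a stats dict (or None) into the plain-text status report."""
    if stats is None:
        return "Proxy: DOWN (no response from /stats)"
    lines = ["Proxy: UP"]
    for label, key in FIELDS:
        # Missing keys are reported as n/a rather than guessed.
        lines.append(f"{label}: {stats.get(key, 'n/a')}")
    return "\n".join(lines)
```

Calling print(format_report(fetch_stats())) prints either the DOWN line or one line per field, with n/a for any key that is absent from the response.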
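If the proxy is unreachable, check whether the process is running:
# Windows
tasklist | findstr headroom
# Linux/macOS
ps aux | grep headroom
Check whether anything is listening on port 8787 (Windows syntax; on Linux/macOS, pipe netstat -an to grep instead):
netstat -ano | findstr 8787
Verify that ANTHROPIC_BASE_URL is set:
echo $ANTHROPIC_BASE_URL
If the proxy is not running, start headroom proxy in a separate terminal.
When triggered at session end or on request, report: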
| Setting | Value | Notes |
|---|---|---|
| Proxy URL | http://127.0.0.1:8787 | Default port |
| Dashboard | AdminStack Infrastructure tab | Headroom monitoring panel |
| Repo | github.com/chopratejas/headroom | Apache 2.0 |
| Python | 3.14 compatible | Tested Feb 2026 |
| Symptom | Fix |
|---|---|
| 0% compression / 0.00x ratio | headroom-ai[code] is not installed. Run: pip install headroom-ai[code]. Restart proxy. |
| "Code-Aware: NOT INSTALLED" in startup banner | Same fix — install the [code] extra and restart. |
| Cost figures don't match Anthropic Console | Headroom estimates costs at list token prices without accounting for Anthropic's server-side prompt caching discounts. For actual costs, check console.anthropic.com. |
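The unreachable-proxy checks listed earlier (is anything listening on port 8787, and is ANTHROPIC_BASE_URL set) can be sketched as one cross-platform script. This is a minimal illustration using only the Python standard library. The function names port_is_open and diagnose are hypothetical and are not part of Headroom.

```python
import os
import socket


def port_is_open(host="127.0.0.1", port=8787, timeout=1.0):
    """Return True if something accepts TCP connections on host:port."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Connection refused, timed out, or host unreachable.
        return False


def diagnose(env=None, host="127.0.0.1", port=8787):
    """Return a list of human-readable findings; an empty list means no problem was found."""
    env = os.environ if env is None else env
    findings = []
    if not port_is_open(host, port):
        findings.append(
            f"Nothing is accepting connections on {host}:{port}. "
            "Start headroom proxy in a separate terminal."
        )
    if not env.get("ANTHROPIC_BASE_URL"):
        findings.append(
            f"ANTHROPIC_BASE_URL is not set. Expected something like http://{host}:{port}."
        )
    return findings


if __name__ == "__main__":
    for finding in diagnose() or ["No problems found."]:
        print(finding)
```

The sketch tests only whether the port accepts connections, not whether the listener is actually Headroom, so a successful result still warrants the /stats check.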
⚙️ Headroom Status
├── Proxy: ✅ Running on :8787
├── Requests: 47 processed
├── Compression: 46.2% reduction
├── Tokens saved: ~18,500 tokens
└── Cost savings: ~$0.28 this session
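To make the cost line concrete, a list-price estimate is just tokens saved multiplied by a per-token price. The sketch below is not Headroom's code, and the $15 per million tokens rate is an assumption chosen for illustration. It happens to reproduce the sample figure above, but the rate Headroom actually applies is not stated here.

```python
def estimated_savings_usd(tokens_saved, usd_per_million_tokens):
    """List-price estimate: ignores prompt caching discounts entirely."""
    return tokens_saved * usd_per_million_tokens / 1_000_000


# ASSUMPTION: 15.0 USD per million tokens is an illustrative list price.
ASSUMED_PRICE = 15.0

print(f"${estimated_savings_usd(18_500, ASSUMED_PRICE):.2f}")  # prints $0.28
```

Because the estimate ignores server-side prompt caching discounts, it will tend to overstate savings relative to the figures in the Anthropic Console.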
Check that ANTHROPIC_BASE_URL is set.
Changelog: added the [code] extra for tree-sitter AST compression, updated startup flags (--llmlingua-device cpu), added troubleshooting. Compression 0% → 46%. (Feb 24, 2026)
Install the plugin:
npx claudepluginhub cwinvestments/memstack --plugin memstack