npx claudepluginhub rico-0-3/claude-context-compressor --plugin context-compressor

Reduce Claude Code input token usage by 10-25% offline (zero cost) or 60-75% with API (optional). Offline by default: no API key needed. Works on Windows, Mac, Linux.
Companion to caveman (output compression). This compresses input: session memory, CLAUDE.md, auto-memory, tool outputs.
claude plugin install github:rico03/claude-context-compressor
That's it. No configuration needed. Hooks activate automatically on next session.
Requirements: Python 3.9 or newer (any such version works; the interpreter is auto-detected on all platforms)
Every Claude Code session re-injects your full CLAUDE.md and auto-memory. After 10+ sessions that's thousands of redundant tokens on every message. This plugin intercepts and compresses them.
| Hook | Event | What it compresses |
|---|---|---|
| SessionStart | Session open | CLAUDE.md + auto-memory → compressed |
| UserPromptSubmit | Each message | facts store → only relevant facts injected |
| PostToolUse | After Bash/Read/Grep | large outputs → truncated |
| Stop | Session end | extracts key facts → merges into facts store |
| PreCompact | Before /compact | backs up critical facts |
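
The merge that the Stop hook performs on the facts store can be pictured roughly as follows. This is a minimal sketch, not the plugin's actual code: the dict-of-categories layout and the `merge_facts` name are assumptions made for illustration. Only the category names and the override/accumulate split come from the merge logic described in this README.

```python
from typing import Any

# Category names come from the README; the store layout is an assumption.
OVERRIDE_KEYS = ("current_state", "preferences", "bugs")
ACCUMULATE_KEYS = ("decisions", "architecture")


def merge_facts(store: dict[str, Any], new: dict[str, Any]) -> dict[str, Any]:
    """Return a new store with `new` merged in, leaving `store` untouched."""
    merged = dict(store)
    for key in OVERRIDE_KEYS:
        if key in new:
            merged[key] = new[key]  # latest value replaces the old one
    for key in ACCUMULATE_KEYS:
        existing = list(merged.get(key, []))
        for item in new.get(key, []):
            if item not in existing:  # skip exact duplicates, keep order
                existing.append(item)
        merged[key] = existing
    return merged
```

Called with an old store and the facts extracted from the session that just ended, it overwrites state-like entries and only ever appends to `decisions` and `architecture`.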
Facts store merge logic:

- current_state, preferences, bugs → latest value overrides old
- decisions, architecture → accumulate forever (never lost)

Optional. Create .claude/compress.config.json in your project to override defaults:
{
  "level": "input-only"
}
| Level | Input | Output | Notes |
|---|---|---|---|
| off | ❌ | ❌ | Disable everything |
| input-only | ✅ aggressive | ❌ | Default |
| balanced | ✅ light | caveman lite | Pair with caveman |
| max | ✅ aggressive | caveman ultra | Maximum savings |
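
Resolving the effective configuration could look something like the sketch below. The names `load_config`, `DEFAULTS`, and `VALID_LEVELS` are hypothetical. The only defaults taken from this README are `input-only` as the default level and offline operation (no API); how the plugin itself reads the file is not documented here.

```python
import json
from pathlib import Path

# The four levels listed in the table above.
VALID_LEVELS = {"off", "input-only", "balanced", "max"}
# Assumed defaults: "input-only" is marked Default, and the plugin is offline by default.
DEFAULTS = {"level": "input-only", "use_api": False}


def load_config(project_dir: str = ".") -> dict:
    """Overlay .claude/compress.config.json (if present) on top of DEFAULTS."""
    config = dict(DEFAULTS)
    path = Path(project_dir) / ".claude" / "compress.config.json"
    if path.is_file():
        config.update(json.loads(path.read_text(encoding="utf-8")))
    if config["level"] not in VALID_LEVELS:
        raise ValueError(f"unknown level: {config['level']!r}")
    return config
```

With no config file the function returns the defaults unchanged, which matches the "no configuration needed" behavior described earlier.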
By default the plugin runs fully offline using heuristic compression (no API key required, zero cost). To unlock higher compression via Claude API:
{
  "use_api": true,
  "model": "claude-haiku-4-5-20251001"
}
API mode adds: semantic memory compression, smart fact selection per prompt, intelligent tool output summarization, and session fact extraction.
{
  "input": {
    "enabled": true,
    "memory_compression": "aggressive",
    "tool_output_limit": 200,
    "min_tokens_to_compress": 100
  }
}
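
To make the two numeric settings concrete, here is one way `tool_output_limit` and `min_tokens_to_compress` might gate truncation of a large tool output. This README does not state the unit of `tool_output_limit`; the sketch assumes lines, and it estimates tokens as characters divided by four. Both are assumptions, as are the head-and-tail strategy and the function names.

```python
def estimate_tokens(text: str) -> int:
    """Very rough token estimate (about 4 characters per token); an assumption."""
    return len(text) // 4


def truncate_output(text: str, tool_output_limit: int = 200,
                    min_tokens_to_compress: int = 100) -> str:
    """Keep the head and tail of a long output and note how many lines were dropped."""
    lines = text.splitlines()
    # Small outputs pass through untouched.
    if estimate_tokens(text) < min_tokens_to_compress or len(lines) <= tool_output_limit:
        return text
    head = tool_output_limit // 2
    tail = tool_output_limit - head
    omitted = len(lines) - tool_output_limit
    marker = f"... [{omitted} lines omitted] ..."
    return "\n".join(lines[:head] + [marker] + lines[len(lines) - tail:])
```

Keeping both ends is a design choice for this sketch: command output tends to carry context at the top and errors or summaries at the bottom.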
After install, use /compress in Claude Code:
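- `/compress status`: current config + tokens saved this session
- `/compress level max`: switch preset
- `/compress log`: last 10 compression events
- `/compress facts`: view current facts store

Or print the last five log entries and the total saved directly from .claude/token_log.jsonl:

python -c "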
import json
from pathlib import Path
log = Path('.claude/token_log.jsonl')
lines = [json.loads(l) for l in log.read_text().splitlines() if l]
total = sum(l['saved'] for l in lines)
for l in lines[-5:]:
print(f' [{l[\"hook\"]:25}] {l[\"before\"]:>5}→{l[\"after\"]:>5} ({l[\"reduction_pct\"]:>5.1f}%)')
print(f'\n Total saved: {total:,} tokens')
"
claude plugin install github:JuliusBrussee/caveman
claude plugin install github:rico03/claude-context-compressor
Set "level": "max" in your config. Together: ~70% total token reduction.
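
To check what the setup saves in your own sessions, the token log can also be grouped per hook. The sketch below is a script-file variant of the one-liner shown earlier, which avoids shell quoting differences between platforms. It assumes the same `.claude/token_log.jsonl` fields used there (`hook` and `saved`); the `summarize` name is hypothetical.

```python
import json
from collections import defaultdict
from pathlib import Path


def summarize(log_path: str = ".claude/token_log.jsonl") -> dict[str, int]:
    """Return total tokens saved per hook, or an empty dict if no log exists."""
    path = Path(log_path)
    if not path.is_file():
        return {}
    totals: dict[str, int] = defaultdict(int)
    for raw in path.read_text(encoding="utf-8").splitlines():
        if raw.strip():  # ignore blank lines
            event = json.loads(raw)
            totals[event["hook"]] += event["saved"]
    return dict(totals)


if __name__ == "__main__":
    # Largest savings first.
    for hook, saved in sorted(summarize().items(), key=lambda kv: -kv[1]):
        print(f"{hook:25} {saved:>8,} tokens saved")
```

Saved as a file in the project root, it can be run with `python` from any shell.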
MIT