Optimizes long contexts by extracting key info with H2O heavy-hitters, compressing documents, summarizing files, and pruning stale tool results. Use for context overflow, large docs, or multi-file synthesis.
`npx claudepluginhub insightflo/claude-impl-tools --plugin claude-impl-tools`

This skill uses the workspace's default tool permissions.
Analyzes context usage and recommends compression strategies for bloated or quota-heavy Claude Code sessions, including /clear + /catchup, archiving, and spawning subagents.
Compresses conversation history in long-running agent sessions using anchored summarization, opaque compression, or regenerative summaries to optimize tokens-per-task and prevent information loss.
When to use:
- When you need to analyze a long document or codebase
- When the context window is running low
- When you need to synthesize multiple files
- When cleaning up documents before starting a project implementation
- When accumulated search/tool results are polluting the context (v2.0 self-editing)
# Extract key information (Heavy-Hitter)
/compress optimize <file>
# Compress a document
/compress <file>
# LLM-based summarization (requires Claude CLI)
/compress <file> --llm
Situation: Planning docs and specs are too long to read in one pass
Solution: /compress optimize docs/spec.md --heavy-count=20
Result: Extract only the top 20 key items for quick understanding
Situation: "Context window exceeded" or degraded response quality
Solution: /compress <large-file> --summary-ratio=0.3
Result: 70% compression to free up context headroom
Situation: Need to reference 10+ files at once
Solution: /compress build "summarize" docs/*.md
Result: RAG hybrid extracts only relevant content
| Command | Description | Example |
|---|---|---|
| optimize <file> | Heavy-Hitter extraction | /compress optimize spec.md |
| compress <file> | Compress (preserve start/end) | /compress README.md |
| build <query> <files> | RAG hybrid | /compress build "API list" src/*.ts |
| Option | Description | Default |
|---|---|---|
| --heavy-count=N | Number of key items to extract | 10 |
| --summary-ratio=N | Compression ratio (0.1–0.9) | 0.3 |
| --llm | Use LLM-based summarization | false |
| --json | Output in JSON format | false |
Places critical information at the top to mitigate the "Lost in the Middle" phenomenon:
| Type | Priority | Example |
|---|---|---|
| h1 header | 1 | # Title |
| Class definition | 1 | class Foo |
| h2 header | 2 | ## Section |
| Function definition | 2 | function bar() |
| Table header | 2 | \| col1 \| col2 \| |
| Code block | 3 | ```javascript |
| List | 4 | - item |
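The priority table above can be sketched as a scoring function. This is a hypothetical illustration of Heavy-Hitter extraction, not the actual `contextOptimizer.js` code: each line is matched against structural patterns, a keyword bonus lowers the score of CRITICAL/IMPORTANT lines (lower score = higher priority), and the top-N lines are returned.

```javascript
// Hypothetical sketch of Heavy-Hitter priority scoring; the real
// contextOptimizer.js may differ. First matching rule wins.
const PRIORITY_RULES = [
  { re: /^# /, priority: 1 },        // h1 header
  { re: /^class /, priority: 1 },    // class definition
  { re: /^## /, priority: 2 },       // h2 header
  { re: /^function /, priority: 2 }, // function definition
  { re: /^\|/, priority: 2 },        // table row
  { re: /^```/, priority: 3 },       // code fence
  { re: /^[-*] /, priority: 4 },     // list item
];

function extractHeavyHitters(text, count = 10) {
  const scored = text.split("\n").flatMap((line, i) => {
    const rule = PRIORITY_RULES.find((r) => r.re.test(line));
    if (!rule) return []; // plain prose is not a heavy hitter
    // Keyword bonus: emphasized lines get a 0.5x multiplier.
    const bonus = /CRITICAL|IMPORTANT/.test(line) ? 0.5 : 1;
    return [{ line, order: i, score: rule.priority * bonus }];
  });
  // Sort by score, then by original position; keep the top N.
  scored.sort((a, b) => a.score - b.score || a.order - b.order);
  return scored.slice(0, count).map((s) => s.line);
}
```

With `--heavy-count=20`, a call like `extractHeavyHitters(doc, 20)` would surface headers and definitions first, mitigating "Lost in the Middle" by construction.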
Bonus system: lines containing emphasis keywords (CRITICAL, IMPORTANT, 🔥) receive a 0.5x score multiplier.

Older or less important content is summarized; recent and critical content is preserved as-is:
[First 5 lines → preserved as-is]
... (compressed) ...
[Middle sampling]
... (compressed) ...
[Last 5 lines → preserved as-is]
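The shape above can be sketched as follows. This is a minimal illustration of start/end-preserving compression under assumed behavior (even sampling of the middle); the real `contextOptimizer.js` may summarize the middle differently.

```javascript
// Sketch: keep the first and last `edge` lines verbatim and evenly
// sample the middle down to roughly `summaryRatio` of its length.
function compressPreserveEnds(text, summaryRatio = 0.3, edge = 5) {
  const lines = text.split("\n");
  if (lines.length <= edge * 2) return text; // too short to compress
  const head = lines.slice(0, edge);
  const tail = lines.slice(-edge);
  const middle = lines.slice(edge, -edge);
  const keep = Math.max(1, Math.floor(middle.length * summaryRatio));
  const step = middle.length / keep;
  const sampled = Array.from({ length: keep }, (_, i) => middle[Math.floor(i * step)]);
  return [
    ...head,
    "... (compressed) ...",
    ...sampled,
    "... (compressed) ...",
    ...tail,
  ].join("\n");
}
```

With `--summary-ratio=0.3`, a 1,000-line file keeps its first and last 5 lines intact while the middle shrinks to about 30%.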
LLM mode (--llm): semantic summarization using the Claude CLI.
This skill internally calls contextOptimizer.js:
node project-team/services/contextOptimizer.js <command> <file> [options]
# Heavy-Hitter extraction
node project-team/services/contextOptimizer.js optimize docs/spec.md --heavy-count=15 --json
# Compress
node project-team/services/contextOptimizer.js compress large-file.md --summary-ratio=0.2
# LLM-based (run in a separate terminal)
node project-team/services/contextOptimizer.js compress large-file.md --llm
# RAG hybrid
node project-team/services/contextOptimizer.js build "API endpoints" src/*.ts
Inspired by Chroma Context-1: "Model selectively removes irrelevant documents during retrieval to free context capacity." Context-1 achieves 94.1% prune accuracy with this pattern.
During long sessions, tool results accumulate in the conversation:
This is context pollution: the same problem Context-1 solves for retrieval.
/prune: Self-Editing Command

When context feels polluted, run /prune (or /compress prune). The process:
1. SCAN: Review all tool results and file reads in the current conversation
2. CLASSIFY each result:
- KEEP: Still relevant to the current task
- STALE: Was relevant but task has moved on
- NOISE: Was never relevant (wrong search, failed attempt)
3. SUMMARIZE stale/noise items into one-line summaries
4. OUTPUT: A compact context summary replacing the bloated results
| Signal | Classification | Action |
|---|---|---|
| File read → file was later modified | STALE | Summarize: "read X, since modified" |
| Search result → query was refined | STALE | Summarize: "searched X, refined to Y" |
| Error output → error was fixed | NOISE | Summarize: "fixed error in X" |
| Debugging trace → bug resolved | NOISE | Drop entirely |
| Tool result still referenced in current task | KEEP | Preserve as-is |
| Recent result (last 3 turns) | KEEP | Preserve as-is |
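The classification table above can be sketched as a decision function. This is a hypothetical illustration, not the plugin's actual code; the `result` metadata fields (`turn`, `stillReferenced`, `fileModifiedSince`, etc.) are assumed names for what happened since the tool ran.

```javascript
// Hypothetical KEEP/STALE/NOISE classifier mirroring the signal table.
// Rules are checked in order; when unsure, preserve the result.
function classifyToolResult(result, currentTurn) {
  if (currentTurn - result.turn <= 3) return "KEEP"; // recent (last 3 turns)
  if (result.stillReferenced) return "KEEP"; // still used by the task
  if (result.type === "file_read" && result.fileModifiedSince) return "STALE";
  if (result.type === "search" && result.queryRefined) return "STALE";
  if (result.type === "error" && result.errorFixed) return "NOISE";
  if (result.type === "debug_trace" && result.bugResolved) return "NOISE";
  return "KEEP"; // default: never drop content we cannot classify
}
```

Defaulting to KEEP matters: pruning is lossy, so ambiguous results should survive rather than risk discarding something the task still needs.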
Proactively suggest /prune when:
/prune is lighter than /compact:
- /prune: removes noise from tool results within the active session
- /compact: compresses the entire conversation for a fresh start

Use /prune first. If still overloaded, then /compact.
Related files:
- project-team/services/README.md
- project-team/services/mcp-context-server.js