# Audit Context Window Usage

Audits context window usage and identifies token optimization opportunities based on 2026 context engineering best practices.

Install:

/plugin marketplace add standardbeagle/standardbeagle-tools
/plugin install prompt-engineer@standardbeagle-tools
Identify what's consuming context:
Question: "What context sources should I audit?"
For each context source, estimate tokens:
## Token Usage Report
| Source | Estimated Tokens | % of Budget | Priority |
|--------|------------------|-------------|----------|
| System Prompt | ~X | Y% | High |
| Tool Definitions | ~X | Y% | Medium |
| Retrieved Docs | ~X | Y% | High |
| Conversation | ~X | Y% | Medium |
| Examples | ~X | Y% | Low |
| **Total** | **~X** | **Y%** | |
Token Budget: [Model's context window]
Available for Response: [Remaining tokens]
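A minimal sketch of the token-estimation step, assuming the tiktoken library with its cl100k_base encoding as a stand-in tokenizer and a 200K-token budget; both are placeholders, and exact counts depend on the target model's own tokenizer.

```python
# Minimal sketch of per-source token estimation. Assumptions: tiktoken's
# cl100k_base encoding as a stand-in tokenizer, and a 200K-token budget.
import tiktoken

CONTEXT_BUDGET = 200_000  # assumed context window; set to your model's limit
ENC = tiktoken.get_encoding("cl100k_base")

def estimate_tokens(text: str) -> int:
    """Rough token count for one context source."""
    return len(ENC.encode(text))

def usage_report(sources: dict[str, str]) -> None:
    """Print a rough version of the Token Usage Report table."""
    total = 0
    for name, text in sources.items():
        tokens = estimate_tokens(text)
        total += tokens
        print(f"{name:<18} ~{tokens:>8,} tokens  {tokens / CONTEXT_BUDGET:6.2%}")
    print(f"{'Total':<18} ~{total:>8,} tokens  {total / CONTEXT_BUDGET:6.2%}")
    print(f"Available for response: ~{CONTEXT_BUDGET - total:,} tokens")

usage_report({
    "System Prompt": "You are a careful coding assistant...",
    "Tool Definitions": '{"name": "search_docs", "description": "..."}',
    "Retrieved Docs": "Chunk 1: ...\nChunk 2: ...",
})
```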
For each source, evaluate its signal-to-noise ratio (SNR):
## Context Quality Analysis
### System Prompt
- **High-signal content**: [What directly contributes to task success]
- **Low-signal content**: [What could be removed without impact]
- **Redundancy**: [Repeated or overlapping information]
- **SNR Score**: X/10
### Retrieved Documents
- **Relevance**: How well do chunks match the query?
- **Freshness**: Is information current?
- **Overlap**: Do chunks repeat information?
- **SNR Score**: X/10
### Tool Definitions
- **Clarity**: Are descriptions unambiguous?
- **Overlap**: Do tools have redundant capabilities?
- **Completeness**: Are all parameters documented?
- **SNR Score**: X/10
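One illustrative way to turn the checklist into a numeric SNR score, assuming you can estimate how many of a source's tokens are high-signal; the 0-10 scale is arbitrary and simply matches the report format above.

```python
# Illustrative SNR heuristic: fraction of high-signal tokens, scaled to 0-10.
# The split into "high-signal" vs. total tokens is a manual judgment call.
def snr_score(high_signal_tokens: int, total_tokens: int) -> float:
    if total_tokens == 0:
        return 0.0
    return round(10 * high_signal_tokens / total_tokens, 1)

print(snr_score(high_signal_tokens=1_200, total_tokens=2_000))  # 6.0
```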
Check for "context rot" - degradation of recall with increased context:
## Context Rot Analysis
### Position Effects
- **Primacy bias**: Is critical information at the start?
- **Recency bias**: Is important context near the end?
- **Lost in the middle**: Is key info buried in large blocks?
### Recommendations
1. Move critical instructions to [position]
2. Split large document into [sections]
3. Add summary headers for [content type]
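A sketch of applying these positioning recommendations when assembling a prompt: critical instructions first, bulky reference material in the middle under summary headers, and the active task restated at the end. The function and argument names are illustrative.

```python
# Position-aware prompt assembly to counter "lost in the middle": critical
# instructions first, summarized reference blocks in the middle, task last.
def assemble_prompt(critical: str, reference_docs: list[tuple[str, str]], task: str) -> str:
    middle = "\n\n".join(f"## {title}\n{body}" for title, body in reference_docs)
    return f"{critical}\n\n{middle}\n\n{task}"

prompt = assemble_prompt(
    critical="Follow the output schema exactly. Never invent API fields.",
    reference_docs=[("Billing API (summary)", "POST /invoices creates an invoice...")],
    task="Task: draft the migration plan using only the APIs summarized above.",
)
```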
Recommend compression strategies:
## Compression Recommendations
### 1. Summarization
**Before (500 tokens)**:
[Long passage]
**After (100 tokens)**:
[Summarized version preserving key facts]
**Savings**: 400 tokens (80%)
### 2. Reference Deduplication
**Issue**: Same information repeated in X places
**Solution**: Reference once, link elsewhere
**Savings**: ~Y tokens
### 3. Just-in-Time Loading
**Issue**: Static context includes rarely-used information
**Solution**: Load dynamically when needed using tools
**Savings**: ~Z tokens on average
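A sketch of just-in-time loading, assuming documents live on disk and are exposed to the model through a tool; `load_document` and the index paths are hypothetical.

```python
# Just-in-time loading sketch: the static context carries only lightweight
# identifiers; full documents are fetched by a tool call when needed.
DOC_INDEX = {  # hypothetical document ids and paths
    "billing-api": "docs/billing_api.md",
    "auth-flow": "docs/auth_flow.md",
}

def load_document(doc_id: str) -> str:
    """Tool handler: return the full text of one document on demand."""
    with open(DOC_INDEX[doc_id], encoding="utf-8") as f:
        return f.read()

# Static context lists identifiers only (a few tokens each), not the documents.
static_context = "Loadable documents: " + ", ".join(DOC_INDEX)
```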
### 4. Structured Compression
**Before**:
"The user's name is John. John lives in New York. John works as an engineer."
**After**:
"User: John | Location: New York | Role: Engineer"
**Savings**: ~X tokens
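A small sketch of the structured-compression pattern, folding repeated prose facts into one compact delimited line as in the before/after example above.

```python
# Structured compression: collapse repeated prose facts into a compact
# key-value line, mirroring the before/after example above.
def compress_facts(facts: dict[str, str]) -> str:
    return " | ".join(f"{key}: {value}" for key, value in facts.items())

print(compress_facts({"User": "John", "Location": "New York", "Role": "Engineer"}))
# -> User: John | Location: New York | Role: Engineer
```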
Based on the use case, recommend a memory strategy:
## Memory Architecture
### Recommended Pattern: [Pattern Name]
**Short-term (in-context)**:
- Current task context
- Recent conversation (last N turns)
- Active tool results
**Long-term (external)**:
- User preferences
- Historical summaries
- Reference documentation
**Implementation**:
1. Use [storage mechanism] for long-term
2. Retrieve with [retrieval strategy]
3. Compress with [compression technique]
4. Refresh every [interval]
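A minimal sketch of the short-term / long-term split, assuming an in-memory dict as the long-term store and a placeholder `summarize` helper; in practice the store would be a database or vector index and summarization a call to a small model.

```python
# Short-term vs. long-term memory sketch. Assumptions: an in-memory dict as
# the long-term store and a placeholder summarize() helper.
from collections import deque

def summarize(text: str) -> str:
    """Placeholder compression step; in practice, call a small model."""
    return text[:100]

class Memory:
    def __init__(self, max_turns: int = 10):
        self.short_term = deque(maxlen=max_turns)  # recent turns kept in context
        self.long_term: dict[str, str] = {}        # preferences, summaries, references

    def add_turn(self, role: str, text: str) -> None:
        if len(self.short_term) == self.short_term.maxlen:
            # The oldest turn is about to fall out of context: archive a summary.
            _, old_text = self.short_term[0]
            self.long_term[f"summary:{len(self.long_term)}"] = summarize(old_text)
        self.short_term.append((role, text))

    def context(self) -> str:
        """Render only the short-term window for the next request."""
        return "\n".join(f"{role}: {text}" for role, text in self.short_term)

mem = Memory(max_turns=3)
mem.add_turn("user", "My name is John and I live in New York.")
```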
## Context Optimization Plan
### Quick Wins (Immediate)
1. [Action]: Save ~X tokens
2. [Action]: Save ~Y tokens
### Medium-term Improvements
1. [Structural change]: Save ~X tokens, improve Y
2. [Architecture change]: Enable Z capability
### Long-term Refactoring
1. [Major change]: Estimated impact
### Projected Results
- **Current usage**: X tokens
- **After quick wins**: Y tokens (-Z%)
- **After full optimization**: W tokens (-V%)
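A trivial helper for filling in the projected-results percentages, included only to make the arithmetic explicit; the token counts shown are made up.

```python
# Percentage saved relative to current usage, for the projected-results rows.
def pct_saved(current_tokens: int, after_tokens: int) -> float:
    return round(100 * (current_tokens - after_tokens) / current_tokens, 1)

print(pct_saved(150_000, 95_000))  # 36.7
```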
## Context Monitoring
### Metrics to Track
- Average context size per request
- Context utilization vs. budget
- Retrieval relevance scores
- Response quality vs. context size
### Warning Signs
- Context consistently >80% of budget
- Retrieval precision dropping
- Response quality declining with larger contexts
- Frequent context limit errors
### Tools
- Token counter integration
- Context size logging
- Quality correlation analysis
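A sketch of the logging side of this monitoring, assuming a fixed budget constant; it records utilization per request and flags the >80%-of-budget warning sign listed above.

```python
# Context-size monitor sketch: log utilization per request and warn when
# usage crosses 80% of the (assumed) budget.
import logging

logging.basicConfig(level=logging.INFO)
BUDGET = 200_000  # assumed context window

def log_context_usage(request_id: str, context_tokens: int) -> None:
    utilization = context_tokens / BUDGET
    logging.info("request=%s context_tokens=%d utilization=%.1f%%",
                 request_id, context_tokens, 100 * utilization)
    if utilization > 0.8:
        logging.warning("request=%s context above 80%% of budget", request_id)

log_context_usage("req-42", 172_000)
```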
## Key Principles

- **Balance specificity with flexibility**: "Find the smallest possible set of high-signal tokens that maximize the likelihood of the desired outcome."
- **Maintain lightweight identifiers**: keep compact references in context and load the underlying data dynamically when it is needed.
- **Structure information by access frequency**: keep frequently used content in context and retrieve rarely used content on demand.
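A sketch of the last two principles combined: tier content by access frequency and keep only lightweight identifiers in the static context. The tier names and contents are illustrative.

```python
# Access-frequency tiering sketch: "hot" content always in context, "warm"
# content listed by identifier and loaded via tools, "cold" content external.
TIERS = {  # illustrative tier contents
    "hot": ["core instructions", "output schema"],
    "warm": ["API reference", "style guide"],
    "cold": ["full changelog", "archived conversations"],
}

def build_static_context() -> str:
    hot = "\n".join(TIERS["hot"])
    warm_index = ", ".join(TIERS["warm"])  # identifiers only, not full text
    return f"{hot}\n\nLoadable references: {warm_index}"

print(build_static_context())
```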