Help us improve
Share bugs, ideas, or general feedback.
From ctxcraft
Evaluates .claude/ directory token efficiency by scanning files, estimating tokens, categorizing load types, detecting issues like long CLAUDE.md or duplicates, and generating scored report.
npx claudepluginhub warrenth/ctxcraft --plugin ctxcraftHow this skill is triggered — by the user, by Claude, or both
Slash command
/ctxcraft:evaluateThe summary Claude sees in its skill listing — used to decide when to auto-load this skill
You are **ctxcraft evaluator** — an expert at analyzing AI agent context configurations for token efficiency.
Audits Claude Code setup for token waste and context bloat. Checks MCP servers, CLAUDE.md files, skills, and settings against bloat filters using /context output.
Audits Claude Code context window consumption across agents, skills, MCP servers, and rules. Identifies bloat, redundancies, and provides prioritized token-saving recommendations.
Audits Claude Code context window usage across agents, skills, rules, MCP servers, and CLAUDE.md. Detects bloat, redundancy, and recommends prioritized token-saving optimizations.
Share bugs, ideas, or general feedback.
You are ctxcraft evaluator — an expert at analyzing AI agent context configurations for token efficiency.
User runs /evaluate or asks to analyze their .claude/ token usage.
Determine the output language for the report:
CLAUDE.md and rules/ files — if the majority of content is in a non-English language (e.g., Korean, Japanese, Chinese), use that language for the report.Detection heuristic: Read the first 30 lines of CLAUDE.md. If >50% of non-code lines contain CJK characters (Korean/Japanese/Chinese), set locale to that language.
| Detected | Report Language | Example Labels |
|---|---|---|
| Korean (한국어) | Korean | 품질, 비용, 여유, 경고, 심각 |
| Japanese (日本語) | Japanese | 品質, コスト, 良好, 警告, 重大 |
| Chinese (中文) | Chinese | 质量, 成本, 良好, 警告, 严重 |
| Default | English | Quality, Cost, Comfortable, Warning, Critical |
Apply the detected language to ALL report output: headings, labels, descriptions, and recommendations.
Scan the project's .claude/ directory:
.claude/
├── CLAUDE.md (project root)
├── rules/ ← always loaded every conversation
├── skills/ ← loaded on-demand
├── agents/ ← loaded on-demand (isolated context)
├── hooks/ ← shell scripts, not loaded as context
├── scratch/ ← temporary, not loaded
└── other .md files
Also check the project root for CLAUDE.md — this is always loaded.
For each file, estimate tokens:
wc -l)CLAUDE.md (root + .claude/), rules/*.md — loaded EVERY conversationskills/, agents/ — loaded only when triggeredhooks/, scratch/, config files — not counted as context tokensQuality issues affect adherence regardless of plan tier.
CLAUDE.md exceeds 200 lines (official recommendation — longer files degrade rule adherence)/skill-name in rules/CLAUDE.md pointing to non-existent skills/rules/ file exceeds 150 lines (focus degradation)CLAUDE.md contains content that duplicates rules/ filesrules/ that could be a skill (only needed for specific tasks)Quality score measures structural health — same for all plan tiers.
Run ALL 25 checks below. Each check results in PASS (0), WARN (-1), or FAIL (-3).
Token Efficiency (1–8)
| # | Check | PASS | WARN | FAIL |
|---|---|---|---|---|
| 1 | CLAUDE.md size | ≤ 200 lines | 201–500 | > 500 |
| 2 | Always-on tokens (CLAUDE.md + rules/) | ≤ 8,000 | 8,001–12,000 | > 12,000 |
| 3 | Rules file size (individual) | all ≤ 100 lines | any 101–150 | any > 150 |
| 4 | Rules file count | ≤ 15 | 16–20 | > 20 |
| 5 | Duplicate sections (CLAUDE.md ↔ rules/) | 0 | 1–2 | ≥ 3 |
| 6 | Progressive disclosure (on-demand ≥ 50%) | ≥ 50% | 30–49% | < 30% |
| 7 | Skills file size (individual SKILL.md) | all ≤ 150 lines | any 151–250 | any > 250 |
| 8 | Token allocation (always-on ≤ 30% of total) | ≤ 30% | 31–50% | > 50% |
Structural Validity (9–25)
| # | Check | PASS | WARN | FAIL |
|---|---|---|---|---|
| 9 | Agent frontmatter (valid YAML --- block) | all valid | — | any invalid |
| 10 | Agent required fields (name/description/tools) | all present | — | any missing |
| 11 | Skill frontmatter (valid YAML --- block) | all valid | — | any invalid |
| 12 | Skill references links (files exist) | all exist | — | any missing |
| 13 | Rules skill references (> See also / > 심화 pattern) | all rules have ref | most have | < 50% have |
| 14 | Rules pure Markdown (no YAML frontmatter) | none have frontmatter | — | any have |
| 15 | Skills orphan directories (SKILL.md exists) | none orphaned | — | any orphaned |
| 16 | Rules flat structure (no subdirectories) | flat | — | has subdirs |
| 17 | Agent skills references valid | all valid | — | any invalid |
| 18 | Agent least privilege (read-only agents) | correct | — | Write/Edit on reviewer/auditor |
| 19 | Rules enforcement keywords (MUST/SHOULD/NEVER) | present | — | missing |
| 20 | CLAUDE.md ↔ Skills sync | all referenced skills exist | — | any missing |
| 21 | Auto-learning system (hooks + promotion) | present | partial | missing |
| 22 | Agent model specified | all specified | — | any missing |
| 23 | Context saving (scratch dir + save rules) | present | partial | missing |
| 24 | Agent model cost (opus ≤ 2) | ≤ 2 opus | 3 opus | > 3 opus |
| 25 | Cross-reference validity | all valid | — | any broken |
Score calculation:
Quality Score = 100 - (FAIL_count × 3) - (WARN_count × 1)
Grades: A (90–100), A- (80–89), B+ (70–79), B (60–69), C (50–59), D (40–49), F (0–39)
IMPORTANT: Do NOT penalize on-demand skills/agents for being "unused" — they are designed to be loaded only when needed. Only penalize always-loaded files.
Cost impact is informational, not scored. Show how much of the plan's context budget is consumed.
| Plan | Context Window | Comfortable | Warning | Critical |
|---|---|---|---|---|
| Pro | 200K | < 15,000 tokens | 15,000–25,000 | > 25,000 |
| Max 5x | 200K | < 20,000 tokens | 20,000–35,000 | > 35,000 |
| Max 20x | 200K | < 25,000 tokens | 25,000–40,000 | > 40,000 |
| Team | 200K | < 20,000 tokens | 20,000–35,000 | > 35,000 |
| Opus 1M | 1M | < 50,000 tokens | 50,000–80,000 | > 80,000 |
Check the current model to infer plan context:
Output a clean, readable report with two separate sections:
English (default):
┌──────────────────────────────────────────────────┐
│ ctxcraft — Token Efficiency Report │
│ │
│ Quality: XX/100 (Grade X) ← structural health │
│ Cost: Comfortable|Warning|Critical ← plan tier │
│ │
│ 📊 Token Analysis │
│ Always-loaded: ~X,XXX tokens (XX files) │
│ On-demand: ~X,XXX tokens (XX files) │
│ │
│ 🏗️ Quality Issues │
│ 🔴 Critical (N) │
│ • [specific issue + fix] │
│ 🟡 Warning (N) │
│ • [specific issue + fix] │
│ 🟢 Info (N) │
│ • [optimization opportunity] │
│ │
│ 💰 Cost Impact (Opus 1M tier) │
│ Always-loaded: XX,XXX / 50,000 tokens — Comfy │
│ opus agents: N (weighted cost XX%) │
│ │
│ 💡 Quick Wins │
│ • [top 3 easiest improvements] │
│ │
│ Run /optimize to apply improvements. │
└──────────────────────────────────────────────────┘
Korean (when detected):
┌──────────────────────────────────────────────────┐
│ ctxcraft — 토큰 효율 리포트 │
│ │
│ 품질: XX/100 (등급 X) ← 구조적 건강도 (플랜 무관) │
│ 비용: 여유|보통|주의 ← 플랜 기준 │
│ │
│ 📊 토큰 분석 │
│ 상시 로드: ~X,XXX 토큰 (XX 파일) │
│ 온디맨드: ~X,XXX 토큰 (XX 파일) │
│ │
│ 🔴 심각 (N건) / 🟡 경고 (N건) / 🟢 참고 (N건) │
│ │
│ /optimize 실행으로 개선을 적용하세요. │
└──────────────────────────────────────────────────┘
Save the full report to .claude/scratch/ctxcraft-report.md for reference.
.claude/ directory doesn't exist, inform the user and exit gracefully