From prompt-eng-toolkit
Use when an existing prompt is too long, costs too much per call, has poor cache hit-rate, or models stopped adhering to its rules (often after a model version bump). Compresses tokens AND hardens adherence in one pass — these usually go together. Always produces a before/after token comparison table. Iterates against real provider APIs when an API key is available; falls back to theory-only review if not.
npx claudepluginhub rxchi1d/prompt-eng-toolkit --plugin prompt-eng-toolkit
How this skill is triggered: by the user, by Claude, or both
Slash command
/prompt-eng-toolkit:prompt-optimize
The summary Claude sees in its skill listing, used to decide when to auto-load this skill
Compress and harden an existing prompt. The two goals — **token reduction** and **adherence hardening** — are usually compatible: bloat dilutes critical rules, so trimming often fixes adherence as a side effect.
Before changing anything, extract:
MUST, Gemini 3 over-analyzes verbose prompts, etc.).
Run ../../shared/scripts/count_tokens.py to get the current token count. Record:
Without a baseline, do not start editing. "It feels shorter" is not a result.
Same flow as prompt-create Step 2. Check env vars first; if no key found, ask once:
"I can use a real provider API to A/B test the optimization (strongly recommended — token down + behavior broken is a classic trap). Want to provide an API key? If yes — share provider + model + key (env-only). If no — I'll do a static compression + structure check without behavioral verification."
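The "check env vars first" step can be sketched as below. This is a minimal illustration, not the skill's actual logic: the variable names in `CANDIDATE_KEYS` are common provider conventions assumed here, and `find_provider_keys` / `choose_mode` are hypothetical helpers.

```python
import os

# Assumption: these are conventional key names, not a list confirmed by the skill.
CANDIDATE_KEYS = {
    "anthropic": "ANTHROPIC_API_KEY",
    "openai": "OPENAI_API_KEY",
    "gemini": "GEMINI_API_KEY",
}


def find_provider_keys(env=None):
    """Return the providers whose key variable is set and non-empty."""
    env = os.environ if env is None else env
    return [name for name, var in CANDIDATE_KEYS.items() if env.get(var, "").strip()]


def choose_mode(env=None):
    """Return 'live' when a key is available, otherwise 'ask' (prompt the user once)."""
    return "live" if find_provider_keys(env) else "ask"
```

Passing `env` explicitly keeps the check testable without touching the real environment; whitespace-only values are treated as missing.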
Read ../../shared/references/optimization-playbook.md for full patterns. Summary of the priority ladder (compress in this order; stop when budget met):
| Priority | Target | Why first |
|---|---|---|
| 1 | Duplicate SECURITY/WARNING wrapper paragraphs | If <final> already has the rule, repeating wastes tokens |
| 2 | Generic "helpful AI" preamble | A task-specific persona supersedes it |
| 3 | Step-by-step process descriptions | Replace with outcome-first directives |
| 4 | Multiple negative directives saying the same thing | Merge into one rule + rationale |
| 5 | Examples not tied to a failure mode | Each example must demonstrate one behavior class |
| 6 (last) | XML tag names themselves | Tags are the data/instruction firewall — only remove if you accept that risk |
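Priorities 1 and 4 both come down to spotting rules that are stated more than once. One way to flag candidates is sketched here, assuming paragraphs are separated by blank lines; `near_duplicates` and the 0.8 threshold are illustrative choices rather than part of the toolkit, and a flagged pair still needs a human decision about which copy to keep.

```python
from difflib import SequenceMatcher


def near_duplicates(prompt, threshold=0.8):
    """Return (i, j, ratio) for paragraph pairs whose similarity is >= threshold."""
    paragraphs = [p.strip() for p in prompt.split("\n\n") if p.strip()]
    pairs = []
    for i in range(len(paragraphs)):
        for j in range(i + 1, len(paragraphs)):
            # Case-insensitive character-level similarity in [0, 1].
            ratio = SequenceMatcher(
                None, paragraphs[i].lower(), paragraphs[j].lower()
            ).ratio()
            if ratio >= threshold:
                pairs.append((i, j, round(ratio, 2)))
    return pairs
```

Character-level similarity only catches restatements that share wording; two rules that say the same thing in different words will not be flagged.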
Never compress these (treat as load-bearing):
<final> block

```
draft = compressed_prompt
loop:
    for fixture in fixtures (failure-mode + happy-path):
        out = provider.generate(prompt=draft, input=fixture.input)
        check assertions(out, fixture.assertions)
    tokens = provider.count_tokens(draft)
    if all_pass and tokens < baseline: break
    draft = revise(draft)    # iterate in $TMPDIR / scratch; do NOT touch source yet
write_to_destination(draft)  # only after the loop terminates
```
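The loop above can be made concrete roughly as follows. This is a sketch under stated assumptions: `Fixture`, `optimize`, and the `generate` / `count_tokens` / `revise` callables are hypothetical stand-ins that the caller supplies, and `max_rounds` is added here so the loop always terminates.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Fixture:
    name: str
    input: str
    assertions: List[Callable[[str], bool]]  # each returns True if the output is acceptable


def fixture_passes(fx, draft, generate):
    """Call the provider once for this fixture and run every assertion on the output."""
    out = generate(draft, fx.input)
    return all(check(out) for check in fx.assertions)


def optimize(draft, baseline_tokens, fixtures, generate, count_tokens, revise,
             max_rounds=5) -> Optional[str]:
    """Revise `draft` until all fixtures pass and it is smaller than the baseline.

    Returns the accepted draft, or None if no round met both conditions.
    Nothing is written to disk here; the caller persists the result.
    """
    for _ in range(max_rounds):
        all_pass = all(fixture_passes(fx, draft, generate) for fx in fixtures)
        if all_pass and count_tokens(draft) < baseline_tokens:
            return draft
        draft = revise(draft)
    return None
```

Returning `None` instead of the last draft keeps a failed run from being mistaken for a verified one, which matches the rule that the destination is written only after the loop succeeds.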
Walk the optimization checklist below + the universal-principles checklist (../../shared/references/universal-principles.md §四). Be explicit about "static review only" in your summary.
Re-run ../../shared/scripts/count_tokens.py and produce a before/after table. Use the generic format below — do not assume the prompt has separate system/user halves, since many users pass everything as one blob:
| Segment | Before (tokens) | After (tokens) | Δ tokens | Δ % |
|--------------------|----------------:|---------------:|---------:|----:|
| <segment name 1> | … | … | … | … |
| <segment name 2> | … | … | … | … |
| **Total** | … | … | … | … |
Segment names are whatever the user's prompt naturally divides into:
- system, user_template
- prompt (one row + total; that's fine)

The script supports --label to tag each measurement so you can build the table programmatically.
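Once the per-segment counts are collected, the table itself is simple arithmetic: Δ tokens is after minus before, and Δ % is that difference divided by the before count. A minimal sketch, assuming the counts are already available as two dicts with the same segment names; `comparison_table` is a hypothetical helper, not something the script provides.

```python
def comparison_table(before, after):
    """Render a Markdown before/after table from {segment: tokens} dicts.

    Assumes `before` and `after` share the same segment names.
    """
    def row(name, b, a):
        delta = a - b
        # Guard against an empty segment to avoid dividing by zero.
        pct = f"{delta / b * 100:+.1f}%" if b else "n/a"
        return f"| {name} | {b} | {a} | {delta:+d} | {pct} |"

    lines = [
        "| Segment | Before (tokens) | After (tokens) | Δ tokens | Δ % |",
        "|---|---:|---:|---:|---:|",
    ]
    lines += [row(name, before[name], after[name]) for name in before]
    lines.append(row("**Total**", sum(before.values()), sum(after.values())))
    return "\n".join(lines)
```

For a single-blob prompt, pass one-entry dicts such as `{"prompt": 900}` and `{"prompt": 600}`; the output is then one row plus the total.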
- <final> block (or equivalent) still at the end of the system portion
- No new MUST/CRITICAL/ALWAYS introduced; no contradictory rules introduced

Provide:
<final> covers them all")

Paths below are relative to this SKILL.md's directory. ../../ resolves to the plugin root, where shared/ lives. Read on demand:
| Question | File |
|---|---|
| What compression patterns are high-ROI vs high-risk? | ../../shared/references/optimization-playbook.md |
| What attack patterns must the prompt still survive after compression? | ../../shared/references/failure-modes-and-defenses.md |
| What's the universal best-practice checklist? | ../../shared/references/universal-principles.md |
| What does the v4 template look like (target structure if rewriting)? | ../../shared/references/v4-template.md |
| How do I shape attack fixtures for my domain? | ../../shared/fixtures/attack-tests-template.yaml |
| How do I count tokens with the official provider API? | ../../shared/scripts/count_tokens.py --help |
NEVER X. shouting (overtriggers Claude 4.5+ / GPT-5)