Audits Claude skills against Anthropic prompting best practices including positive framing, motivation, and XML structure. Use after creation/modification, before release, or for inconsistent results.
Audit one or more skills against Anthropic's prompting best practices. The goal is to find patterns that degrade skill effectiveness with current Claude models and suggest concrete improvements.
Read `references/anthropic-best-practices.md` from this skill's directory: it contains the condensed audit criteria. Skill sources live at `internal/assets/claude/skills/*/SKILL.md`; for live skills, read from `.claude/skills/*/SKILL.md`.

Apply these checks to each skill. Each dimension maps to a section in the best practices reference.
### 1. Positive framing

Scan for negative instructions ("don't", "never", "avoid", "do not") that lack a positive counterpart. Every negative should be paired with what the agent should do instead.
Pass: negative instructions are supplements to clear positive guidance. Fail: primary instructions are negative, leaving the agent to guess the desired behavior.
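This scan can be sketched as a line-level pass; the regexes and the one-line lookahead window below are assumptions, not part of the audit spec:

```python
import re

# Negative phrasings to flag, and words that suggest a positive counterpart.
NEGATIVE = re.compile(r"\b(don't|do not|never|avoid)\b", re.IGNORECASE)
POSITIVE_HINT = re.compile(r"\b(instead|rather|prefer|only)\b", re.IGNORECASE)

def scan_negatives(skill_text):
    """Flag lines with a negative instruction and no positive counterpart
    on the same line or the one that follows it."""
    lines = skill_text.splitlines()
    findings = []
    for i, line in enumerate(lines):
        if NEGATIVE.search(line):
            window = line + " " + (lines[i + 1] if i + 1 < len(lines) else "")
            if not POSITIVE_HINT.search(window):
                findings.append((i + 1, line.strip()))
    return findings
```

A finding is only a candidate for review; a human still decides whether the negative instruction is a supplement or the primary guidance.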
Bad: "Do not create new files. Do not modify tests. Do not add comments."
Good: "Edit only the files specified in the task. Preserve existing tests and comments; add new ones only when the user requests them."

### 2. Motivation over mandates

Check for MUST, NEVER, ALWAYS, CRITICAL used as emphasis without explaining why the rule matters. Claude 4.5/4.6 responds better to reasoning than to rigid directives.
Pass: important instructions include motivation ("because X" or "so that Y") that lets the model generalize. Fail: instructions rely on emphasis alone to convey importance.
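A minimal sketch of the bare-mandate check; the sentence splitting and the motivation-keyword list are assumptions:

```python
import re

MANDATE = re.compile(r"\b(MUST|NEVER|ALWAYS|CRITICAL)\b")  # caps only
MOTIVATION = re.compile(r"because|so that|otherwise", re.IGNORECASE)

def bare_mandates(skill_text):
    """Return sentences that use all-caps emphasis without an
    accompanying reason the model could generalize from."""
    sentences = re.split(r"(?<=[.!?])\s+", skill_text)
    return [s.strip() for s in sentences
            if MANDATE.search(s) and not MOTIVATION.search(s)]
```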
Bad: "You MUST ALWAYS run tests before reporting completion."
Good: "Run tests before reporting completion: untested changes create silent regressions that compound across sessions."

### 3. XML tag structure

Check whether the skill mixes instructions with variable content (file paths, user input, injected code) without clear delimiters. XML tags prevent the model from confusing injected content with skill instructions.
Pass: variable content is wrapped in descriptive tags, or the skill doesn't inject variable content. Fail: the skill templates in external content alongside instructions without delimiters.
### 4. Few-shot examples

Check whether non-trivial behaviors (output formats, decision logic, style requirements) are demonstrated with examples. Skills that describe complex output without showing it drift over time.
Pass: key behaviors have at least one good/bad example pair, or the behavior is simple enough that examples would be redundant. Fail: the skill describes a specific output format or decision process but provides no examples.
### 5. Subagent guard

If the skill spawns or encourages spawning subagents (via the Agent tool), check that it states when subagents are and aren't warranted. Claude Opus 4.6 over-delegates to subagents when a direct tool call would be faster.
Pass: subagent usage has explicit scope (when to use, when not to), or the skill doesn't involve subagents. Fail: the skill defaults to subagent delegation without stating when direct execution is preferable.
### 6. Overtriggering

Check for language written to combat undertriggering in older models that may cause overtriggering in Claude 4.5/4.6: excessive caps emphasis (CRITICAL, MUST), redundant capability statements ("You are an expert"), or aggressive always/never framing.
Pass: instructions use natural language with emphasis reserved for genuinely critical points. Fail: the skill reads like it was written for a less capable model that needed constant nudging.
### 7. Phantom references

Every file path, tool name, and command referenced in the skill must exist. Broken references are a form of hallucination in the skill itself.
Pass: all references resolve to real files/tools. Fail: the skill mentions files or commands that don't exist.
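The file-path half of this check can be automated; the extension list and the assumption that references resolve relative to the skill's directory are both guesses:

```python
import re
from pathlib import Path

# Match path-like tokens ending in a known extension.
PATH_REF = re.compile(r"\b[\w./-]+\.(?:md|py|sh|json|yaml|yml)\b")

def phantom_references(skill_path):
    """List file paths mentioned in a SKILL.md that do not exist
    relative to the skill's directory."""
    skill_path = Path(skill_path)
    base = skill_path.parent
    text = skill_path.read_text()
    return [ref for ref in set(PATH_REF.findall(text))
            if not (base / ref).exists()]
```

Tool names and commands still need a manual pass, since they don't resolve against the filesystem.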
### 8. Scope discipline

Check whether the skill encourages work beyond what's requested: "while you're in there" improvements, unsolicited refactoring, or scope creep. Skills should state the minimum viable outcome.
Pass: the skill's scope matches its stated purpose. Fail: the skill encourages additional work beyond its core task.
### 9. Description quality

The description field determines when the skill activates. Check that it states what the skill does, names the situations that should trigger it, and covers the phrasings users are likely to use.
Pass: reading the description alone, you'd know exactly when to use this skill. Fail: the description is vague ("use for general tasks") or too narrow (misses common phrasings).
For each audited skill, report:
### /skill-name
**Overall:** X/9 pass
| # | Dimension | Result | Notes |
|---|------------------------|--------|--------------------------|
| 1 | Positive framing | pass | |
| 2 | Motivation over mandates | fail | 3 bare MUST/NEVER found |
| 3 | XML tag structure | pass | |
| 4 | Few-shot examples | fail | No output format example |
| 5 | Subagent guard | n/a | No subagent usage |
| 6 | Overtriggering | pass | |
| 7 | Phantom references | pass | |
| 8 | Scope discipline | pass | |
| 9 | Description quality | warn | Missing synonym coverage |
**Suggested fixes:**
- [Dimension 2] Line "You MUST ALWAYS run tests" →
"Run tests before completion: untested changes create
silent regressions."
- [Dimension 4] Add example showing expected output format
after the "Report results" section.
For batch audits, end with a summary:
## Batch Summary
| Skill | Score | Top Issue |
|--------------------|-------|--------------------------|
| ctx-commit | 8/9 | Missing example |
| ctx-drift | 7/9 | 2 bare mandates |
| ctx-verify | 9/9 | - |
Before reporting audit results: