Audits Claude Code skills for violations, gaps, and improvements in frontmatter, structure, and quality across 7 dimensions. Outputs structured repair plans with severities.
Audit and improve an existing skill against a gold standard. Unlike create-skill (which generates from scratch), this skill diagnoses violations and identifies gaps — what is broken, what is missing, and what would raise quality. The output is a structured improvement plan covering all dimensions.
Read $ARGUMENTS as the path to a skill directory or SKILL.md file. Read SKILL.md, then list and note which of references/, examples/, scripts/, assets/ exist and which are referenced from SKILL.md. If the path is missing or ambiguous, use AskUserQuestion to resolve before proceeding.
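The directory-catalog step can be sketched as a shell probe (illustrative only; the directory names are the four this skill checks, and the demo layout is throwaway):

```shell
#!/bin/sh
# Sketch: for each optional skill directory, report whether it exists and
# whether SKILL.md mentions it. Demo runs on a temporary layout.
catalog_skill() {
  dir="$1"
  for d in references examples scripts assets; do
    if [ -d "$dir/$d" ]; then
      if grep -q "$d/" "$dir/SKILL.md" 2>/dev/null; then
        echo "$d/: present, referenced from SKILL.md"
      else
        echo "$d/: present, NOT referenced"
      fi
    else
      echo "$d/: absent"
    fi
  done
}

# Demo on a throwaway skill layout
demo=$(mktemp -d)
mkdir "$demo/references" "$demo/scripts"
printf 'Load references/anatomy.md before Phase 2.\n' > "$demo/SKILL.md"
catalog_skill "$demo"
```

On the demo layout this reports references/ as referenced, scripts/ as present but unreferenced, and the other two as absent.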
Load all three reference files before Phase 2:

- ${CLAUDE_PLUGIN_ROOT}/skills/repair-skill/references/skill-anatomy.md — gold standard for correct anatomy, three-level loading model, directory type definitions, degrees of freedom, naming conventions, body conventions. Required for Dimensions 5, 6, and 7.
- ${CLAUDE_PLUGIN_ROOT}/skills/repair-skill/references/frontmatter-options.md — complete frontmatter field catalog, valid values, tool list, tool selection framework. Required for Dimensions 1 and 2.
- ${CLAUDE_PLUGIN_ROOT}/skills/repair-skill/references/audit-calibration.md — known false-positive patterns that look like violations but are not. Prevents over-flagging on D2 (allowed-tools absent), D4 (Task/Skill prose), and D5 (orientation vs routing).

Proceed to Phase 2 when: SKILL.md is read, sibling directories are cataloged, and all three reference files are loaded.
Run each dimension independently. For each finding record: the dimension code, what is wrong or missing, which principle it violates or which gold standard it falls short of, and the specific change required. Proceed to Phase 3 when all 7 dimensions are evaluated.
Finding types: violation (something wrong) or gap (something missing that would improve the skill).

Severity: Critical, Major, or Minor.
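One way to keep such records during the audit, sketched as YAML (illustrative only; the phase prescribes the fields to capture, not a storage format):

```yaml
# Illustrative finding record; the skill prescribes the fields, not the format.
- dimension: D2
  type: violation        # violation | gap
  severity: major        # critical | major | minor
  issue: allowed-tools grants unrestricted Bash
  principle: scoped patterns like Bash(git:*) limit what the skill can run
  fix: replace Bash with Bash(git:*)
```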
### Dimension 1: Description

The description is the only part of a skill that is always in context. Every token here costs budget across every session. Audit for violations and gaps.

Violations:

- Does the description use the | literal scalar instead of > (folded scalar)? The | literal scalar preserves newlines and can produce unexpected whitespace when parsed. Minor.

Gaps:

- Does the skill read $ARGUMENTS or $1/$2 without an argument-hint field? The hint is shown in autocomplete — its absence means users don't know what to pass. Minor.

### Dimension 2: Frontmatter Configuration

Modifiers left at their defaults are not errors — omitting them is correct when defaults apply. Audit for mismatches (violations) and missing configuration (gaps).
Refer to frontmatter-options.md for the complete field catalog, model selection table, and tool selection framework.

Violations:

- Does the skill request unrestricted Bash when a scoped pattern (Bash(git:*)) would work?
- Does the skill list entries in allowed-tools it never uses? Dead entries add noise.

Gaps:

- Does the skill invoke other skills or spawn agents without Skill or Task in allowed-tools?
- Does the skill require user decisions mid-workflow but lack AskUserQuestion in allowed-tools?
- Does the skill read a file path from $1 but use a Read tool call instead of @$1 inline injection? A tool round-trip is being wasted. Minor.
- Could real-time data (git status, env vars, file tree) be injected using dynamic content syntax (bang + backtick-wrapped command) instead of a tool call? Major when the skill's workflow begins with infallible probes (git branch, file tree, env vars) that never need error handling; Minor for commands that may fail or need exit-code branching.
Before (wastes tool round-trips):
1. Run `git log --oneline -5` using Bash
2. Run `git diff --name-only` using Bash
3. Analyze the results...
After (injected at invocation, zero tool calls):
- Recent commits: !`git log --oneline -5`
- Changed files: !`git diff --name-only`
Analyze the results...
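The D1 and D2 checks above can be made concrete with a frontmatter sketch. The field names (description, argument-hint, allowed-tools) are the ones this skill audits; the skill name and all values are illustrative:

```yaml
---
name: prune-branches
# Folded scalar (>) rather than literal (|); third-person trigger phrasing:
description: >-
  This skill should be used when the user asks to clean up merged or stale
  git branches.
# The body reads $1, so tell autocomplete what to pass:
argument-hint: "[remote-name]"
# Scoped Bash instead of unrestricted; no entries the skill never uses:
allowed-tools: Bash(git:*), AskUserQuestion
---
```

The folded scalar avoids the stray-newline risk flagged for |, the argument-hint covers the $1 the body reads, and the scoped Bash(git:*) pattern replaces unrestricted Bash.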
### Dimension 3: Rules vs. Examples

A rule stated with its reasoning generalizes to every input. An example that implies a rule requires the reading model to reverse-engineer the rule — two reasoning hops instead of one, covering only the shape of that example.
Violations:
Gaps:
### Dimension 4: Agentic vs. Scripted Split

Load ${CLAUDE_PLUGIN_ROOT}/skills/create-skill/references/script-patterns.md before auditing this dimension. It contains the five signal patterns for recognizing a script candidate, CLI design conventions, common archetypes (init, validate, transform, package, query), and the delegation pattern for using create-cli to design the interface.

Skills mix LLM-guided reasoning (agentic) and script execution (deterministic). The split should be deliberate — see the Degrees of Freedom table in skill-anatomy.md.

Violations:

- Is deterministic logic inlined in SKILL.md instead of extracted to scripts/? Inlining costs context tokens on every run; scripts execute without being loaded.

Gaps — apply the five signal patterns from script-patterns.md to each workflow step:

- Could you write the --help text for it right now? → CLI candidate; delegate design to create-cli.

### Dimension 5: Token Economy

Every token in SKILL.md is loaded into context when the skill triggers. Audit for tokens that consume budget without improving outcomes, and for content that belongs in references/ instead.
Refer to the size invariants table in skill-anatomy.md to calibrate severity.
Violations:

- A code block whose variable bindings (BASE=..., BRANCH=...) are used by later steps serves two purposes: illustrating the operation AND establishing state. Collapsing it to prose without preserving the bindings leaves downstream $VAR references unbound. When collapsing, add a "derive working variables" preamble that explicitly binds each variable in prose. Major per lost binding.
- …references/. Minor.
- …references/ deferral. Major.
- Housekeeping files (README.md, CHANGELOG.md, INSTALLATION.md) in the skill directory — never loaded into context, add noise to the package. Minor per file.

Gaps:

- Would moving sections to a references/ file reduce SKILL.md size? Identify sections only needed for specific sub-tasks and flag them as deferral candidates. Major if SKILL.md > 300 lines.
- Would a references/ file for domain-specific data help? Lookup tables, option catalogs, field definitions — these are reference data, not instructions. Major.

### Dimension 6: Workflow Integrity

A skill's process should be sequential, complete, and have explicit exit conditions at each phase. Audit for broken workflow and for missing structure that would help.
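The unbound-$VAR scan described in the gaps below can be roughed out mechanically. A grep-based sketch (heuristic only, not the skill's actual mechanism; expect false positives on tokens like $ARGUMENTS that the runtime binds):

```shell
#!/bin/sh
# Heuristic sketch: flag $VARNAME tokens in a skill body that never appear
# on the left-hand side of an assignment anywhere in the same file.
unbound_vars() {
  grep -o '\$[A-Z_][A-Z_]*' "$1" | sort -u | tr -d '$' |
  while read -r v; do
    grep -Eq "(^|[^A-Za-z_])$v=" "$1" || echo "unbound: \$$v"
  done
}

# Demo body with one bound and one unbound variable
body=$(mktemp)
cat > "$body" <<'EOF'
1. Run BASE=$(git merge-base main HEAD)
2. Diff against $BASE and summarize changes relative to $TARGET.
EOF
unbound_vars "$body"
```

On the demo body this flags only $TARGET: $BASE traces back to the assignment in step 1, while $TARGET has no origin.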
Violations:
Gaps:
- Does every $VAR referenced in a step have an explicit binding in an earlier step or a pre-flight/preamble section? Scan all $VARNAME tokens in the skill body and trace each back to its origin. An unbound variable is a workflow break — the agent either halts on an invalid command or silently substitutes an empty string. Major per unbound variable.
- Would an examples/ directory help users understand what the expected output looks like? Minor.

### Dimension 7: Directory Anatomy

Refer to skill-anatomy.md for the gold standard directory anatomy and the Gap Analysis Checklist. This dimension asks: does the skill's structure match its complexity tier, and what is absent that would raise it?

Use the Gap Analysis Checklist from skill-anatomy.md directly. For each "yes" answer, record a gap at the appropriate severity.
Violations:

- A scripts/ directory with scripts not referenced in SKILL.md? Major — reference or delete.
- A references/ directory with files not pointed to from SKILL.md? Major — reference or delete.

Gaps — ask for each absent directory:

- scripts/: Is there a deterministic operation that would be more reliable scripted? Does the same code block appear, or would it appear, in multiple invocations?
- references/: Does SKILL.md exceed 300 lines? Are there sections only needed for specific sub-tasks? Is there domain-specific reference data?
- examples/: Does the skill produce output users adapt? Are there ambiguous instructions a working example would clarify better than prose?

Present findings as a structured report. Split violations from gaps — a violation is something wrong, a gap is something missing that would improve the skill.
SKILL IMPROVEMENT REPORT: <skill-name>
Current tier: [simple / standard / complex] — [lines] lines, [directories present]
VIOLATIONS
──────────
CRITICAL
[D1] Description uses first-person — routing model reads as instruction, not trigger.
Fix: rewrite as "This skill should be used when the user asks to..."
MAJOR
[D3] Body teaches frontmatter quality by bad/good contrast; principle never stated.
Fix: state the rule ("quoted phrases must be verbatim user speech because routing
matches on literal tokens") then keep the contrast as confirmation.
[D5] "When to Use This Skill" section in body — dead tokens every invocation.
Fix: move routing guidance to frontmatter description, delete body section.
MINOR
[D1] Description uses | scalar instead of >.
Fix: change to >.
GAPS (what would improve this skill)
─────────────────────────────────────
MAJOR
[D7] SKILL.md is 420 lines with no references/ directory. Three sections (option catalog,
field definitions, examples table) are only needed for specific sub-tasks.
Improvement: extract to references/; add load pointer in SKILL.md for each.
[D4] File-path validation logic is inlined but must produce consistent output.
Improvement: move to scripts/validate-input.py; reference from Phase 2.
MINOR
[D2] Skill reads $1 as a file path but uses Read tool — @$1 injection would save a
tool round-trip.
Improvement: replace Read call with @$1 inline injection.
[D7] No examples/ directory; skill produces config output users adapt.
Improvement: add examples/ with one representative output file.
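As one possible shape for the [D4] item in the sample report above, the extracted validator could look like this (hypothetical sketch in shell; the report names a Python script, and the real interface is up to the repair):

```shell
#!/bin/sh
# Hypothetical sketch of the "move validation to a script" improvement:
# deterministic checks, stable one-line output, and a pass/fail exit code
# so the skill body can branch without re-deriving the logic each run.
validate_input() {
  p="$1"
  [ -n "$p" ] || { echo "FAIL: no path given"; return 1; }
  [ -e "$p" ] || { echo "FAIL: $p does not exist"; return 1; }
  if [ -d "$p" ] && [ ! -f "$p/SKILL.md" ]; then
    echo "FAIL: $p contains no SKILL.md"; return 1
  fi
  echo "OK: $p"
}

# Demo
d=$(mktemp -d)
validate_input "$d" || true    # FAIL: contains no SKILL.md yet
touch "$d/SKILL.md"
validate_input "$d"            # OK
```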
Group violations by severity, then gaps by severity. For each: dimension code, what is wrong or missing, the principle or gold standard it falls short of, the exact fix.
Ask: "Apply all critical and major items? Or select specific ones?"
Apply confirmed items in order: critical violations → major violations → major gaps → minor violations → minor gaps.
For each item:

After applying improvements, briefly explain the changes made and any defaults intentionally left in place (e.g. "hooks left unset — no lifecycle validation needed").

Phase 4 is complete when all confirmed items are applied, the explanation is delivered, and the validation checklist passes.
After applying all improvements, load ${CLAUDE_PLUGIN_ROOT}/skills/repair-skill/references/quality-checklist.md
and run the quality standards check followed by the item-by-item validation checklist.
Report any failing items before delivering final results.