Analyzes skill executions from conversation friction, file diffs, user feedback, diagnostics, and lessons to propose concrete improvements to SKILL.md files for efficiency.
npx claudepluginhub crystian/skills
> Born from real-world production usage across multiple projects. Every diagnostic category, every proposal flow, and every guardrail exists because it solved a real problem in a real skill.
Kaizen (改善) for AI agent skills. Observe how a skill performed, find what went wrong or could be better, and propose concrete changes to its SKILL.md.
Diagnoses root causes and proposes improvements — you decide each one. Tracks recurrence in LESSONS.md with automatic importance escalation.
/skill-optimizer (default) — enter listening mode. Output ONLY this single line:
"skill-optimizer is observing the conversation, waiting for a skill to complete..."
Nothing else — no explanations, no additional context, no prompts. Be silent.
Then wait — do not prompt or block. The user will manually invoke
/skill-optimizer --review when ready to analyze.
This is the ideal scenario — the optimizer observes the skill in real time.
/skill-optimizer <name> — target a specific skill by name
/skill-optimizer --diagnose or /skill-optimizer <name> --diagnose — run static
diagnostic directly on the SKILL.md without prior observation. Skips conversation
friction and file diffs — uses static diagnostic + user feedback only.
If no target can be resolved, ask the user: "Which skill do you want to diagnose?
Provide the name or path."
/skill-optimizer --review — skip to accumulated lessons (no skill execution needed).
If no target can be resolved (no name, no prior skill in conversation), ask the
user: "Which skill do you want to review? Provide the name or path."
If multiple skills were executed in this conversation, ask the user:
"Multiple skills detected — which one do you want to review?"
If no LESSONS.md exists, inform the user: "No accumulated lessons found —
static diagnostic and user feedback will still run. Run /skill-optimizer
after a skill execution to start collecting lessons."
/skill-optimizer <name> --review — review accumulated lessons for the named skill.
Combines target resolution with --review mode. If no LESSONS.md exists,
apply the same fallback: inform the user and offer a static diagnostic.
Argument order does not matter — --review <name> is equivalent to <name> --review.
--diagnose and --review are mutually exclusive — if both are provided, inform the
user: "Cannot use --diagnose and --review together. Pick one." and stop.
Once resolved, read the target's SKILL.md and LESSONS.md (if exists).
Skill resolution: Search for <name>/SKILL.md in these paths (first match wins):
.claude/skills/, .agents/skills/, …
If not found in any path, tell the user: "Could not find <name>/SKILL.md.
Check the skill name or provide the full path." Do not guess or search outside
these paths.
Path input: If <name> contains / (e.g., ./my-skill, ../other-skill,
or an absolute path), treat it as a direct path — read <name>/SKILL.md (or
<name> if it already ends in SKILL.md). Skip the 4-path search. If the file
does not exist, report: "File not found at <path>. Check the path and try again."
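The path-input rule above can be sketched in Python; the function name is illustrative, not part of the skill:

```python
from pathlib import Path

def resolve_path_input(name: str) -> Path:
    """Treat a <name> containing '/' as a direct path: read <name>/SKILL.md,
    or <name> itself if it already ends in SKILL.md. (Illustrative sketch.)"""
    p = Path(name)
    return p if p.name == "SKILL.md" else p / "SKILL.md"
```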
Extra arguments: Any arguments beyond <name>, --review, or --diagnose
are ignored. Inform the user: "Extra arguments ignored: [args]."
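The argument rules above (order-independence, mutual exclusion, extra-argument handling) could be sketched as follows; the function name and error string are illustrative:

```python
def parse_invocation(args):
    """Resolve <name>, --review, and --diagnose from a raw argument list.

    Order does not matter; --review and --diagnose are mutually exclusive;
    anything beyond the first non-flag token is ignored. (Sketch only.)
    """
    flags = {a for a in args if a in ("--review", "--diagnose")}
    if "--review" in flags and "--diagnose" in flags:
        raise ValueError("Cannot use --diagnose and --review together. Pick one.")
    rest = [a for a in args if a not in flags]
    name = rest[0] if rest else None   # first non-flag token is the target
    ignored = rest[1:]                 # reported back as "Extra arguments ignored"
    return name, flags, ignored
```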
Self-optimization: The default mode is observation (conversation friction +
file diffs), but when the target is skill-optimizer itself, self-observation is
unreliable — skip conversation friction and file diffs, fall back to static
diagnostic + user feedback instead.
Fallback without prior run: If the target skill was not executed in this
conversation (e.g., /skill-optimizer <name> without prior run), fall back to
static diagnostic + user feedback. State this to the user before proceeding.
Collect findings from the appropriate source:
Sources: conversation friction, file diffs, user feedback, static diagnostic
Per mode:
- <name> without prior run — static diagnostic, user feedback
- <name> with prior run — conversation, diffs, static diagnostic, user feedback
- --diagnose — static diagnostic, user feedback
- --review — conversation, diffs, existing LESSONS.md entries, static diagnostic, user feedback

Conversation friction:
File diffs:
Use git diff (or git diff --cached) to inspect changes made during the
skill's execution. If not in a git repo, compare file contents against the
SKILL.md's expected output. Look for:
User feedback: After gathering findings, ask the user: "Want to add anything, or should we review the findings?"
Static diagnostic (used in --review, --diagnose, and <name> without prior run):
Validate against baseline rules:
- Frontmatter: name and description (required).
- Size: move overflow content into references/. Token count is the primary
  constraint; line count is a quick heuristic.
- references/: a subdirectory alongside SKILL.md that holds supporting material
  (tables, examples, templates) the agent loads on demand. Files should be
  markdown, named descriptively (e.g., references/diagnostic-tables.md), and
  referenced from the body with explicit load instructions (e.g., "Read
  references/diagnostic-tables.md for the full list").
- Missing structure: if the YAML frontmatter lacks the required fields (name,
  description), report it as a high finding and propose adding the missing
  structure — infer values from the body content.
- Sequential tasks: look for step markers (### 1., Step N:, Phase N, ordered
  markdown lists within execution sections). If found, enumerate them in the
  diagnostic output:
Tasks detected (N):
1. [step title or summary]
2. [step title or summary]
...
Then check: does the skill instruct the agent to report progress per step?
If not, and the harness provides task management tools (e.g., TaskCreate/
TaskUpdate in Claude Code), propose adding an instruction like:
"Create a task per step at the start of execution and update each task's
status as it completes, so the user can track progress."
Importance: medium. Diagnostic: missing instruction.
Cross-reference against the SKILL.md. Read references/diagnostic-tables.md
(relative to this skill's own directory, NOT the project CWD) for the category
and root-cause lookup tables.
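The step-marker detection described above could be sketched like this; the exact patterns a real implementation uses may differ:

```python
import re

# Markers that usually indicate sequential tasks in an execution section:
# numbered headings ("### 1."), "Step N:", "Phase N", and ordered list items.
STEP_PATTERNS = [
    re.compile(r"^#{1,6}\s*\d+\."),   # e.g. "### 1. Gather findings"
    re.compile(r"^Step\s+\d+:"),      # e.g. "Step 2: Diagnose"
    re.compile(r"^Phase\s+\d+"),      # e.g. "Phase 3"
    re.compile(r"^\d+\.\s"),          # ordered markdown list item
]

def detect_tasks(body: str) -> list[str]:
    """Return the lines of a SKILL.md body that look like sequential steps."""
    return [
        line.strip()
        for line in body.splitlines()
        if any(p.match(line.strip()) for p in STEP_PATTERNS)
    ]
```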
Static diagnostic output: Present results as a single unified checklist before proposals:
Importance: high (breaks output, errors) · medium (suboptimal, friction) · low (style, preferences)
Recurrence: Same pattern in LESSONS.md? Increment Hits instead of duplicating.
Hits >= 3 escalates: low → medium, medium → high.
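The recurrence escalation rule can be expressed as a small lookup (importance names from this document; the function itself is illustrative):

```python
ESCALATE = {"low": "medium", "medium": "high", "high": "high"}

def effective_importance(importance: str, hits: int) -> str:
    """Apply the recurrence rule: at 3 or more hits, low -> medium, medium -> high."""
    return ESCALATE[importance] if hits >= 3 else importance
```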
context7 (optional): If the target skill references a specific
library or framework (e.g., Angular, NestJS, React), use context7
to verify that code patterns, API calls, or config examples in the
SKILL.md match current docs. Report mismatches as findings with
diagnostic "outdated content". If the skill does not reference any
library, or if context7 is not available in the environment, skip this
step — the baseline rules are sufficient.
If no findings were identified, report: "No issues found. The skill looks solid."
If user feedback was not yet collected (e.g., --review mode), ask for it
now. If the user has none, end with no proposals.
Before the first proposal, show the sources legend listing only the
sources actually used in the current run (e.g.,
Sources: static diagnostic, user feedback).
Present findings one at a time, ordered by importance.
For --review findings from LESSONS.md, verify the root cause still exists
in the current SKILL.md before proposing. If resolved, mark as
(resolved) and recommend rejecting to clean up the entry.
If there are more than 15 findings, present the top 15 (by importance) and
offer to log the rest to LESSONS.md: "There are [N] more remaining
findings — want me to log them to LESSONS.md for later review?"
If the user declines, discard the remaining findings.
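Ordering by importance and splitting off the overflow might look like this sketch (field names are assumptions, not the skill's data model):

```python
ORDER = {"high": 0, "medium": 1, "low": 2}

def split_findings(findings, limit=15):
    """Sort findings by importance and split into (presented, overflow).
    The overflow is what would be offered for logging to LESSONS.md."""
    ranked = sorted(findings, key=lambda f: ORDER[f["importance"]])
    return ranked[:limit], ranked[limit:]
```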
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
PROPOSAL [N/total] — [importance]
Source: [conversation | diff | user | lessons | diagnostic]
Finding: [what was observed]
Root cause: [diagnostic] — [which line/section and why]
Hits: [N — omit if first occurrence]
Proposed change: [what to add/modify/remove]
Preview:
- [old line]
+ [new line]
For additions, show only `+` lines with surrounding context.
For removals, show only `-` lines.
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
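A preview in that +/- shape can be generated with Python's difflib; this is one way to do it, not how the skill must implement it:

```python
import difflib

def preview(old_lines, new_lines):
    """Render a minimal +/- preview of a proposed SKILL.md change."""
    diff = difflib.unified_diff(old_lines, new_lines, lineterm="", n=1)
    # Drop the ---/+++ file headers; keep hunk markers, context, -, and + lines.
    return [l for l in diff if not l.startswith(("---", "+++"))]
```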
Use the question tool (e.g., AskUserQuestion in Claude Code) to ask the user for their decision:
Default mode:
--review mode:
Actions:
Don't (not available in --review): if the user accepts, append a negative rule
at the end of the target's SKILL.md Guardrails section (create one if absent —
place it as the last section before any footer like ---) using the format:
- **Never [action].** [reason from the finding]

If the user selects "Other" and types "skip all", write current and all
remaining findings to LESSONS.md and end.
Summary: Done. [N] accepted, [N] postponed, [N] rejected, [N] don'ts.
Lives alongside the target's SKILL.md.
Read references/lessons-format.md (relative to this skill's own directory,
NOT the project CWD) for the full format and rules.
Secrets: replace anything that looks like a credential (e.g., sk-...,
ghp_..., Bearer ...) with [REDACTED] in all output and LESSONS.md.

Made with <3 by Crystian