From skill-maker
Iteratively improves a skill folder based on feedback: runs a baseline review, diagnoses issues such as triggers or output quality, and applies targeted fixes.
npx claudepluginhub vcode-sh/vibe-tools --plugin skill-maker

# Skill Improvement Loop

Take an existing skill and iteratively improve it based on real-world feedback. Follows the Anthropic best practice: "Iterate on a single task until Claude succeeds, then extract the winning approach."

## Input

The user provides: `$ARGUMENTS`

Parse the input:
- Path to the skill folder
- Optional: description of issues, test results, or feedback

If path is missing, ask: "Which skill should I improve? Provide the path to the skill folder."

## Step 1: Understand Current State

Read the skill and gather context:
1. Read SKILL.md and all supporting files
2. Run a ...
/skill-maker:review) to establish baseline score
CURRENT STATE: [skill-name]
═══════════════════════════
Grade: [A-F]
Description quality: [X/10]
Key strengths: [list]
Key weaknesses: [list]
If the user provided issue descriptions, use those. Otherwise, ask:
"What problems are you experiencing with this skill? Select what applies or describe your own:"
Offer common improvement areas:
For each reported issue, determine the root cause:
`## Critical header`
For each diagnosed issue:
Track all changes:
CHANGES APPLIED:
1. [File]: [What changed] - [Why]
2. [File]: [What changed] - [Why]
3. [File]: [What changed] - [Why]
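
The change log above can be kept as structured records and rendered at the end of a round. The following is a minimal sketch in Python, not part of the skill-maker plugin; the `Change` class and `format_changes` function are hypothetical names introduced for illustration.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class Change:
    """One applied modification (hypothetical record type)."""
    file: str  # file that was modified
    what: str  # what changed
    why: str   # reason for the change


def format_changes(changes: list[Change]) -> str:
    """Render the change log in the CHANGES APPLIED layout shown above."""
    lines = ["CHANGES APPLIED:"]
    for number, change in enumerate(changes, start=1):
        lines.append(f"{number}. {change.file}: {change.what} - {change.why}")
    return "\n".join(lines)
```

Recording each change at the moment it is applied keeps the "why" attached to the edit, which makes the before/after comparison easier to interpret later.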
After all improvements:
IMPROVEMENT RESULTS: [skill-name]
══════════════════════════════════
Before → After
Grade: [X] → [Y]
Description: [X/10] → [Y/10]
Structure: [X/6] → [Y/6]
Body Quality: [X/9] → [Y/9]
Changes made: [X] modifications across [Y] files
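
As an illustration of the before/after comparison, the sketch below renders the report from two review snapshots. It is written in Python under stated assumptions: the `Review` class and `format_results` function are hypothetical, and scores are assumed to be integers on the 10, 6, and 9 point scales that appear in the template.

```python
from dataclasses import dataclass


@dataclass
class Review:
    """Scores from one review run (hypothetical record type)."""
    grade: str         # letter grade, A-F
    description: int   # assumed to be out of 10
    structure: int     # assumed to be out of 6
    body_quality: int  # assumed to be out of 9


def format_results(skill: str, before: Review, after: Review,
                   modifications: int, files: int) -> str:
    """Render the IMPROVEMENT RESULTS report in the layout shown above."""
    title = f"IMPROVEMENT RESULTS: {skill}"
    rows = [
        title,
        "═" * len(title),  # rule sized to match the title line
        "Before → After",
        f"Grade: {before.grade} → {after.grade}",
        f"Description: {before.description}/10 → {after.description}/10",
        f"Structure: {before.structure}/6 → {after.structure}/6",
        f"Body Quality: {before.body_quality}/9 → {after.body_quality}/9",
        f"Changes made: {modifications} modifications across {files} files",
    ]
    return "\n".join(rows)
```

The `before` snapshot would come from the baseline review and the `after` snapshot from a second review run once the fixes are in place.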
After improving, automatically generate triggering tests for the changed areas:
VERIFY THESE IMPROVEMENTS:
If you fixed triggers, test these queries:
1. "[query that previously failed]" → Should now trigger: YES/NO
2. "[query that over-triggered]" → Should now NOT trigger: YES/NO
If you fixed output, test this scenario:
- Input: [the scenario that was failing]
- Expected: [what should now happen]
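
The trigger checks above could also be recorded and scored mechanically. The Python sketch below assumes you observe by hand whether the skill triggered for each query and pass those observations in; `TriggerCheck` and `evaluate` are hypothetical names, not part of skill-maker.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class TriggerCheck:
    """One query and whether the skill is expected to trigger on it."""
    query: str
    should_trigger: bool


def evaluate(checks: list[TriggerCheck], observed: dict[str, bool]) -> list[str]:
    """Compare observed triggering against expectations.

    `observed` maps each query to True if the skill triggered.
    A query with no recorded observation is reported as NO.
    """
    results = []
    for number, check in enumerate(checks, start=1):
        label = "Should now trigger" if check.should_trigger else "Should now NOT trigger"
        passed = observed.get(check.query) == check.should_trigger
        results.append(f'{number}. "{check.query}" → {label}: {"YES" if passed else "NO"}')
    return results
```

Any line ending in NO points at a trigger fix that did not take effect and is a candidate for the next improvement round.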
Improvement round complete. Options:
1. Run /skill-maker:test to generate a full test suite
2. Continue improving (provide more feedback)
3. Done - the skill is ready