Skill

cpf-skill-creator

Create and improve Claude Code skills using a structured checkpointflow workflow with test-driven iteration. Use this skill whenever the user wants to create a skill, make a skill, build a skill, improve an existing skill, test a skill with evals, benchmark skill quality, optimize a skill description, or turn a conversation into a reusable skill. Also triggers on "skill for X", "automate this as a skill", "package this as a skill", and similar phrases.

From cpf

Install

Run in your terminal

npx claudepluginhub thomasrohde/marketplace --plugin cpf

Tool Access

This skill uses the workspace's default tool permissions.

Supporting Assets

View in Repository

skill-creator.yaml

Skill Content

Similar Skills

inventory-demand-planning

Provides demand forecasting, safety stock optimization, replenishment planning, and promotional lift estimation for multi-location retailers managing 300-800 SKUs.

everything-claude-code

141.5k

investor-materials

Creates consistent pitch decks, one-pagers, investor memos, financial models, accelerator apps, and fundraising materials from a single source of truth.

everything-claude-code

141.5k

energy-procurement

Provides expertise on electricity/gas procurement, tariff optimization, demand charge management, renewable PPA evaluation, hedging, load profiling, and multi-facility energy strategies.

everything-claude-code

141.5k

Stats

Parent Repo Stars1

Parent Repo Forks0

Last CommitMar 30, 2026

Actions

View Source View Plugin View on GitHub View README

cpf-skill-creator

Create and improve Claude Code skills through a structured, workflow-driven process powered by checkpointflow.

Prerequisites

This skill requires:

checkpointflow (cpf CLI) — install with uv tool install checkpointflow or pip install checkpointflow
Anthropic skill-creator plugin — for eval scripts, grading agents, and the review viewer. Install with /plugin install skill-creator from any Claude Code marketplace that carries it.

How it works

This skill wraps the skill creation process in a cpf workflow that orchestrates the full lifecycle. The workflow uses audience-driven dispatch — agent steps do the work automatically, user steps pause for human decisions.

Finding the workflow

The workflow YAML is bundled alongside this SKILL.md. To locate it, find the directory containing this file and look for skill-creator.yaml:

# The workflow is at the same path as this SKILL.md:
# <this-skill-dir>/skill-creator.yaml

When running the workflow, use the absolute path to the YAML file. If this skill was loaded from the plugin cache, the path will be something like:

~/.claude/plugins/cache/<marketplace>/cpf/<version>/skills/cpf-skill-creator/skill-creator.yaml

Running the workflow

cpf run -f <path-to-skill-creator.yaml> --input '{"skill_name": "my-skill", "intent": "..."}'

Or use the cpf-workflow-runner skill which handles the interactive loop automatically.

The workflow lifecycle

Capture intent (user) — Refine what the skill does and when it triggers
Research & draft (agent) — Create the SKILL.md and test cases
Review draft (user) — Approve, revise, or cancel
Run tests & grade (agent) — Execute evals, grade, benchmark, launch viewer
Review results (user) — Examine outputs, decide to iterate/optimize/ship
Improve skill (agent) — Apply feedback, rewrite (loops back to step 4)
Optimize description (agent) — Tune triggering accuracy
Package (agent) — Create .skill file for distribution

Reference material: Anthropic skill-creator

This skill delegates to utilities from the Anthropic skill-creator plugin rather than duplicating them. Locate the skill-creator by searching the plugin cache:

# Try these locations in order:
find ~/.claude/plugins/cache -type d -name "skill-creator" -path "*/skills/*" 2>/dev/null | head -1

The skill-creator provides:

Path	Purpose
`scripts/aggregate_benchmark.py`	Aggregate grading into benchmark.json
`scripts/package_skill.py`	Package skill as .skill file
`scripts/run_loop.py`	Description optimization loop
`scripts/run_eval.py`	Run trigger evaluation
`eval-viewer/generate_review.py`	Generate HTML review viewer
`agents/grader.md`	Grading instructions for subagents
`agents/analyzer.md`	Benchmark analysis instructions
`agents/comparator.md`	Blind A/B comparison instructions
`assets/eval_review.html`	Template for trigger eval review UI
`references/schemas.md`	JSON schemas for evals, grading, benchmark

When running Python scripts, set the working directory to the skill-creator root so module imports resolve:

cd <skill-creator-path>
python -m scripts.aggregate_benchmark <workspace>/iteration-1 --skill-name my-skill

If the skill-creator plugin isn't installed, fall back to manual operations — grade inline, skip the viewer, and package by hand.

Skill writing principles

These principles apply when drafting or improving skills in the agent steps:

Description field is king. It's the primary triggering mechanism. Include both what the skill does AND specific contexts/phrases. Claude tends to under-trigger, so make descriptions slightly "pushy" — enumerate trigger phrases generously.

Progressive disclosure. Keep SKILL.md under 500 lines. Put detailed reference material in references/ subdirectories. Put scripts in scripts/. The model reads SKILL.md on trigger and only loads bundled resources when needed.

Explain the why. Today's LLMs respond better to reasoning than to rigid ALWAYS/NEVER rules. Explain why each instruction matters so the model can generalize to edge cases.

Examples over abstractions. Include concrete input/output examples. They're worth more than paragraphs of description.

Look for repeated patterns. If test runs independently produce similar helper scripts, bundle that script into the skill. Write it once in scripts/ and reference it.

Iteration philosophy

When improving a skill after user feedback:

Generalize from the specific feedback. The skill will be used across many prompts, not just the test cases. Don't overfit.
Keep it lean. Remove instructions that aren't pulling their weight. Read the test transcripts — if the skill is making the model waste time, cut those parts.
Explain the why. If you find yourself writing ALL-CAPS MUST/NEVER, reframe as reasoning the model can internalize.
Draft, then revise. Write the improvement, then read it fresh and tighten it.

Workspace layout

Test results and artifacts are organized in <skill-name>-workspace/ alongside the skill:

<skill-name>-workspace/
  evals/
    evals.json                    # Test prompts and assertions
  iteration-1/
    eval-0-<descriptive-name>/
      with_skill/
        outputs/                  # Files produced with the skill
        grading.json              # Assertion pass/fail results
        timing.json               # Tokens and duration
      without_skill/
        outputs/                  # Baseline outputs
        grading.json
        timing.json
      eval_metadata.json          # Prompt, assertions for this eval
    benchmark.json                # Aggregated metrics
    benchmark.md                  # Human-readable summary
    feedback.json                 # User feedback from the viewer
  iteration-2/
    ...

Quick start for improving an existing skill

Pass the existing skill via the existing_skill_path input:

cpf run -f <path-to-skill-creator.yaml> --input '{
  "skill_name": "my-skill",
  "intent": "Improve test coverage and fix edge case handling",
  "existing_skill_path": ".claude/skills/my-skill"
}'

The workflow will snapshot the existing skill for baseline comparison, then iterate.

Tips

The workflow handles orchestration; each agent step does the actual work (reading code, writing files, running tools).
Always use generate_review.py from the skill-creator plugin — never write custom HTML for the viewer.
For headless environments, pass --static <output.html> to generate_review.py.
The --previous-workspace flag on generate_review.py enables iteration-over-iteration comparison.
If subagents aren't available, run test cases inline (less rigorous but still useful with human review).
Description optimization requires the claude CLI (claude -p). Skip it if not available.