---
name: darwin-skill-optimizer
description: Autonomous skill optimization loop for Claude Code — evaluates, improves, tests, and ratchets SKILL.md files using an autoresearch-inspired evolutionary cycle.
triggers:
- optimize my skills
- improve all skills
- run darwin skill optimizer
- evaluate and improve skill files
- optimize a specific skill
- run skill optimization loop
- apply darwin evolution to skills
- ratchet my agent skills
---
# darwin-skill — Autonomous SKILL.md Optimizer
> Skill by [ara.so](https://ara.so) — Daily 2026 Skills collection.
darwin-skill brings Andrej Karpathy's `autoresearch` loop to Agent Skill optimization. It evaluates every SKILL.md file across 8 weighted dimensions (100 pts total), proposes targeted improvements, tests them, and keeps only changes that measurably improve the score — a ratchet that never goes backwards.
---
## Installation
```bash
# via npx (recommended)
npx skills add alchaincyf/darwin-skill

# manual (no GitHub access)
# Download: https://pub-161ae4b5ed0644c4a43b5c6412287e03.r2.dev/skills/darwin-skill.zip
# Unzip → place SKILL.md at: ~/.claude/skills/darwin-skill/SKILL.md
```

Compatible with: Claude Code, Cursor, Codex, Trae, CodeBuddy, OpenClaw — any agent that reads `~/.claude/skills/` or an equivalent directory.
## How it works

```
┌─────────────┐     ┌──────────────┐     ┌───────────────┐
│   Phase 1   │────▶│   Phase 2    │────▶│   Phase 3     │
│  Inventory  │     │   Optimize   │     │   Report      │
│  + Score    │     │  (ratchet)   │     │  + Confirm    │
└─────────────┘     └──────────────┘     └───────────────┘
                           │
                  ┌────────▼────────┐
                  │  score(new) >   │
                  │  score(old) ?   │
                  │  keep / revert  │
                  └─────────────────┘
```
**Key guarantee:** every skill's score can only increase. Any change that doesn't improve the score is automatically reverted with `git revert`.
## Usage

Once installed, speak naturally to your agent:

- "optimize all skills"
- "optimize the darwin-skill skill"
- "run the darwin optimization loop on my nuwa skill"
- "evaluate all my skill files and improve the weakest ones"
The agent will scan `~/.claude/skills/`, score each SKILL.md it finds, and run the optimization loop, starting with the weakest skills.

## Scoring: 8 dimensions, 100 pts

| Dimension | Max | Method |
|---|---|---|
| YAML frontmatter completeness | 10 | Static analysis |
| Trigger phrase quality | 10 | Static analysis |
| Structure & headings | 10 | Static analysis |
| Code example quality | 15 | Static analysis |
| Clarity & conciseness | 15 | Static analysis |
| Real-world task coverage | 10 | Live test |
| Output correctness | 15 | Live test |
| Agent usability | 15 | Live test |
Static analysis = 60 pts. Live testing = 40 pts. A beautiful skill with poor runtime output scores low.
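The aggregation implied by the table can be sketched as a capped sum. A minimal sketch: the dimension names and maxima come from the table above, but the clamp-and-sum rule is an assumption, not the skill's published formula.

```python
# Dimension maxima from the scoring table; names are illustrative.
DIMENSIONS = {
    "yaml_frontmatter": 10,    # static
    "triggers": 10,            # static
    "structure": 10,           # static
    "code_examples": 15,       # static
    "clarity": 15,             # static
    "task_coverage": 10,       # live
    "output_correctness": 15,  # live
    "agent_usability": 15,     # live
}

def total_score(raw: dict) -> float:
    """Clamp each raw dimension score to its max, then sum to 0-100.

    Assumption: missing dimensions (e.g. no live tests) score 0.
    """
    return sum(min(raw.get(name, 0.0), cap) for name, cap in DIMENSIONS.items())
```

Note that the maxima sum to exactly 100, matching the 60/40 static/live split.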
An example ratchet sequence:

```
Round 1: baseline = 65
Round 2: proposal scores 75 → KEEP   (baseline = 75)
Round 3: proposal scores 71 → REVERT (baseline stays 75)
Round 4: proposal scores 82 → KEEP   (baseline = 82)
```
Implementation (what the agent does internally):

```bash
# Snapshot before each improvement attempt
git add skills/<name>/SKILL.md
git commit -m "darwin: pre-improvement snapshot (<name>)"

# Apply the targeted edit, then commit it as a candidate
git add skills/<name>/SKILL.md
git commit -m "darwin: improve <name> (candidate)"

# Re-score with an isolated sub-agent
NEW_SCORE=$(run_scoring_agent skills/<name>/SKILL.md)

if [ "$NEW_SCORE" -gt "$BASELINE_SCORE" ]; then
  DELTA=$((NEW_SCORE - BASELINE_SCORE))
  git commit --amend -m "darwin: improve <name> (+$DELTA pts → $NEW_SCORE)"
  echo "✅ Kept: $BASELINE_SCORE → $NEW_SCORE"
else
  git revert HEAD --no-edit   # undo the candidate commit
  echo "⏪ Reverted: $NEW_SCORE < $BASELINE_SCORE"
fi
```
The agent scans all skills and produces a ranked table:

```
Skill             Score    Weakest Dimension
──────────────────────────────────────────────
nuwa-skill        88/100   Code examples (11/15)
darwin-skill      76/100   Live test coverage (8/15)
my-custom-skill   54/100   Trigger phrases (4/10)
```
For each skill (lowest score first), the agent proposes an improvement, re-scores it, keeps or reverts, and asks you to confirm (y/n) before moving to the next skill.

## Darwin Optimization Report
| Skill | Before | After | Delta |
|-----------------|--------|-------|-------|
| nuwa-skill | 88 | 92 | +4 |
| darwin-skill | 76 | 83 | +7 |
| my-custom-skill | 54 | 61 | +7 |
Total improvement: +18 pts across 3 skills
Reverted attempts: 2
Darwin uses a test prompt file to validate live behavior. Place it at `~/.claude/skills/<name>/test-prompts.json`:

```json
{
  "skill": "my-custom-skill",
  "prompts": [
    {
      "id": "basic-usage",
      "input": "show me how to initialize this project",
      "expect_contains": ["npm install", "import", "config"],
      "expect_not_contains": ["TODO", "placeholder"]
    },
    {
      "id": "error-handling",
      "input": "how do I handle auth errors in this library",
      "expect_contains": ["try", "catch", "401"],
      "weight": 1.5
    }
  ]
}
```
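A checker for one test case might look like the sketch below. The field names (`expect_contains`, `expect_not_contains`, `weight`) mirror the JSON above; the all-or-nothing pass rule and the default weight of 1.0 are assumptions.

```python
def grade(response: str, case: dict) -> float:
    """Return the weighted score for one test case, 0.0 on any failure.

    Assumption: a case passes only if every expected substring is present
    and no forbidden substring appears.
    """
    ok = (all(s in response for s in case.get("expect_contains", []))
          and not any(s in response for s in case.get("expect_not_contains", [])))
    return case.get("weight", 1.0) if ok else 0.0
```

The live-test dimensions could then be filled from the sum of passed weights over the total weight.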
```
~/.claude/skills/
├── darwin-skill/
│   └── SKILL.md            ← this skill
├── nuwa-skill/
│   ├── SKILL.md            ← skill to optimize
│   └── test-prompts.json   ← optional live tests
└── my-other-skill/
    ├── SKILL.md
    └── test-prompts.json
```
**Weak trigger phrases → improved:**

```yaml
# Before
triggers:
  - use the tool
  - help me

# After
triggers:
  - initialize a new project with this library
  - configure authentication for my app
  - show me how to handle errors
  - debug connection timeout issues
```
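One way the static "trigger phrase quality" dimension could distinguish the before/after lists above is a specificity heuristic. The stop-word list and the saturation at four specific words are assumptions for illustration, not the skill's actual rubric.

```python
GENERIC = {"help", "me", "use", "the", "tool", "it", "this"}

def trigger_quality(triggers: list, cap: int = 10) -> float:
    """Score 0..cap: longer, specific phrases beat short, generic ones."""
    if not triggers:
        return 0.0
    per_trigger = []
    for phrase in triggers:
        words = phrase.lower().split()
        specific = [w for w in words if w not in GENERIC]
        per_trigger.append(min(len(specific), 4) / 4)  # saturate at 4 words
    return cap * sum(per_trigger) / len(per_trigger)
```

Under this heuristic the "before" triggers score near zero, while the "after" triggers score near the 10-point cap.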
**Missing code examples → added:**

Before:

> Use the `connect()` method to establish a connection.

After, with a runnable snippet:

> Use `connect()` with your credentials:
>
> ```typescript
> import { Client } from 'my-lib';
>
> const client = new Client({
>   url: process.env.SERVICE_URL,
>   token: process.env.SERVICE_TOKEN,
> });
>
> await client.connect();
> ```
**Vague troubleshooting → made actionable:**

```markdown
# Before
If something goes wrong, check your config.

# After
**"Connection refused" errors**
- Verify `SERVICE_URL` is set: `echo $SERVICE_URL`
- Check firewall allows port 443
- Test with: `curl -I $SERVICE_URL/health`
```
## Design principles

| Principle | What it means |
|---|---|
| Single editable asset | Only one SKILL.md changes per round — improvements are attributable |
| Dual evaluation | Static analysis (structure) + live testing (behavior) |
| Ratchet | Score can only increase; regressions auto-revert |
| Independent scoring | Sub-agent scores, not the same agent that wrote the change |
| Human in the loop | Pauses between skills; you confirm or skip |
Regressions are undone with `git revert`. Skills without a `test-prompts.json` score 0 on the live dimensions unless the agent can infer test cases from the skill content.

| autoresearch | darwin-skill |
|---|---|
| program.md (defines goal) | This SKILL.md |
| train.py (optimized asset) | Each target SKILL.md |
| val_bpb loss metric | 8-dimension weighted score (100 pts) |
| git ratchet | keep / revert per round |
| Test set | test-prompts.json |
| Fully autonomous | Human-in-loop (skill quality is subtler than loss) |
darwin-skill optimizes skills. nuwa-skill creates them from scratch.
```bash
# Create new skills with nuwa, then evolve them with darwin
npx skills add alchaincyf/nuwa-skill
npx skills add alchaincyf/darwin-skill
```

Workflow:

1. nuwa → generates the initial SKILL.md from a repo URL or description
2. darwin → runs the optimization loop and ratchets quality upward

## Troubleshooting

**"No skills found"**
- Ensure skills live at `~/.claude/skills/<name>/SKILL.md`, or point at a path explicitly: "optimize the skill at ./my-project/SKILL.md"

**Score not improving after many rounds**
- Try: "regenerate this skill from scratch with nuwa, then re-optimize"
- Ensure a `test-prompts.json` exists; live test scores (40 pts) are the largest lever

**Git revert fails**
- Make sure the skills directory is a git repo: `cd ~/.claude/skills && git init`

**Sub-agent scoring seems inconsistent**
- The ratchet requires strict (>) improvement, so ties revert safely