Use after reflect to classify learnings, update metrics, and propose agent evolution, or when the user says "evolve", "self-improve", or "classify learnings".
From harness
npx claudepluginhub donovan-yohan/belayer --plugin harness
Self-modification phase. Classifies learning scope, updates metrics from review results, proposes agent evolution based on evidence, and auto-applies safe changes. Run after /harness:reflect, before /harness:complete.
/harness:evolve # Run full evolution cycle
Requires a .harness/ runtime directory. Resolve it:
HARNESS_DIR=$(bash ${CLAUDE_PLUGIN_ROOT}/scripts/harness-resolve-dir.sh --repo-root .)
If HARNESS_DIR is empty, STOP and print:
No .harness/ runtime found. Run /harness:init and choose option 1 or 2 to enable self-improvement.
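As a minimal sketch of this guard, the empty check can be wrapped in a small function. The name `require_harness_dir` is hypothetical and not part of the plugin; only the message text comes from the step above.

```shell
# Hypothetical helper (not part of the harness plugin): fail when the
# resolved runtime directory is empty, printing the message from this step.
require_harness_dir() {
  if [ -z "$1" ]; then
    echo "No .harness/ runtime found. Run /harness:init and choose option 1 or 2 to enable self-improvement."
    return 1
  fi
}

# Usage, after resolving HARNESS_DIR as shown above:
#   require_harness_dir "$HARNESS_DIR" || exit 1
```

Keeping the check in a function lets the caller decide whether to exit or simply report.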
IMMEDIATELY execute this workflow:
Read docs/LEARNINGS.md. Filter to entries written in this session (match current date and current branch in source: and branch: fields).
For each learning from this session that does NOT already have a scope: field:
Classify as scope: repo or scope: universal using these rules:
Default to repo (conservative). Only promote to universal when ALL of:
Examples:
- …activity.RecordHeartbeat, wrap in recover()" → repo (references Temporal, a specific dependency)
- … → universal (general pattern)
- … → repo (references specific module)
- … → universal (general language pattern)

Write the scope tag into each learning entry as a new metadata line after category:
- scope: {repo|universal}
Report classification results:
Classified {N} learnings: {R} repo, {U} universal
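The tag-writing step can be sketched as a small filter. This assumes each learning entry carries its metadata as plain `- category: ...` lines, which is inferred from the `- scope:` format shown above and not confirmed for docs/LEARNINGS.md; `add_scope_tag` is a hypothetical name.

```shell
# Hypothetical helper: read ONE learning entry on stdin and print it with a
# "- scope: <value>" line inserted right after the first "- category:" line.
# Assumes metadata lines look like "- category: ..." (an assumption).
add_scope_tag() {
  awk -v scope="$1" '
    { print }
    /^- category:/ && !added { print "- scope: " scope; added = 1 }
  '
}

# Usage: printf '%s\n' '- category: testing' | add_scope_tag repo
```

The sketch does not check for an existing scope line, so the caller would still need to skip entries that already have one.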
Read $HARNESS_DIR/review-results.json (written by /harness:review). If the file doesn't exist, skip to Phase 3 — no review data to process.
Read $HARNESS_DIR/metrics/review-effectiveness.json. For each agent in review-results.json:
Count findings that were accepted (led to code changes) vs dismissed:
- findings = count of entries where accepted: true
- false_positives = count of entries where accepted: false
- unique_catches = count of entries where unique: true

Update metrics using the persistence script:
bash ${CLAUDE_PLUGIN_ROOT}/scripts/harness-write-metrics.sh \
--harness-dir "$HARNESS_DIR" \
--metric "review-effectiveness" \
--agent "{agent-name}" \
--findings {N} \
--false-pos {N} \
--unique {N}
If an active plan exists in docs/exec-plans/active/, update plan accuracy metrics:
- Count the - [x] and - [ ] items in the Progress section

bash ${CLAUDE_PLUGIN_ROOT}/scripts/harness-write-metrics.sh \
--harness-dir "$HARNESS_DIR" \
--metric "plan-accuracy" \
--plan-slug "{plan-slug}" \
--tasks-planned {N} \
--tasks-completed {N} \
--drift {N} \
--surprises {N}
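One way the checkbox counts for the plan-accuracy call might be gathered is sketched below. It assumes the Progress section is introduced by a markdown heading whose text starts with "Progress", and that "planned" means checked plus unchecked items; neither is confirmed by the text, and `count_progress` is a hypothetical name.

```shell
# Hypothetical helper: print "<planned> <completed>" for the Progress section
# of a plan file. Assumes a markdown heading such as "## Progress" opens the
# section and that the next heading closes it.
count_progress() {
  awk '
    /^#+ / { in_progress = ($0 ~ /^#+ +Progress/); next }
    in_progress && /^[[:space:]]*- \[[xX]\]/ { checked++ }
    in_progress && /^[[:space:]]*- \[ \]/    { unchecked++ }
    END { printf "%d %d\n", checked + unchecked, checked + 0 }
  ' "$1"
}

# Usage: set -- $(count_progress path/to/plan.md); planned=$1; completed=$2
```

The two numbers would then be passed as --tasks-planned and --tasks-completed.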
Update $HARNESS_DIR/metrics/learning-efficacy.json:
- recurrence_count
- prevented_count

bash ${CLAUDE_PLUGIN_ROOT}/scripts/harness-write-metrics.sh \
--harness-dir "$HARNESS_DIR" \
--metric "learning-efficacy" \
--learning-id "{learning-id}" \
--recurrence {recurrence-increment} \
--prevented {prevented-increment}
Read $HARNESS_DIR/metrics/review-effectiveness.json. Identify:
- docs/REVIEW_GUIDANCE.md escape log for entries from this session
- scope: universal

If no signals are found, skip to the Phase 4 report. Otherwise, for each signal:
<MANDATORY> You MUST use the Agent tool with `subagent_type: "harness:harness-evolver"` to generate proposals. The evolver agent has the semantic dedup check and line budget enforcement logic. Do NOT generate proposals inline; the evolver agent's methodology prevents bloated agents.

Example invocation:
Agent(
subagent_type="harness:harness-evolver",
prompt="Generate evolution proposals for these signals: {signal list}. Read agent definitions from $HARNESS_DIR/agents/. Read metrics from $HARNESS_DIR/metrics/. Output proposals in the structured Output Format — do NOT write files directly."
)
</MANDATORY>
For each proposal the evolver produces, write it to disk:
bash ${CLAUDE_PLUGIN_ROOT}/scripts/harness-write-proposal.sh \
--harness-dir "$HARNESS_DIR" \
--slug "{slug}" \
--scope "{repo|universal}" \
--signal "{signal-source}" \
--agent "{agent-name}" \
--current-file "{temp-file}" \
--proposed-file "{temp-file}" \
--reasoning-file "{temp-file}"
Read $HARNESS_DIR/config.yaml. Check evolve.auto_apply and evolve.min_runs_for_auto.
For each proposal from Phase 3, determine auto-apply eligibility. A proposal is auto-applied when ALL criteria are met:
- evolve.auto_apply is true in config.yaml
- evolve.min_runs_for_auto (default: 5)

Before applying any changes, snapshot the current metrics to enable rollback comparison:
cp "$HARNESS_DIR/metrics/review-effectiveness.json" \
"$HARNESS_DIR/metrics/pre-change-snapshot.json"
This snapshot is overwritten each evolve session. Phase 5 compares current metrics against it.
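The config-driven part of the auto-apply eligibility check can be sketched as follows. It covers only the two settings named in this phase, evolve.auto_apply and evolve.min_runs_for_auto; reading the threshold as a minimum number of observed runs is an assumption, the function name is hypothetical, and any further criteria would need to be added.

```shell
# Hypothetical helper covering only the two config-driven criteria.
# $1 = value of evolve.auto_apply ("true" or "false")
# $2 = observed run count (assumed meaning of the threshold)
# $3 = evolve.min_runs_for_auto (falls back to the documented default of 5)
is_auto_apply_eligible() {
  auto_apply=$1
  runs=$2
  min_runs=${3:-5}
  [ "$auto_apply" = "true" ] && [ "$runs" -ge "$min_runs" ]
}

# Usage: is_auto_apply_eligible true 6 && echo "eligible"
```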
For eligible proposals:
- Apply the proposed change to $HARNESS_DIR/agents/{agent}.md
- Update the proposal status from pending to applied
- Append an entry to $HARNESS_DIR/memory/IMPROVEMENTS.md:
### {YYYY-MM-DD}: {one-line description of change}
- **Agent:** {agent name}
- **Signal:** {signal source}
- **Change:** {what was added/modified}
- **Scope:** {repo|universal}
- **Auto-applied:** yes
- **Rollback:** none
{Reasoning from the proposal}
---
For non-eligible proposals: leave status as pending. These will be surfaced in the PR description by /harness:complete.
Check if $HARNESS_DIR/metrics/pre-change-snapshot.json exists (written in Phase 4 step 14). If it doesn't exist, skip this phase — no changes were applied to roll back.
Read $HARNESS_DIR/memory/IMPROVEMENTS.md. For each auto-applied change from the current session (proposals applied in Phase 4 above):
Compare $HARNESS_DIR/metrics/review-effectiveness.json against pre-change-snapshot.json:
- **Rollback:** rolled-back-{YYYY-MM-DD}
- rolled-back

This phase only runs when there are at least 2 post-change review runs to compare against the snapshot. Skip if insufficient data. Note: rollback evaluation is scoped to the current session's proposals because pre-change-snapshot.json is overwritten each evolve run.
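The two skip conditions for this phase, a missing snapshot and fewer than 2 post-change review runs, can be combined into one check. This is a sketch only: `should_evaluate_rollback` is a hypothetical name, and how the post-change run count is obtained is left to the caller.

```shell
# Hypothetical helper: succeed only when rollback evaluation should run.
# $1 = harness runtime directory
# $2 = number of post-change review runs (supplied by the caller)
should_evaluate_rollback() {
  snapshot="$1/metrics/pre-change-snapshot.json"
  post_runs=$2
  [ -f "$snapshot" ] && [ "$post_runs" -ge 2 ]
}

# Usage: should_evaluate_rollback "$HARNESS_DIR" "$runs" || echo "skipped (insufficient data)"
```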
bash ${CLAUDE_PLUGIN_ROOT}/scripts/harness-write-run.sh \
--harness-dir "$HARNESS_DIR" \
--phase "evolve" \
--branch "$(git branch --show-current)"
## Evolve Complete
**Runtime:** {$HARNESS_DIR}
### Scope Classification
- Learnings classified: {N} ({R} repo, {U} universal)
### Metrics Updated
- Review effectiveness: {N} agents updated
- Plan accuracy: {updated | no active plan}
- Learning efficacy: {N} learnings tracked
### Evolution Proposals
- Proposals generated: {N}
- Auto-applied: {N} (signals: {list})
- Pending review: {N}
### Auto-Rollback
- Rollbacks: {N | none | skipped (insufficient data)}
## Next Step
Run `/harness:complete` to archive the plan and create the PR (proposals will be listed in the PR description).