Plugin

harness

Name: harness
Author: donovan-yohan

Self-improving documentation system with agent evolution, metrics-driven self-modification, and stateless command protocol. Brainstorm, plan, orchestrate, review, reflect, evolve, complete.

npx claudepluginhub donovan-yohan/chalk-bag --plugin harness

Component Overview

Commands

Agents

Skills

Component Details

Commands (13)

Batch

/batch

Use when executing a living plan via Claude Code's /batch tool for worktree-isolated parallel execution, or when user says "batch", "batch execute", "run batch", "parallel batch", "run the plan in parallel", "execute without checkpoints", "hands-off execution", or "use batch mode"

Brainstorm

/brainstorm

Use when starting new feature work, creative design, or when user says "brainstorm", "design a feature", or "let's think about"

Bug

/bug

Use when investigating a bug, diagnosing an error, or when user says "debug", "fix bug", "investigate issue", or "root cause"

Complete

/complete

Use when ready to archive the plan and create the PR, when user says "we're done", "complete this", "wrap up", or after /harness:reflect finishes

Evolve

/evolve

Use after reflect to classify learnings, update metrics, and propose agent evolution, or when user says "evolve", "self-improve", "classify learnings"

Init

/init

Use when initializing structured documentation for a repository, when CLAUDE.md exceeds 120 lines, or when user says "set up docs" or "initialize harness"

Orchestrate

/orchestrate

Use when executing a living plan with agent teams, or when user says "orchestrate", "execute the plan", "start building", or "run the plan"

Plan

/plan

Use when creating an implementation plan from a design doc, bug analysis, or refactor scope, or when user says "create a plan", "plan this", or "write the plan"

Prune

/prune

Use when auditing docs for staleness, broken links, or bloat. Also use when user says "docs feel stale", "prune docs", or when CLAUDE.md exceeds 120 lines.

Refactor Status

/refactor-status

Use when resuming an in-progress refactoring, checking goal progress, or when user says "refactor status", "where are we", or "continue refactoring"

Refactor

/refactor

Use when planning an incremental refactoring, extracting responsibilities from a large class, or when user says "refactor", "extract", "strangler fig", or "decompose"

Reflect

/reflect

Use when capturing learnings and updating docs after review, when user says "reflect", "retrospective", "update docs", or after /harness:review completes

Review

/review

Use when implementation is done and code needs quality review, when user says "review the code", "check the code", or after /harness:orchestrate completes all tasks

Agents (3)

harness-evolver

/harness-evolver

Use when proposing modifications to agent definitions based on review escapes, metric anomalies, or universal learnings — invoked by /harness:evolve Phase 3

harness-pruner

/harness-pruner

Use when auditing documentation health, finding stale or orphaned guides, checking CLAUDE.md bloat, or when /harness:prune is invoked

learnings-reviewer

/learnings-reviewer

Use during /harness:review Phase 4 to check code changes against active learnings from docs/LEARNINGS.md for violations

Skills (1)

strangler-fig

/strangler-fig

Use when breaking up a god class, extracting responsibilities from an oversized module, incrementally replacing legacy code, or when old and new must coexist during a refactor — symptoms include classes over 500 lines, modules with mixed concerns, or "we can't rewrite this all at once"

README

chalk-bag

Reusable agent skills for structured software development workflows. Two plugins:

harness — Documentation lifecycle management with self-improving review agents
pr — Pull request lifecycle automation

Plugins

harness

A 3-tier documentation system with living execution plans, adversarial code review, conversation mining, and self-improving agent evolution.

Workflow: brainstorm -> plan -> orchestrate -> review -> reflect -> evolve -> complete

Command	Purpose
`/harness:init`	Initialize 3-tier documentation structure
`/harness:brainstorm`	Design through collaborative dialogue
`/harness:bug`	Systematic bug investigation with architecture review
`/harness:refactor`	Scope incremental refactoring with strangler fig patterns
`/harness:plan`	Create living execution plans from design docs
`/harness:orchestrate`	Execute plans with agent teams and micro-reflects
`/harness:batch`	Execute plans via worktree-isolated parallel batch
`/harness:review`	Multi-agent code review with adversarial production review
`/harness:evolve`	Classify learnings, update metrics, propose agent evolution
`/harness:reflect`	Full reflection, conversation mining, retrospective
`/harness:complete`	Archive plan, prune check, and create PR
`/harness:prune`	Audit docs for staleness, broken links, bloat

Agents: harness-pruner, learnings-reviewer, harness-evolver

Skills: strangler-fig (incremental refactoring patterns)

pr

Pull request lifecycle management with multi-perspective automated review.

Command	Purpose
`/pr:author`	Create PRs with quality gates
`/pr:automate`	Full automated lifecycle: author -> review -> resolve -> merge
`/pr:review`	Multi-agent PR review (6 specialized agents)
`/pr:resolve`	Analyze and address PR review comments
`/pr:update`	Sync PR description with current changes

Installation

Claude Code

Install as a marketplace. From any project directory:

claude /plugins add https://github.com/donovan-yohan/chalk-bag

This registers both plugins. You can also install individual plugins:

# Install just the harness plugin
claude /plugins add https://github.com/donovan-yohan/chalk-bag/plugins/harness

# Install just the pr plugin
claude /plugins add https://github.com/donovan-yohan/chalk-bag/plugins/pr

Hermes

Hermes can load these skills through its plugin and profile system. See docs/hermes.md for integration instructions.

Other Agents

The command and skill files are standard markdown with YAML frontmatter. Any agent that can load markdown-based instructions can use these skills directly. The key integration points:

Commands (plugins/*/commands/*.md) — Procedural workflows triggered by explicit user commands
Agents (plugins/*/agents/*.md) — System prompts for specialized background agents
Skills (plugins/*/skills/*/SKILL.md) — Pattern-triggered capabilities with frontmatter descriptions
Scripts (plugins/harness/scripts/*.sh) — Shell scripts for persistence and metrics
References (plugins/harness/references/*.md) — Shared reference documents loaded by commands

Dependencies

The harness plugin works standalone. Some commands optionally integrate with:

superpowers — Used by brainstorm, plan, orchestrate, review, and complete commands for their core methodologies (brainstorming, writing-plans, subagent-driven-development, etc.)
pr-review-toolkit — Used by the review command for specialized review agents (code-reviewer, silent-failure-hunter, type-design-analyzer)

The pr plugin requires pr-review-toolkit for the review and automate commands.

Documentation Tiers

The harness plugin manages a 3-tier documentation system:

CLAUDE.md (60-120 lines) — Map with Documentation Map table
docs/*.md — Domain summaries (ARCHITECTURE, DESIGN, PLANS, QUALITY)
docs/design-docs/, docs/exec-plans/ — Deep docs, versioned plans

Self-Improvement

The harness plugin tracks review effectiveness across sessions:

Metrics — Review agent accuracy, plan prediction quality, learning efficacy
Evolution — Automatic agent definition updates based on review escapes
Adversarial review — Context-isolated production failure analysis using claude -p
Learnings — Persistent project knowledge captured and enforced across sessions

License

MIT

Similar Plugins

vibeworks-library

Complete developer workflow toolkit. Includes 34 reference skills, 34 specialized agents, and 21 slash commands covering TDD, debugging, code review, architecture, documentation, refactoring, security, testing, git workflows, API design, performance, UI/UX design, plugin development, and incident response. Full SDLC coverage with MCP integrations.

Stats

Version5.0.1

Parent Repo Stars0

MaintenanceGood

AddedApr 16, 2026

Actions

View on GitHub View README Plugin Marketplace JSON

Available In

chalk-bag

Safety Signals

Caution

Uses power tools

Uses Bash, Write, or Edit tools

chalk-bag

Reusable agent skills for structured software development workflows. Two plugins:

harness — Documentation lifecycle management with self-improving review agents
pr — Pull request lifecycle automation

Plugins

harness

A 3-tier documentation system with living execution plans, adversarial code review, conversation mining, and self-improving agent evolution.

Workflow: brainstorm -> plan -> orchestrate -> review -> reflect -> evolve -> complete

Command	Purpose
`/harness:init`	Initialize 3-tier documentation structure
`/harness:brainstorm`	Design through collaborative dialogue
`/harness:bug`	Systematic bug investigation with architecture review
`/harness:refactor`	Scope incremental refactoring with strangler fig patterns
`/harness:plan`	Create living execution plans from design docs
`/harness:orchestrate`	Execute plans with agent teams and micro-reflects
`/harness:batch`	Execute plans via worktree-isolated parallel batch
`/harness:review`	Multi-agent code review with adversarial production review
`/harness:evolve`	Classify learnings, update metrics, propose agent evolution
`/harness:reflect`	Full reflection, conversation mining, retrospective
`/harness:complete`	Archive plan, prune check, and create PR
`/harness:prune`	Audit docs for staleness, broken links, bloat

Agents: harness-pruner, learnings-reviewer, harness-evolver

Skills: strangler-fig (incremental refactoring patterns)

pr

Pull request lifecycle management with multi-perspective automated review.

Command	Purpose
`/pr:author`	Create PRs with quality gates
`/pr:automate`	Full automated lifecycle: author -> review -> resolve -> merge
`/pr:review`	Multi-agent PR review (6 specialized agents)
`/pr:resolve`	Analyze and address PR review comments
`/pr:update`	Sync PR description with current changes

Installation

Claude Code

Install as a marketplace. From any project directory:

claude /plugins add https://github.com/donovan-yohan/chalk-bag

This registers both plugins. You can also install individual plugins:

# Install just the harness plugin
claude /plugins add https://github.com/donovan-yohan/chalk-bag/plugins/harness

# Install just the pr plugin
claude /plugins add https://github.com/donovan-yohan/chalk-bag/plugins/pr

Hermes

Hermes can load these skills through its plugin and profile system. See docs/hermes.md for integration instructions.

Other Agents

The command and skill files are standard markdown with YAML frontmatter. Any agent that can load markdown-based instructions can use these skills directly. The key integration points:

Commands (plugins/*/commands/*.md) — Procedural workflows triggered by explicit user commands
Agents (plugins/*/agents/*.md) — System prompts for specialized background agents
Skills (plugins/*/skills/*/SKILL.md) — Pattern-triggered capabilities with frontmatter descriptions
Scripts (plugins/harness/scripts/*.sh) — Shell scripts for persistence and metrics
References (plugins/harness/references/*.md) — Shared reference documents loaded by commands

Dependencies

The harness plugin works standalone. Some commands optionally integrate with:

superpowers — Used by brainstorm, plan, orchestrate, review, and complete commands for their core methodologies (brainstorming, writing-plans, subagent-driven-development, etc.)
pr-review-toolkit — Used by the review command for specialized review agents (code-reviewer, silent-failure-hunter, type-design-analyzer)

The pr plugin requires pr-review-toolkit for the review and automate commands.

Documentation Tiers

The harness plugin manages a 3-tier documentation system:

CLAUDE.md (60-120 lines) — Map with Documentation Map table
docs/*.md — Domain summaries (ARCHITECTURE, DESIGN, PLANS, QUALITY)
docs/design-docs/, docs/exec-plans/ — Deep docs, versioned plans

Self-Improvement

The harness plugin tracks review effectiveness across sessions:

Metrics — Review agent accuracy, plan prediction quality, learning efficacy
Evolution — Automatic agent definition updates based on review escapes
Adversarial review — Context-isolated production failure analysis using claude -p
Learnings — Persistent project knowledge captured and enforced across sessions

License

MIT

harness

Component Overview

Component Details

Commands (13)

Agents (3)

Skills (1)

README

chalk-bag

Plugins

harness

pr

Installation

Claude Code

Hermes

Other Agents

Dependencies

Documentation Tiers

Self-Improvement

License

Similar Plugins

vibeworks-library

harness

Component Overview

Component Details

Commands (13)

Agents (3)

Skills (1)

README

chalk-bag

Plugins

harness

pr

Installation

Claude Code

Hermes

Other Agents

Dependencies

Documentation Tiers

Self-Improvement

License

Similar Plugins

vibeworks-library

context7-plugin

deep-wiki

claude-obsidian

c4-architecture

llm-wiki