Name: cc-harness
Author: jinsong-zhou

cc-harness

Multi-agent harness framework for long-running application development with Claude Code.

Based on Anthropic's research: Harness Design for Long-Running Application Development

Why This Exists

Single-agent approaches to complex coding tasks produce superficially impressive but often broken results. Two persistent problems:

Self-evaluation bias — Models confidently praise their own mediocre work. Separating creation from evaluation is the strongest lever for quality.
Context deterioration — Models lose coherence over long sessions. Structured handoffs and context management keep multi-hour builds on track.

This plugin implements a generator-evaluator architecture (inspired by GANs): one agent builds, a separate agent tests and grades the result, and the loop repeats until the evaluation passes.

Architecture

User Prompt (1-4 sentences)
        │
        ▼
┌──────────────┐
│   PLANNER    │──→ SPEC.md (ambitious product spec)
│  (read-only) │
└──────────────┘
        │
        ▼
┌──────────────┐         file-based          ┌──────────────┐
│  GENERATOR   │◄──────communication────────►│  EVALUATOR   │
│  (builds)    │                              │  (tests)     │
└──────────────┘                              └──────────────┘
        │                                            │
   git commit                                  tests live app
        ▲                                     via Playwright/
        └────────iterate if FAIL──────────────browser tools
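
The loop in the diagram can be sketched in a few lines of Python. This is an illustrative sketch only, not the plugin's implementation: the real agents are Claude Code subagents defined in markdown, and the names `run_harness_loop`, `generate`, `evaluate`, `feedback.json`, and the `verdict`/`notes` fields are all assumptions made for this example. It shows the two ideas the diagram carries: the evaluator's report is passed back through a file, and the generator runs again only when the verdict is not PASS.

```python
import json
from pathlib import Path
from typing import Callable


def run_harness_loop(
    workdir: Path,
    generate: Callable[[str, str], None],
    evaluate: Callable[[str], dict],
    max_iterations: int = 5,
) -> dict:
    """Run generator -> evaluator rounds until the evaluator reports PASS.

    `generate(spec, feedback)` and `evaluate(spec)` are hypothetical stand-ins
    for the generator and evaluator agents.
    """
    spec = (workdir / "SPEC.md").read_text(encoding="utf-8")
    feedback_path = workdir / "feedback.json"  # hypothetical file name
    feedback = ""
    report = {"verdict": "FAIL", "notes": "no iterations ran", "iterations": 0}

    for iteration in range(1, max_iterations + 1):
        generate(spec, feedback)  # build (or fix) against the spec
        report = dict(evaluate(spec))  # independent grading of the result
        report["iterations"] = iteration

        # File-based communication: the report is written to disk and the
        # next round's feedback is read back from that file.
        feedback_path.write_text(json.dumps(report), encoding="utf-8")
        if report.get("verdict") == "PASS":
            break
        stored = json.loads(feedback_path.read_text(encoding="utf-8"))
        feedback = stored.get("notes", "")

    return report
```

Capping the loop with `max_iterations` is a design choice of this sketch; it keeps a build that never passes from iterating forever.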

Quick Start

Installation

# Add the marketplace
/plugin marketplace add Jinsong-Zhou/cc-harness

# Install the plugin
/plugin install cc-harness@cc-harness-marketplace

Then restart Claude Code.

Usage

# Start a harnessed development session
/harness Build a browser-based DAW using the Web Audio API

# Evaluate current work at any point
/evaluate
/evaluate the login flow
/evaluate frontend design

# Check session progress
/harness-status

What's Inside

cc-harness/
├── .claude-plugin/
│   ├── plugin.json                              # Plugin identity & metadata
│   └── marketplace.json                         # Marketplace distribution config
│
├── agents/
│   ├── planner.md                               # Expands prompts → ambitious product specs
│   ├── generator.md                             # Builds features one at a time, commits to git
│   └── evaluator.md                             # Skeptical QA — tests live apps, grades against criteria
│
├── skills/
│   ├── harness-loop/
│   │   ├── SKILL.md                             # Core generator-evaluator iteration loop orchestration
│   │   └── references/
│   │       ├── sprint-contract-examples.md      # Worked examples of sprint contracts
│   │       └── evaluation-examples.md           # Calibrated QA feedback examples from real runs
│   ├── context-management/
│   │   └── SKILL.md                             # Compaction vs reset strategies for long sessions
│   └── harness-tuning/
│       ├── SKILL.md                             # Evaluator calibration & harness simplification
│       └── references/
│           └── audit-template.md                # Template for auditing harness component necessity
│
├── commands/
│   ├── harness.md                               # /harness — start a harnessed development session
│   ├── evaluate.md                              # /evaluate — trigger standalone QA evaluation
│   └── harness-status.md                        # /harness-status — show session progress & scores
│
├── rules/
│   └── common/
│       ├── harness-workflow.md                  # Mandatory pipeline: plan → contract → build → evaluate
│       ├── evaluator-discipline.md              # Evaluator mindset rules — skepticism over politeness
│       ├── context-strategy.md                  # When to compact vs reset, model-specific guidance
│       └── file-communication.md                # File-based agent communication protocol
│
├── hooks/
│   └── hooks.json                               # Lifecycle hooks: SessionStart, PreCompact, Stop
│
├── scripts/
│   └── hooks/
│       ├── run-with-flags.js                    # Profile-based hook runner (minimal/standard/strict)
│       └── track-iteration.js                   # Iteration counter, state persistence, session summaries
│
├── examples/
│   └── todo-app-harness/                        # Complete worked example of a harness session
│       ├── SPEC.md                              # Example planner output
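
The tree describes `track-iteration.js` as an iteration counter with state persistence and session summaries. The Python sketch below illustrates that general idea only. It is not a port of the actual script (which is JavaScript and is not shown here); the state file layout and the names `record_iteration` and `summarize` are assumptions made for this example.

```python
import json
from pathlib import Path


def record_iteration(state_path: Path, verdict: str) -> dict:
    """Increment the persisted iteration counter and append the verdict."""
    if state_path.exists():
        state = json.loads(state_path.read_text(encoding="utf-8"))
    else:
        # Hypothetical state layout: a counter plus the verdict history.
        state = {"iteration": 0, "verdicts": []}
    state["iteration"] += 1
    state["verdicts"].append(verdict)
    state_path.write_text(json.dumps(state, indent=2), encoding="utf-8")
    return state


def summarize(state: dict) -> str:
    """Return a one-line session summary from the persisted state."""
    passed = state["verdicts"].count("PASS")
    return f"{state['iteration']} iterations, {passed} passed"
```

Because the state lives in a file rather than in memory, the count survives a context reset or a restarted session, which is the property a long-running harness needs.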
