By Jinsong-Zhou
Multi-agent harness framework for long-running application development. Implements generator-evaluator architecture with planner, generator, and evaluator agents for autonomous multi-hour coding sessions.
npx claudepluginhub jinsong-zhou/cc-harness --plugin cc-harness
Trigger a standalone evaluation of the current application using the skeptical evaluator agent
Show current harness session status — iteration count, scores, feature progress, and session metrics
Start a harnessed development session with planner → generator → evaluator loop for autonomous multi-hour coding
Use PROACTIVELY after generator completes a feature. Skeptical QA agent that tests the live application via Playwright/browser, grades against sprint contracts and quality criteria. Deliberately resistant to approving mediocre work.
Use when building features from a harness spec. Implements features iteratively one at a time, writes sprint contracts, self-evaluates, commits to git, and hands off to the evaluator via files.
Use PROACTIVELY when starting a harnessed build. Converts brief 1-4 sentence prompts into ambitious, detailed product specs. Focuses on product context and high-level technical design, not implementation details.
Use when running long coding sessions that approach context limits - provides strategies for compaction vs context resets, structured handoffs, and managing context anxiety across model versions
Use when building complex applications autonomously - orchestrates a generator-evaluator iteration loop with planner, generator, and evaluator agents for long-running multi-hour coding sessions
Use when optimizing or simplifying a multi-agent harness - guides systematic removal of components as models improve, evaluator calibration, and prompt engineering for grading criteria
Uses power tools
Uses Bash, Write, or Edit tools
Multi-agent harness framework for long-running application development with Claude Code.
Based on Anthropic's research: Harness Design for Long-Running Application Development
Single-agent approaches to complex coding tasks produce superficially impressive but often broken results. Two persistent problems:
This plugin implements a generator-evaluator architecture (inspired by GANs) that drives real quality through iterative feedback loops.
User Prompt (1-4 sentences)
       │
       ▼
┌──────────────┐
│   PLANNER    │──→ SPEC.md (ambitious product spec)
│ (read-only)  │
└──────────────┘
       │
       ▼
┌──────────────┐         file-based          ┌──────────────┐
│  GENERATOR   │◄──────communication────────►│  EVALUATOR   │
│   (builds)   │                             │   (tests)    │
└──────────────┘                             └──────────────┘
       │                                            │
   git commit                                 tests live app
       ▲                                      via Playwright/
       └────────iterate if FAIL──────────────browser tools
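The iterate-if-FAIL loop in the diagram can be sketched in a few lines of Python. This is only an illustration of the control flow, not the plugin's implementation: the real agents are Claude Code subagents, and every name here (`run_loop`, `stub_generate`, `stub_evaluate`, `verdict.json`, `build.txt`, the `max_iterations` budget) is a hypothetical stand-in chosen for the example.

```python
import json
import tempfile
from pathlib import Path
from typing import Callable


def run_loop(
    workdir: Path,
    generate: Callable[[Path, str], None],
    evaluate: Callable[[Path], dict],
    max_iterations: int = 5,
) -> dict:
    """Alternate generate -> evaluate until the evaluator passes or the budget runs out."""
    if max_iterations < 1:
        raise ValueError("max_iterations must be at least 1")
    feedback = ""
    for iteration in range(1, max_iterations + 1):
        generate(workdir, feedback)   # generator builds, taking prior feedback into account
        verdict = evaluate(workdir)   # evaluator inspects the result and returns a verdict
        # Persist the verdict on disk so the hand-off between agents is file-based.
        record = {"iteration": iteration, **verdict}
        (workdir / "verdict.json").write_text(json.dumps(record))
        if verdict["status"] == "PASS":
            break
        feedback = verdict["feedback"]
    return {"iterations": iteration, **verdict}


# Hypothetical stand-ins for the two agents: the "build" only passes on the third attempt.
def stub_generate(workdir: Path, feedback: str) -> None:
    build = workdir / "build.txt"
    attempts = int(build.read_text()) if build.exists() else 0
    build.write_text(str(attempts + 1))


def stub_evaluate(workdir: Path) -> dict:
    attempts = int((workdir / "build.txt").read_text())
    if attempts >= 3:
        return {"status": "PASS", "feedback": ""}
    return {"status": "FAIL", "feedback": f"attempt {attempts} did not meet the contract"}


with tempfile.TemporaryDirectory() as tmp:
    result = run_loop(Path(tmp), stub_generate, stub_evaluate)
    print(result)
```

The stub evaluator rejects the first two attempts, so the loop stops on the third iteration. Capping the iteration count is an assumption of this sketch; it simply keeps a stubborn FAIL from looping forever.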
# Add the marketplace
/plugin marketplace add Jinsong-Zhou/cc-harness
# Install the plugin
/plugin install cc-harness@cc-harness-marketplace
Then restart Claude Code.
# Start a harnessed development session
/harness Build a browser-based DAW using the Web Audio API
# Evaluate current work at any point
/evaluate
/evaluate the login flow
/evaluate frontend design
# Check session progress
/harness-status
cc-harness/
├── .claude-plugin/
│   ├── plugin.json  # Plugin identity & metadata
│   └── marketplace.json  # Marketplace distribution config
│
├── agents/
│   ├── planner.md  # Expands prompts → ambitious product specs
│   ├── generator.md  # Builds features one at a time, commits to git
│   └── evaluator.md  # Skeptical QA: tests live apps, grades against criteria
│
├── skills/
│   ├── harness-loop/
│   │   ├── SKILL.md  # Core generator-evaluator iteration loop orchestration
│   │   └── references/
│   │       ├── sprint-contract-examples.md  # Worked examples of sprint contracts
│   │       └── evaluation-examples.md  # Calibrated QA feedback examples from real runs
│   ├── context-management/
│   │   └── SKILL.md  # Compaction vs reset strategies for long sessions
│   └── harness-tuning/
│       ├── SKILL.md  # Evaluator calibration & harness simplification
│       └── references/
│           └── audit-template.md  # Template for auditing harness component necessity
│
├── commands/
│   ├── harness.md  # /harness: start a harnessed development session
│   ├── evaluate.md  # /evaluate: trigger standalone QA evaluation
│   └── harness-status.md  # /harness-status: show session progress & scores
│
├── rules/
│   └── common/
│       ├── harness-workflow.md  # Mandatory pipeline: plan → contract → build → evaluate
│       ├── evaluator-discipline.md  # Evaluator mindset rules: skepticism over politeness
│       ├── context-strategy.md  # When to compact vs reset, model-specific guidance
│       └── file-communication.md  # File-based agent communication protocol
│
├── hooks/
│   └── hooks.json  # Lifecycle hooks: SessionStart, PreCompact, Stop
│
├── scripts/
│   └── hooks/
│       ├── run-with-flags.js  # Profile-based hook runner (minimal/standard/strict)
│       └── track-iteration.js  # Iteration counter, state persistence, session summaries
│
├── examples/
│   └── todo-app-harness/  # Complete worked example of a harness session
│       ├── SPEC.md  # Example planner output