Name: harness
Author: toolsbbb

BELCORT Planner → Generator → Evaluator pipeline for Claude Code. Adapted from Anthropic's harness engineering research. Opinionated, file-based, git-tracked.

npx claudepluginhub toolsbbb/belcort-harness --plugin harness

Popularity

Stars: Med 0 · Avg 284
Installs: Med 0 · Avg 1
Forks: Med 0 · Avg 36

Health & Quality

Maintenance: Fair, 5.0/10 (Med 7/10 · Avg 7.4/10)
Community: 14% (Med 42% · Avg 42.1%)

What's Inside

Slash Commands (11)

`/harness:analyze` — Cross-artifact consistency check

Cross-artifact consistency check (SpecKit-inspired) — verifies PRD coverage, NFR alignment, constitution compliance across spec and feature contracts. CRITICAL findings halt the pipeline; warnings pass through. Writes analysis-report.md.
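The halt-on-CRITICAL rule can be sketched as a small gate. This is a minimal illustration, not the plugin's code: the `Finding` type, the `gate` and `render_report` helpers, the `WARNING` label, and the report layout are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Finding:
    severity: str  # "CRITICAL" halts; any other label (e.g. "WARNING") passes through
    message: str


def gate(findings: list[Finding]) -> bool:
    """Return True if the pipeline may continue, False if it must halt."""
    return not any(f.severity == "CRITICAL" for f in findings)


def render_report(findings: list[Finding]) -> str:
    """Build a markdown report body (hypothetical layout for analysis-report.md)."""
    lines = ["# Analysis report", ""]
    for f in findings:
        lines.append(f"- [{f.severity}] {f.message}")
    lines.append("")
    lines.append("Result: " + ("PASS" if gate(findings) else "HALT"))
    return "\n".join(lines)
```

A single CRITICAL finding is enough to stop the run, no matter how many warnings accompany it.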

`/harness:audit` — Verification debt check

Verification debt scan (GSD-inspired) — finds deferred issues, stale known-issues, silent skips, TODO/FIXME without owners across the harness state.
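One part of such a scan, finding TODO/FIXME markers that have no owner, might look like the sketch below. The ownership convention (`TODO(alice)`) and the `unowned_markers` helper are assumptions for illustration; the plugin's actual rules are not shown on this page.

```python
import re

# Assumption: an "owned" marker names its owner in parentheses, e.g. TODO(alice).
MARKER = re.compile(r"\b(TODO|FIXME)\b(\(([^)]*)\))?")


def unowned_markers(text: str) -> list[tuple[int, str]]:
    """Return (line number, marker) pairs for TODO/FIXME markers with no owner."""
    hits = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for match in MARKER.finditer(line):
            owner = (match.group(3) or "").strip()
            if not owner:
                hits.append((lineno, match.group(1)))
    return hits
```

An empty owner such as `TODO()` is treated the same as a bare `TODO`.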

`/harness:edit` — Targeted spec edit

Targeted modification of an existing spec file with downstream reference updates (BMAD tri-modal). Example — change DB stack, edit propagates to architecture, init.sh, and NFRs.

`/harness:negotiate` — Generator ↔ Evaluator contract negotiation

Generator ↔ Evaluator contract negotiation round. Generator proposes HOW, Evaluator reviews, iterate up to 3 rounds before code is written. Bridges the Planner's "what/why" to a testable "how".
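The bounded propose/review loop described above can be sketched as follows. The `negotiate` function and its callback signatures are hypothetical; in the plugin the two sides are subagents exchanging files, which this toy version replaces with plain functions.

```python
from typing import Callable, Optional

MAX_ROUNDS = 3  # from the command description: "iterate up to 3 rounds"


def negotiate(
    propose: Callable[[Optional[str]], str],
    review: Callable[[str], Optional[str]],
) -> tuple[Optional[str], int]:
    """Return (agreed contract or None, rounds used).

    propose(feedback) returns a contract proposal; feedback is None on round 1.
    review(proposal) returns None to accept, or a feedback string to reject.
    """
    feedback: Optional[str] = None
    for round_no in range(1, MAX_ROUNDS + 1):
        proposal = propose(feedback)
        feedback = review(proposal)
        if feedback is None:
            return proposal, round_no
    return None, MAX_ROUNDS
```

Returning `None` after the last round leaves it to the caller to decide what happens when no contract is agreed; the page does not say how the plugin handles that case.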

`/harness:quick` — Fast Mode

Fast pipeline — skip Planner, use minimal contract, single build + QA pass. Use for small well-defined tasks (<30 min).

Agents (3)

Skills (1)

harness

/harness

BELCORT Planner → Generator → Evaluator pipeline. Invoke when the user runs a /harness:* slash command, when .harness/manifest.yaml is present, or when the user describes a substantial build task (3+ components, >15 minutes of work) and has not yet activated the harness. The procedure for each command lives in commands/*.md — this skill is the shared context: activation rules, agent communication protocol, subagent isolation, and TDD contract.
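The three activation conditions in the skill description can be read as a simple predicate. This is a sketch only: `should_activate` and its parameters are hypothetical, and in practice the "substantial task" judgment is made by the model from the user's description rather than from numeric inputs.

```python
from pathlib import Path


def should_activate(
    project_dir: str,
    command: str = "",
    components: int = 0,
    est_minutes: float = 0,
) -> bool:
    """Mirror the stated rules: slash command, manifest present, or substantial task."""
    if command.startswith("/harness:"):
        return True
    if (Path(project_dir) / ".harness" / "manifest.yaml").is_file():
        return True
    # "3+ components, >15 minutes of work"
    return components >= 3 and est_minutes > 15
```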

Hooks (1)

Event Hooks

Bash

2 hooks across 2 events

MCP Servers (2)

context7

playwright

The plugin manifest points to a different repository than the source indexed by ClaudePluginHub.

Stats

Version: 1.2.0

Stars: 0

Maintenance: Fair

License: MIT

Last Commit: Apr 18, 2026

Added: May 10, 2026

Available In

belcort-harness

Safety Signals

Caution

Executes bash commands

Hook triggers when Bash tool is used

Uses power tools

Uses Bash, Write, or Edit tools

README

BELCORT Harness

An opinionated harness for Claude Code that implements a Planner → Generator → Evaluator pipeline, inspired by Anthropic's published research on long-running agent harness design.

Built for Claude Opus 4.6+. Tuned for TypeScript/Node.js full-stack projects but adaptable.

What this is

A set of Skills, agent prompts, and hooks that plug into Claude Code to enable autonomous multi-agent software development. You give Claude a 1–4 sentence prompt and the harness orchestrates planning, contract negotiation, test-driven implementation, adversarial QA, and retrospective drift analysis — all file-based, all auditable via git.

This is NOT a framework or a library. It's a set of markdown files that shape how Claude Code behaves when working on substantial projects.

Origin

Based on Anthropic Labs' engineering blog post "Harness design for long-running application development" (Rajasekaran, 2026) and its predecessor, "Effective harnesses for long-running agents".

See docs/anthropic-alignment.md for a point-by-point mapping between design decisions in this harness and the source material.

Core design decisions (and their Anthropic-article basis)

| Decision | Source |
| --- | --- |
| Three agents (Planner / Generator / Evaluator) as separate subagents | GAN-inspired architecture described in the Anthropic post |
| Evaluator MUST have separate context from Generator | "Separating the agent doing the work from the agent judging it proves to be a strong lever" |
| Planner outputs high-level direction only, NOT file paths or components | "stay focused on product context and high level technical design rather than detailed technical implementation" |
| Generator and Evaluator negotiate a sprint contract BEFORE any code is written | "Before each sprint, the generator and evaluator negotiated a sprint contract... before any code was written" |
| File-based agent communication | "Communication was handled via files: one agent would write a file, another agent would read it..." |
| Evaluator grades against 4 hard-threshold criteria | "Each criterion had a hard threshold, and if any one fell below it, the sprint failed" |
| Few-shot calibration examples for Evaluator scoring | "I calibrated the evaluator using few-shot examples with detailed score breakdowns" |
| Tuning loop: capture human-Evaluator divergence, refine over time | "The tuning loop was to read the evaluator's logs, find examples where its judgment diverged from mine..." |
| Criteria weighting emphasizes model's weak dimensions | "by weighting design and originality more heavily it pushed the model toward more aesthetic risk-taking" |
| Criteria wording deliberately chosen (shapes Generator output, not just Evaluator scoring) | "The wording of the criteria steered the generator in ways I didn't fully anticipate" |
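The hard-threshold rule in the table ("if any one fell below it, the sprint failed") differs from averaging: a strong score on one criterion cannot compensate for a weak one. A minimal sketch follows, with criterion names and threshold values invented for illustration, since this excerpt only says there are four criteria.

```python
# Hypothetical criterion names and minimum scores; only the count (four) and the
# "any one below threshold fails" rule come from the README.
THRESHOLDS = {"functionality": 7, "design": 6, "craft": 6, "originality": 5}


def sprint_passes(scores: dict[str, float]) -> tuple[bool, list[str]]:
    """Return (passed, names of criteria that fell below their threshold)."""
    failed = [
        name
        for name, minimum in THRESHOLDS.items()
        if scores.get(name, 0) < minimum  # a missing score counts as a failure
    ]
    return (not failed, failed)
```

Returning the list of failed criteria alongside the verdict gives the Generator something concrete to act on in the next iteration.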

Installation

Requires Claude Code installed and working.

Option A — Plugin install (recommended)

/plugin marketplace add mosaladtaooo/belcort-harness
/plugin install harness@belcort-harness
/harness:setup

That's it. The plugin auto-registers the skill, three agents, ten slash commands, two hooks (SessionStart + PreToolUse), and two MCP servers (context7 + playwright). The one-time /harness:setup command patches ~/.claude/CLAUDE.md with the harness behavioral rules so they apply globally and survive context compaction. The patch is idempotent, version-aware, and removable (scripts/uninstall-rules.sh).
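One common way to make a patch like this idempotent, version-aware, and removable is to wrap the inserted rules in marker comments and replace the whole block on every run. The sketch below assumes that approach; the marker format and the `apply_rules` / `remove_rules` helpers are hypothetical, and the real `/harness:setup` and `scripts/uninstall-rules.sh` may work differently.

```python
import re

# Hypothetical marker format; the version is recorded in the opening marker.
START = "<!-- harness-rules:start v{version} -->"
END = "<!-- harness-rules:end -->"
BLOCK = re.compile(
    r"<!-- harness-rules:start v[^ ]+ -->.*?<!-- harness-rules:end -->\n?",
    re.DOTALL,
)


def remove_rules(text: str) -> str:
    """Strip any previously installed rules block, leaving other content intact."""
    return BLOCK.sub("", text)


def apply_rules(text: str, version: str, rules: str) -> str:
    """Install or upgrade the rules block; running it twice gives the same result."""
    base = remove_rules(text)
    if base and not base.endswith("\n"):
        base += "\n"
    return f"{base}{START.format(version=version)}\n{rules.rstrip()}\n{END}\n"
```

Because the old block is always removed before the new one is written, re-running with the same version changes nothing and re-running with a newer version upgrades in place.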

Option B — Manual install (legacy)

git clone https://github.com/mosaladtaooo/belcort-harness.git
cd belcort-harness
./install/install.sh

Complete the two manual steps the installer prints (append CLAUDE.md snippet, register hooks in ~/.claude/settings.json).

Verify either install:

./install/verify.sh

Quick start

Once installed, in any project directory:

# Start Claude Code, then:
/harness:sprint "Build a minimal bookmark manager with tags and search"

The harness will orchestrate planning, negotiation, build, and evaluation across the session. Your feedback gets captured into the Evaluator tuning loop for next time.
