By stone16
Orchestrate complex engineering tasks with a cybernetics-based multi-agent pipeline: Planner specs checkpoints, Generator implements via TDD/atomic commits, Evaluator verifies with tiered checks, Retro analyzes failures for CLAUDE.md improvements. Run iterative cross-LLM code reviews with Codex/Gemini peers until consensus on fixes.
Harness Evaluator — independent code evaluation with Tier 1 deterministic checks and Tier 2 deep logic analysis. Use when harness orchestrator needs checkpoint evaluation.
Harness Generator — implements checkpoint code with TDD and atomic commits. Use when harness orchestrator needs code generation for a checkpoint.
Harness Retro — post-task retrospective analysis, error pattern detection, and CLAUDE.md rule proposals. Use when harness orchestrator needs task retrospective.
Harness Spec Evaluator — reviews spec.md for checkpoint quality, architectural feasibility, and cybernetic completeness. Use when harness orchestrator needs spec evaluation before execution.
Cybernetics-based multi-agent orchestration for complex tasks. Coordinates a Planner → Generator → Evaluator → Retro pipeline with clean-context sub-agents, per-checkpoint drift prevention, and persistent retro learning. Recommended workflow: Claude Code plans the spec (Session 1), Codex executes autonomously (Session 2), Gemini reviews as cross-model peer via `review-loop`. Use when: "harness this task", "use harness", "orchestrate this", "harness plan", "harness continue", "harness execute <task-id>", "harness <spec-name>", or when a task requires structured multi-agent coordination.
Cross-LLM iterative code review loop. Spawns a peer reviewer (Codex or Gemini CLI) to review code changes, then iterates until both agents agree on the final code state. Code gets modified during the loop, so the final output is improved code plus a consensus report. Use when: "review loop", "peer review", "cross review", "review with codex", "review with gemini", "让 codex review" (have Codex review), "交叉 review" (cross review), "peer review 这段代码" (peer review this code), "code review loop", "iterative review"
Uses power tools
Uses Bash, Write, or Edit tools
Stometa's public curated Claude Code skillset — a small, opinionated set of skills we use ourselves, published periodically.
This is the public companion to Stometa's private stometa-skillset. We dogfood a larger internal skillset day-to-day; selected skills are extracted, polished, and published here in batches. The goal is to share the workflows that actually hold up under real engineering work — not a pile of prototypes.
The first batch ships two skills: review-loop (already proven in daily use) and harness (multi-agent orchestration for larger tasks). Both are installable as a single Claude Code plugin.
harness is a cybernetics-inspired orchestrator: planning and execution live in separate sessions so context cannot leak; every checkpoint runs against a fresh sub-agent to reset eigenbehavior; an engine script owns state and enforces hard gates so the LLM cannot self-certify; and a cross-model peer (a different vendor's CLI) reviews before PR so you never merge on a single model's opinion. Persistent retro feeds learnings back into future tasks — that's the closing loop of the cybernetic system.
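The idea of an engine that owns state and enforces hard gates can be illustrated with a small phase machine. This is only a sketch under stated assumptions, not the behavior of `harness-engine.sh`: the `Phase` names, the `REQUIRED_EVIDENCE` mapping, and the `Engine` class are hypothetical and chosen to mirror the pipeline described above.

```python
from dataclasses import dataclass, field
from enum import Enum


class Phase(Enum):
    PLANNING = 1
    EXECUTION = 2
    REVIEW = 3
    VERIFY = 4
    DONE = 5


class GateError(RuntimeError):
    """Raised when a phase transition is attempted without its evidence."""


# Hypothetical mapping: the evidence each phase must have on record
# before the engine is willing to leave that phase.
REQUIRED_EVIDENCE = {
    Phase.PLANNING: "spec_approved",
    Phase.EXECUTION: "e2e_passed",
    Phase.REVIEW: "review_consensus",
    Phase.VERIFY: "full_verify_passed",
}


@dataclass
class Engine:
    phase: Phase = Phase.PLANNING
    evidence: set = field(default_factory=set)

    def record(self, item: str) -> None:
        # In a real setup this would be called by deterministic checks,
        # not by the agent whose work is being judged.
        self.evidence.add(item)

    def advance(self) -> Phase:
        if self.phase is Phase.DONE:
            raise GateError("task is already done")
        needed = REQUIRED_EVIDENCE[self.phase]
        if needed not in self.evidence:
            raise GateError(f"{self.phase.name} blocked: missing {needed}")
        self.phase = Phase(self.phase.value + 1)
        return self.phase
```

Calling `advance()` before the matching evidence is recorded raises `GateError`, which is the property the paragraph above describes: the transition depends on recorded state rather than on the model's own claim of success.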
```mermaid
flowchart TB
    H((Human))
    H -->|"harness plan task-id"| HOST
    subgraph HOST["Orchestrator host — pick one (symmetric)"]
        direction LR
        CC["Claude Code CLI"]
        CX["Codex CLI"]
    end
    HOST --> ENG[["harness-engine.sh<br/>single source of truth<br/>state · phase machine · hard gates"]]
    ENG --> S1
    subgraph S1["Session 1 — Planning (recommended host: Claude Code)"]
        direction TB
        PL["Orchestrator = Planner<br/>brainstorm + draft spec.md"]
        SE["harness-spec-evaluator<br/>fresh sub-agent (Claude)"]
        OK1["spec.md approved"]
        PL --> SE
        SE -->|revise| PL
        SE -->|approve| OK1
    end
    S1 -. session ends — planning context discarded .-> S2
    subgraph S2["Session 2 — Execution (recommended host: Codex)"]
        direction TB
        CPL{{"For each Checkpoint NN"}}
        GEN["harness-generator<br/>fresh sub-agent per CP<br/>TDD + verification preloaded"]
        EVL{"harness-evaluator<br/>fresh sub-agent per CP<br/>Tier 1 deterministic + Tier 2 LLM"}
        MORE{"more checkpoints?"}
        E2E["E2E Evaluator<br/>cross-checkpoint data-flow audit"]
        RL[["review-loop<br/>cross-model quality gate"]]
        FV["full-verify<br/>tests · coverage ≥ threshold · lint · types"]
        PR["Open PR"]
        RT["harness-retro<br/>fresh sub-agent"]
        CPL --> GEN --> EVL
        EVL -->|FAIL / REVIEW| GEN
        EVL -->|PASS| MORE
        MORE -->|yes| CPL
        MORE -->|no| E2E
        E2E -->|FAIL| GEN
        E2E -->|PASS| RL
        RL --> FV --> PR --> RT
    end
    subgraph RLSUB["review-loop · cross-LLM peer review"]
        direction LR
        PEER["Peer reviewer CLI<br/>codex OR gemini<br/>(allowlisted)"]
        HEV["Host LLM evaluates<br/>ACCEPT / REJECT / INSIST"]
        FRESH["Fresh peer session<br/>final approval pass"]
        DONE["pass-review-loop"]
        PEER -->|FINDING fN| HEV
        HEV -->|fix + commit| PEER
        PEER -.CONSENSUS.-> FRESH --> DONE
    end
    RL -. invokes .-> RLSUB
    RT --> RD[(".harness/retro/<br/>cross-task learnings<br/>git-tracked, persistent")]
    RD -. informs future tasks .-> H
    classDef antiDrift stroke:#d97706,stroke-width:2px;
    class GEN,EVL,SE,RT antiDrift;
    classDef gate stroke:#059669,stroke-width:2px;
    class ENG,RL,FV gate;
```
Legend — orange-bordered nodes are the fresh-sub-agent drift firewalls; green-bordered nodes are the engine-enforced gates that the LLM cannot bypass.
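The Session 2 loop in the diagram (generate, evaluate, retry on FAIL or REVIEW, move on when PASS) can be sketched as follows. This is a minimal illustration, not the plugin's implementation: `run_checkpoints`, its callable parameters, and the `max_attempts` limit are hypothetical, and the source does not state a retry limit. Each call to `generate` and `evaluate` stands in for spawning a fresh sub-agent, so no state is shared between attempts except the explicit `feedback` string.

```python
from typing import Callable, Dict, List

Verdict = str  # "PASS", "FAIL", or "REVIEW", as labeled in the diagram


def run_checkpoints(
    checkpoints: List[str],
    generate: Callable[[str, str], str],
    evaluate: Callable[[str, str], Verdict],
    max_attempts: int = 3,  # assumption: the source does not specify a limit
) -> Dict[str, int]:
    """Return the number of attempts each checkpoint needed to pass."""
    results: Dict[str, int] = {}
    for cp in checkpoints:
        feedback = ""
        for attempt in range(1, max_attempts + 1):
            # Fresh generator per attempt: it only sees the checkpoint id
            # and the evaluator feedback passed in explicitly.
            artifact = generate(cp, feedback)
            verdict = evaluate(cp, artifact)
            if verdict == "PASS":
                results[cp] = attempt
                break
            feedback = f"attempt {attempt} returned {verdict}"
        else:
            raise RuntimeError(
                f"checkpoint {cp} did not pass in {max_attempts} attempts"
            )
    return results
```

Passing the generator and evaluator in as plain callables keeps the loop independent of which model or CLI actually performs each role.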
The model running each role is decoupled from the model hosting the session — that's why the same pipeline works whether you start in Claude Code or Codex.
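Because roles are decoupled from models, the peer reviewer in review-loop can be thought of as a pluggable callable. The sketch below is a rough reading of the review-loop subgraph, with several assumptions: the function and parameter names are hypothetical, "consensus" is modeled as the peer returning no findings, only ACCEPT triggers a fix (REJECT and INSIST are merely logged), and `max_rounds` is an invented safety limit.

```python
from typing import Callable, List, Tuple


def review_loop(
    code: str,
    peer_review: Callable[[str], List[str]],
    host_decide: Callable[[str], str],
    apply_fix: Callable[[str, str], str],
    fresh_peer_approves: Callable[[str], bool],
    max_rounds: int = 5,  # assumption: not specified in the source
) -> Tuple[str, bool, List[Tuple[str, str]]]:
    """Return (final code, approved flag, log of (finding, decision))."""
    log: List[Tuple[str, str]] = []
    for _ in range(max_rounds):
        findings = peer_review(code)
        if not findings:
            # Modeled consensus: the peer has nothing left to raise, so a
            # fresh peer session performs the final approval pass.
            return code, fresh_peer_approves(code), log
        for finding in findings:
            decision = host_decide(finding)  # "ACCEPT", "REJECT", or "INSIST"
            log.append((finding, decision))
            if decision == "ACCEPT":
                code = apply_fix(code, finding)
    # No consensus within the round limit.
    return code, False, log
```

Swapping `peer_review` between a Codex-backed and a Gemini-backed function would not change the loop itself, which is the point of keeping the role separate from the model.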
npx claudepluginhub stone16/harness-engineering-skills --plugin harness-engineering-skills