SDD (Spec-Driven Development) harness for high-quality AI engineering
npx claudepluginhub zxdxjtu/harness
Guides a spec → test → implement → verify workflow with auto-loop sprint execution.
A Claude Code marketplace plugin that enforces a structured Spec → Test → Implement → Verify pipeline for high-quality AI-assisted software engineering.
/plugin marketplace add zxdxjtu/harness
/proposal "Add user authentication with JWT" # Generate spec + design
/tdd-align F001 # Generate tests (all RED)
/decompose F001 # Break into atomic tasks
/sprint F001 # Auto-execute until done
Standard:
/proposal → /tdd-align → /decompose → /sprint → /verify
Spec        Tests        Tasks        Execute   Verify
(human)     (human)      (human)      (auto)    (auto)
Clone/Replicate (with adversarial evaluation):
/baseline → /proposal → ... → /sprint → /evaluate → /eval-fix
Capture     Spec              Execute   Evaluate    Fix Loop
(auto)      (human)           (auto)    (Evaluator) (GAN loop)
/proposal: Interactive dialogue that produces a complete, unambiguous spec with a zero-decision-point checklist. A human approves the spec before proceeding.
/tdd-align: Generates three-layer tests from the spec. All tests start RED. A human approves the test contract.
/decompose: Breaks the tests into atomic tasks (≤2h each), analyzes dependencies, and assigns parallel execution waves. A human approves the task DAG.
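The wave assignment described for /decompose can be pictured as layering a dependency graph: a task joins the earliest wave in which all of its dependencies are already done. The sketch below is illustrative only; the task IDs (`T1` to `T5`), the `assign_waves` helper, and the dict-of-sets input are assumptions, not the plugin's actual data model.

```python
from typing import Dict, List, Set


def assign_waves(deps: Dict[str, Set[str]]) -> List[List[str]]:
    """Group tasks into waves; every task depends only on earlier waves."""
    remaining = {task: set(d) for task, d in deps.items()}
    done: Set[str] = set()
    waves: List[List[str]] = []
    while remaining:
        # A task is ready once all of its dependencies are finished.
        ready = sorted(t for t, d in remaining.items() if d <= done)
        if not ready:
            # Nothing can run: a cycle or a dependency on an unknown task.
            raise ValueError("unresolvable tasks: " + ", ".join(sorted(remaining)))
        waves.append(ready)
        done.update(ready)
        for t in ready:
            del remaining[t]
    return waves


# Hypothetical task DAG: each key maps to the tasks it depends on.
tasks = {
    "T1": set(),
    "T2": set(),
    "T3": {"T1"},
    "T4": {"T1", "T2"},
    "T5": {"T3", "T4"},
}
print(assign_waves(tasks))  # [['T1', 'T2'], ['T3', 'T4'], ['T5']]
```

Tasks within one wave have no dependencies on each other, so they are candidates for parallel execution.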
/sprint: Automatic loop execution, powered by a Stop Hook that keeps Claude running until all tasks complete.
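To make the Stop Hook idea concrete: Claude Code can run a hook command when the agent is about to stop, and the hook can print a JSON object with `"decision": "block"` and a `reason` to keep the session going. The sketch below shows only the decision logic. It assumes tasks are tracked as Markdown checkboxes (`- [ ]` / `- [x]`), which is a guess at the format of `.harness/tasks.md`, and `stop_decision` is a hypothetical helper, not the plugin's real hook.

```python
import json
import re
from typing import Optional

# ASSUMPTION: one Markdown checkbox line per task, e.g. "- [ ] T2 add endpoint".
UNCHECKED = re.compile(r"^[ \t]*[-*] \[ \]", re.MULTILINE)


def stop_decision(tasks_md: str) -> Optional[dict]:
    """Return a 'block' decision while unchecked tasks remain, else None."""
    pending = len(UNCHECKED.findall(tasks_md))
    if pending == 0:
        return None  # nothing left to do: let Claude stop
    return {
        "decision": "block",
        "reason": f"{pending} task(s) still pending; continue the sprint.",
    }


# Hypothetical task list with two unchecked items.
sample = "- [x] T1 write schema\n- [ ] T2 add endpoint\n- [ ] T3 wire UI\n"
print(json.dumps(stop_decision(sample)))
print(stop_decision("- [x] T1 write schema\n"))
```

A real hook script would read the task file from disk and print the JSON decision to stdout only when a decision is returned.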
/verify: Staged verification (V1 → V2 → V3) that produces an evidence package.
Inspired by GAN-style adversarial design — separate Generator and Evaluator agents to prevent self-assessment bias.
/baseline <url> — Evaluator Agent explores the reference product via Playwright MCP, captures screenshots, interaction flows, and visual specs into .harness/baseline/.
/evaluate <id> --ref-url <url> --dev-url <url> — Independent Evaluator compares reference vs development product across 4 dimensions:
| Dimension | Weight |
|---|---|
| Functional Completeness | 40% |
| Interaction Consistency | 25% |
| Visual Fidelity | 20% |
| Technical Quality | 15% |
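As a worked example of how the weights in this table could combine into a single score, the sketch below computes a weighted average. It assumes each dimension is scored on a 0 to 100 scale and that the overall score is a plain weighted sum; the source does not specify either, and the key names and `overall_score` helper are hypothetical.

```python
from typing import Dict

# Weights from the table above, as integer percentages (they sum to 100).
WEIGHTS = {
    "functional_completeness": 40,
    "interaction_consistency": 25,
    "visual_fidelity": 20,
    "technical_quality": 15,
}


def overall_score(scores: Dict[str, float]) -> float:
    """Weighted average of per-dimension scores (assumed 0-100 scale)."""
    missing = WEIGHTS.keys() - scores.keys()
    if missing:
        raise KeyError("missing dimensions: " + ", ".join(sorted(missing)))
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS) / 100


# Hypothetical evaluation result.
example = {
    "functional_completeness": 80,
    "interaction_consistency": 60,
    "visual_fidelity": 90,
    "technical_quality": 70,
}
print(overall_score(example))  # 75.5
```

With these weights, a ten-point gain in functional completeness moves the total by four points, while the same gain in technical quality moves it by only one and a half.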
/eval-fix <id> — GAN-style fix loop: Generator fixes gaps → Evaluator re-scores → repeat until convergence or stagnation.
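The fix loop's two exit conditions can be sketched as follows: stop when the score reaches a target (convergence) or when several rounds in a row fail to improve it meaningfully (stagnation). The `target`, `min_gain`, `patience`, and `max_rounds` parameters and their defaults are assumptions for illustration; the source does not state the plugin's actual thresholds.

```python
from typing import Callable, List, Tuple


def eval_fix_loop(
    evaluate: Callable[[], float],
    fix: Callable[[], None],
    target: float = 90.0,    # hypothetical convergence threshold
    min_gain: float = 1.0,   # hypothetical minimum useful improvement
    patience: int = 2,       # hypothetical stalled rounds tolerated
    max_rounds: int = 10,    # hypothetical safety cap
) -> Tuple[str, List[float]]:
    """Alternate fix and evaluate until convergence, stagnation, or the cap."""
    history = [evaluate()]
    stalled = 0
    for _ in range(max_rounds):
        if history[-1] >= target:
            return "converged", history
        if stalled >= patience:
            return "stagnated", history
        fix()               # Generator addresses the reported gaps
        score = evaluate()  # Evaluator re-scores independently
        stalled = stalled + 1 if score - history[-1] < min_gain else 0
        history.append(score)
    status = "converged" if history[-1] >= target else "max_rounds"
    return status, history


# Scripted scores stand in for real Evaluator runs.
scores = iter([62.0, 74.0, 83.0, 91.0])
status, history = eval_fix_loop(evaluate=lambda: next(scores), fix=lambda: None)
print(status, history)  # converged [62.0, 74.0, 83.0, 91.0]
```

Keeping `evaluate` and `fix` as separate callables mirrors the Generator/Evaluator split: the loop never lets the fixing side grade its own work.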
| Command | Description |
|---|---|
| `/proposal [desc]` | Start new feature (interactive spec generation) |
| `/tdd-align <id>` | Generate three-layer tests from spec |
| `/decompose <id>` | Break into atomic task DAG |
| `/sprint <id>` | Auto-loop execute all tasks |
| `/verify <id>` | Three-stage verification |
| `/harness-status` | Check current state and next step |
| `/baseline <url>` | Capture reference product baseline (Playwright MCP) |
| `/evaluate <id>` | Adversarial comparison: reference vs dev product |
| `/eval-fix <id>` | GAN-style fix-evaluate loop until convergence |
| `/cancel-sprint` | Stop active sprint loop |
| `/help` | Show documentation |
Harness stores state in .harness/ (auto-created):
.harness/
├── specs/              # Feature specifications
├── designs/            # Design documents
├── baseline/           # Reference product baseline (clone scenarios)
│   ├── baseline-report.md
│   ├── screenshots/
│   └── features/
├── tasks.md            # Task DAG
├── progress.md         # Progress log
├── evidence/           # Verification + evaluation evidence
│   └── FXXX/
│       ├── eval-report.md
│       ├── eval-screenshots/
│       └── eval-loop-state.md
└── sprint-loop.md      # Sprint state (runtime)
Add .harness/ to .gitignore or commit it — your choice.
MIT