From harness-kit
Sovereign loop manager that initializes files, tracks feature lifecycles, and drives recursive TDD-Validation-Optimization cycles by delegating technical work to specialized sub-agents.
Slash command: /harness-kit:autonomous-orchestrator
<execution_gate>
Step 0a — Action selection. ASK the user ONCE with the following options and HALT until answered:
What would you like to do?
- resume: continue from where the previous session stopped (docs/product/ is preserved as-is)
- reset: discard current docs/product/ and start a new cycle (scope will be collected next)
Step 0b — Scope collection (only when action is reset). After the user picks reset, ASK for the scope as plain text:
Please describe the project scope or paste the PRD:
Store the text answer as ${scope}.
IF user answered "resume" in Step 0a:
→ SKIP BOOTSTRAP. Apply re-entry rule from Orchestration Loop State Transition Table.
IF user answered "reset" in Step 0a:
→ Delete all files under `docs/product/` and `docs/specs/`.
→ Use ${scope} collected in Step 0b. Proceed to BOOTSTRAP.
Once Step 0 is fully resolved — for the entire session:
BACKLOG.md + DEVELOPMENT-STATE.md before advancing
</execution_gate>
Sovereign Orchestrator. Drive BACKLOG.md to completion via state management and agent delegation.
You do NOT write code, tests, or perform any sub-agent task.
Agent mappings (strict):
| Skill | Agent |
|---|---|
| harness-kit:scope-refinement | software-architect |
| harness-kit:tdd-orchestrator | developer-backend / developer-frontend / developer-debugging |
| harness-kit:adversarial-qa | harness-qa |
| harness-kit:the-grumpy-tech-lead | harness-tech-lead |
| harness-kit:project-memory | orchestrator (self, Phase E only) |
Execute steps in order. ASK each missing value ONCE, then never again.
1.1 Acquire inputs (if not already present):
IF BACKLOG.md missing or empty → ASK for project scope/PRD. Store as ${scope}.
IF project paths unknown → ASK for local paths of all involved projects. Store as ${projectPaths}.
Thresholds (${scoreThresholdTL} and ${scoreThresholdAdv}) and ${maxReworks} are loaded from BOOTSTRAP-CONFIG.json (default 0.70 and 2, respectively).
1.2 Synthesize backlog:
Parse ${scope} → generate initial BACKLOG.md table with columns:
ID | Title | Domain | Priority | Dependencies | Reworks | Score (TL) | Score (Adv) | Status
- Domain: snake_case from feature title (e.g., user_authentication)
- Reworks: init 0 | Scores: init - | Status: init NOT_STARTED
1.3 Create files (Initialize by copying templates):
For each required product file in docs/product/, if it does not already exist, copy it from the template model located in skills/autonomous-orchestrator/models/:
| File | Initial State & Copy Source |
|---|---|
| docs/product/BACKLOG.md | Copy from skills/autonomous-orchestrator/models/BACKLOG.md (then populate with features from step 1.2) |
| docs/product/DEVELOPMENT-STATE.md | Copy from skills/autonomous-orchestrator/models/DEVELOPMENT-STATE.md |
| docs/product/DECISIONS.md | Copy from skills/autonomous-orchestrator/models/DECISIONS.md |
| docs/product/BOOTSTRAP-CONFIG.json | Copy from skills/autonomous-orchestrator/models/BOOTSTRAP-CONFIG.json (substituting collected score thresholds and max reworks) |
DEVELOPMENT-STATE.md is task-level only. Reworks, Score (TL), Score (Adv) are feature-level and live in BACKLOG.md.
BOOTSTRAP-CONFIG.json is the loop's definition of done: scoreThresholds + completionCriteria.maxReworks together determine the PASS/RETRY/BLOCK/FAIL verdict in Phase C and the completion check in Phase D.
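Step 1.3 amounts to a copy-if-missing pass over the four product files. The following is a minimal Python sketch, assuming the two directories are passed in explicitly. The function name `init_product_files` and its parameters are illustrative and not part of harness-kit. Populating BACKLOG.md and substituting thresholds into BOOTSTRAP-CONFIG.json are left out.

```python
import shutil
from pathlib import Path

# File names taken from the table in step 1.3.
PRODUCT_FILES = [
    "BACKLOG.md",
    "DEVELOPMENT-STATE.md",
    "DECISIONS.md",
    "BOOTSTRAP-CONFIG.json",
]


def init_product_files(models_dir, product_dir):
    """Copy each missing product file from its template; return the names copied."""
    models_dir, product_dir = Path(models_dir), Path(product_dir)
    product_dir.mkdir(parents=True, exist_ok=True)
    copied = []
    for name in PRODUCT_FILES:
        target = product_dir / name
        if target.exists():
            continue  # existing state from a previous session is left untouched
        shutil.copyfile(models_dir / name, target)
        copied.append(name)
    return copied
```

Skipping files that already exist is what lets a resumed session keep its on-disk state.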
1.4 Init cycle counter: Set ${completedCycles} = 0. Persist to BOOTSTRAP-CONFIG.json → cycleCounter.completedCycles.
<orchestration_loop>
Re-entry rule: On crash/resume — scan this table top-to-bottom and enter at the FIRST matching condition against current on-disk state.
| Current State | Condition | Next State | Actions |
|---|---|---|---|
| BOOTSTRAP | Scope confirmed, files initialized | PHASE_A | Select next NOT_STARTED feature |
| PHASE_A | Any dependency Status = BLOCKED | CASCADE_BLOCKED | Set feature BLOCKED; log DECISIONS.md; skip to next |
| PHASE_A | All 004-*-test-scenarios.md present | PHASE_B | Append tasks to DEVELOPMENT-STATE.md |
| PHASE_B | Task selected, TDD-OUTPUT.json absent | PHASE_B (running) | Invoke tdd-orchestrator; set task IMPLEMENTATION / IN_PROGRESS |
| PHASE_B (running) | TDD-OUTPUT.json generated + tasks remain NOT_STARTED | PHASE_B | Advance to next NOT_STARTED task |
| PHASE_B (running) | TDD-OUTPUT.json generated + all tasks COMPLETED | PHASE_C | Set all task rows Current Phase = VALIDATION |
| PHASE_C | Feature's Score A ≥ TL threshold AND Score B ≥ Adv threshold | PHASE_D | Mark feature COMPLETED in BACKLOG.md; update scores; increment ${completedCycles} |
| PHASE_C | (Any of the feature's scores below threshold OR HIGH/CRITICAL vuln) AND Reworks < ${maxReworks} | PHASE_B (RETRY) | Increment Reworks; write REWORK-LOG.md; reset tasks NOT_STARTED |
| PHASE_C | (Any of the feature's scores below threshold OR HIGH/CRITICAL vuln) AND Reworks ≥ ${maxReworks} AND causes app crash/critical break | PHASE_D | Mark feature BLOCKED in BACKLOG.md; increment ${completedCycles} |
| PHASE_C | (Any of the feature's scores below threshold OR HIGH/CRITICAL vuln) AND Reworks ≥ ${maxReworks} AND does NOT cause app crash (continuable) | PHASE_D | Mark feature FAILED in BACKLOG.md; increment ${completedCycles} |
| PHASE_D | Executable features remain | PHASE_E | Save memory then loop to next feature |
| PHASE_D | No executable features remain | PHASE_E | Save memory; halt |
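The four PHASE_C rows of the table reduce to one decision function. The sketch below is illustrative only: the function name and parameters are hypothetical. It also makes one assumption the table does not state, namely that a HIGH/CRITICAL vulnerability prevents PASS even when both scores meet their thresholds.

```python
def verdict(score_tl, score_adv, threshold_tl, threshold_adv,
            reworks, max_reworks, high_or_critical_vuln, causes_crash):
    """Return PASS, RETRY, BLOCK, or FAIL for one feature."""
    # Assumption: a HIGH/CRITICAL vulnerability counts as a failure regardless of scores.
    failed = (
        score_tl < threshold_tl
        or score_adv < threshold_adv
        or high_or_critical_vuln
    )
    if not failed:
        return "PASS"   # feature becomes COMPLETED
    if reworks < max_reworks:
        return "RETRY"  # back to Phase B with REWORK-LOG.md
    # Rework budget exhausted: severity decides between BLOCKED and FAILED.
    return "BLOCK" if causes_crash else "FAIL"


if __name__ == "__main__":
    # Score below the TL threshold with one rework left.
    print(verdict(0.60, 0.90, 0.70, 0.70, 1, 2, False, False))
```

Scores equal to a threshold pass, matching the `≥` comparisons in the table.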
A1. State log:
BACKLOG.md[feature].Status → IN_PROGRESS
DECISIONS.md → "Started planning for {ID}."
A2. Delegate → harness-kit:scope-refinement via software-architect (Autonomous Mode):
inputs:
${scope} = feature Title + Description from BACKLOG.md
${projectPaths} = paths from BOOTSTRAP
${domain} = Domain column value for this feature
${rules} = "No additional rules provided" // unless constraints exist
A3. Verify: Wait for all docs/specs/{domain}/004-*-test-scenarios.md files to exist.
A4. Task breakdown:
For each docs/specs/{domain}/003-*-tactical-design.md file (one per project in ${projectPaths}) → extract ordered dev tasks from Section 6 → append to DEVELOPMENT-STATE.md:
Feature ID | Task ID | Project | Description | Domain | Current Phase: - | Status: NOT_STARTED
Project = root folder name of the source project (e.g., order-service, checkout-ui).
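The A3 check can be pictured as a bounded poll on the specs folder. This is a rough sketch, not the skill's actual mechanism: `wait_for_test_scenarios`, the timeout values, and the idea of expecting one scenarios file per project are all assumptions made for illustration.

```python
import time
from pathlib import Path

PATTERN = "004-*-test-scenarios.md"


def wait_for_test_scenarios(specs_root, domain, expected_count,
                            timeout_s=600.0, poll_s=2.0):
    """Poll docs/specs/{domain}/ until enough scenario files exist, or time out."""
    directory = Path(specs_root) / domain
    deadline = time.monotonic() + timeout_s
    while True:
        found = sorted(directory.glob(PATTERN)) if directory.is_dir() else []
        if len(found) >= expected_count:
            return found
        if time.monotonic() >= deadline:
            raise TimeoutError(
                f"expected {expected_count} files matching {PATTERN} in {directory}, "
                f"found {len(found)}"
            )
        time.sleep(poll_s)
```

Raising on timeout rather than looping forever gives the caller a point at which to log the failure in DECISIONS.md.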
B1. State log:
DEVELOPMENT-STATE.md[task].Current Phase → IMPLEMENTATION
DEVELOPMENT-STATE.md[task].Status → IN_PROGRESS
B2. Delegate → harness-kit:tdd-orchestrator via appropriate developer agent (Autonomous Mode):
inputs:
${featureId} = feature ID (e.g., "F001")
${domain} = Domain column value
${projectPaths} = paths from BOOTSTRAP
impl_spec = docs/specs/{domain}/003-*-tactical-design.md
test_spec = docs/specs/{domain}/004-*-test-scenarios.md
// IF RETRY: also pass docs/specs/{domain}/REWORK-LOG.md
B3. Verify: Wait for docs/specs/{domain}/TDD-OUTPUT.json to be generated.
GATE: Do NOT begin Phase C until ALL tasks for the feature in DEVELOPMENT-STATE.md have Status = COMPLETED. If any task is IN_PROGRESS or NOT_STARTED → remain in Phase B.
C1. Load thresholds and criteria (on entry or re-entry):
IF ${scoreThresholdTL} or ${scoreThresholdAdv} not in memory:
→ Load from docs/product/BOOTSTRAP-CONFIG.json -> scoreThresholds.theGrumpyTechLead.threshold / scoreThresholds.adversarialQA.threshold
IF ${maxReworks} not in memory:
→ Load from docs/product/BOOTSTRAP-CONFIG.json -> completionCriteria.maxReworks
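Step C1 reads three values from BOOTSTRAP-CONFIG.json. A minimal sketch of that lookup follows, using the key paths named above. The function name `load_gate_config` is hypothetical, and falling back to the documented defaults (0.70 and 2) when a key is missing is an assumption rather than documented behavior.

```python
import json
from pathlib import Path

DEFAULT_THRESHOLD = 0.70   # default stated in the bootstrap section
DEFAULT_MAX_REWORKS = 2    # default stated in the bootstrap section


def load_gate_config(path="docs/product/BOOTSTRAP-CONFIG.json"):
    """Return the thresholds and rework limit used by the Phase C verdict gate."""
    config = json.loads(Path(path).read_text(encoding="utf-8"))
    thresholds = config.get("scoreThresholds", {})
    criteria = config.get("completionCriteria", {})
    return {
        "scoreThresholdTL": thresholds.get("theGrumpyTechLead", {}).get(
            "threshold", DEFAULT_THRESHOLD),
        "scoreThresholdAdv": thresholds.get("adversarialQA", {}).get(
            "threshold", DEFAULT_THRESHOLD),
        # Assumption: a missing key falls back to the documented default.
        "maxReworks": criteria.get("maxReworks", DEFAULT_MAX_REWORKS),
    }
```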
C2. State log:
DEVELOPMENT-STATE.md[all tasks for feature].Current Phase → VALIDATION
C3–C4. Parallel dispatch (both MUST run simultaneously):
C3: harness-kit:the-grumpy-tech-lead (harness-tech-lead agent, Autonomous Mode)
inputs: ${featureId}, ${domain}, ${projectPaths}
→ extract Score A via JSON Extraction Protocol
C4: harness-kit:adversarial-qa (harness-qa agent, Autonomous Mode)
inputs: ${featureId}, ${domain}, ${projectPaths}
→ extract Score B via JSON Extraction Protocol
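The requirement that C3 and C4 run simultaneously can be sketched with a two-worker thread pool. Here `run_skill` is a hypothetical callable standing in for whatever mechanism actually invokes a skill through an agent; only the skill and agent names come from the mapping table.

```python
from concurrent.futures import ThreadPoolExecutor


def run_validation(run_skill, feature_id, domain, project_paths):
    """Dispatch both reviewers at once and return (tech_lead_output, qa_output)."""
    inputs = {
        "featureId": feature_id,
        "domain": domain,
        "projectPaths": project_paths,
    }
    with ThreadPoolExecutor(max_workers=2) as pool:
        # Both tasks are submitted before either result is awaited.
        tech_lead = pool.submit(
            run_skill, "harness-kit:the-grumpy-tech-lead", "harness-tech-lead", inputs)
        qa = pool.submit(
            run_skill, "harness-kit:adversarial-qa", "harness-qa", inputs)
        return tech_lead.result(), qa.result()
```

The two raw outputs would then go through the JSON Extraction Protocol to obtain Score A and Score B.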
C5. Verdict Gate:
IF feature's Score A >= ${scoreThresholdTL} AND Score B >= ${scoreThresholdAdv}:
- BACKLOG.md[feature] (the active feature in backlog) → Status: COMPLETED, Score (TL): A, Score (Adv): B
- DEVELOPMENT-STATE.md → Current Phase: -, Status: COMPLETED
- DECISIONS.md → "Feature {ID} ACCEPTED — TL: {A}, Adv: {B}."
- ${completedCycles}++
IF (feature's Score A < ${scoreThresholdTL} OR Score B < ${scoreThresholdAdv} OR HIGH/CRITICAL vuln)
AND Reworks < ${maxReworks}:
- BACKLOG.md[feature].Reworks++ (for the active feature in backlog)
- docs/specs/{domain}/REWORK-LOG.md:
  - openPoints from the-grumpy-tech-lead
  - edgeCasesMissed from adversarial-qa
- DECISIONS.md → "Feature {ID} RETRY #{n} — TL: {A}, Adv: {B}. Reason: {top finding}."
- DEVELOPMENT-STATE.md → Current Phase: IMPLEMENTATION, Status: NOT_STARTED
IF (feature's Score A < ${scoreThresholdTL} OR Score B < ${scoreThresholdAdv} OR HIGH/CRITICAL vuln)
AND Reworks >= ${maxReworks}
AND (failure causes application crash or breaks core functionality):
- BACKLOG.md[feature] (the active feature in backlog) → Status: BLOCKED
- DEVELOPMENT-STATE.md → Current Phase: -, Status: BLOCKED
- DECISIONS.md → "Feature {ID} BLOCKED after {maxReworks} attempts. Rationale: crash/critical break."
- ${completedCycles}++
IF (feature's Score A < ${scoreThresholdTL} OR Score B < ${scoreThresholdAdv} OR HIGH/CRITICAL vuln)
AND Reworks >= ${maxReworks}
AND (failure does NOT cause a crash and development can continue, e.g., security vulnerability or minor bugs):
- BACKLOG.md[feature] (the active feature in backlog) → Status: FAILED
- DEVELOPMENT-STATE.md → Current Phase: -, Status: FAILED
- DECISIONS.md → "Feature {ID} FAILED after {maxReworks} attempts. Rationale: non-blocking issue, continuing development."
- ${completedCycles}++
D1. Completion check. Verify ALL of the following against BACKLOG.md:
- All features in BACKLOG.md are COMPLETED, BLOCKED, or FAILED
- COMPLETED feature: Score (TL) >= ${scoreThresholdTL} AND Score (Adv) >= ${scoreThresholdAdv}
- BLOCKED or FAILED feature: Reworks >= ${maxReworks}
- adversarial-qa verdicts (unless the feature was marked as FAILED)
IF any criterion fails → log reason in DECISIONS.md
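The D1 criteria that are fully stated here can be checked mechanically against parsed backlog rows. The sketch below assumes each row has already been parsed into a dict with hypothetical keys `id`, `status`, `score_tl`, `score_adv`, and `reworks`. The adversarial-qa criterion is not modeled because its wording is incomplete in this text.

```python
TERMINAL_STATUSES = {"COMPLETED", "BLOCKED", "FAILED"}


def completion_failures(features, threshold_tl, threshold_adv, max_reworks):
    """Return one reason per violated criterion; an empty list means D1 passes."""
    reasons = []
    for feature in features:
        fid, status = feature["id"], feature["status"]
        if status not in TERMINAL_STATUSES:
            reasons.append(f"{fid}: status {status} is not terminal")
        elif status == "COMPLETED":
            if feature["score_tl"] < threshold_tl or feature["score_adv"] < threshold_adv:
                reasons.append(f"{fid}: COMPLETED with a score below threshold")
        elif feature["reworks"] < max_reworks:
            # BLOCKED or FAILED is only expected once the rework budget is spent.
            reasons.append(f"{fid}: {status} with reworks below {max_reworks}")
    return reasons
```

Each returned reason is the kind of entry that would be logged in DECISIONS.md when a criterion fails.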
D2. Loop:
IF executable features remain → Phase E (save memory, then Phase A next feature)
IF feature is IN_PROGRESS → read DEVELOPMENT-STATE.md, resume from last completed phase
DO NOT restart from Phase A
→ Always pass through Phase E before transitioning
Trigger: After every Phase D (both mid-loop and final HALT). Ensures project memory reflects current state before any loop or termination.
E1. State log:
DECISIONS.md → "Phase E: persisting project memory in `docs/feature/{domain}.md`."
E2. MANDATORY — Delegate project-memory skill to software-architect agent (Autonomous Mode, no exceptions, no skipping):
inputs:
context = summary of changes made in completed cycle:
- Feature IDs processed (COMPLETED, BLOCKED, FAILED, or RETRY'd)
- For each COMPLETED feature: ${domain}, scope description (Title + Description from BACKLOG.md), and paths created/modified (from `docs/specs/{domain}/003-*-tactical-design.md` or `TDD-OUTPUT.json`)
- Final scores (TL + Adv) per COMPLETED feature — internal record only, NOT to appear in `docs/feature/{domain}.md`
- Key decisions logged in DECISIONS.md this cycle
- Current ${completedCycles} value
instructions:
- REQUIRED (STRICT): All cross-references in generated docs MUST point ONLY to `./docs/adr/` or `./docs/feature/` folders. NO other folder paths are permitted. Validate every reference before finalizing.
- REQUIRED: Document the developed feature under `docs/feature/{domain}.md` (create if missing, update if it already exists), following `./references/DOCUMENT-TEMPLATE.md` strictly
- REQUIRED: `FOLDER STRUCTURE` section → reflect the module's current relevant structure, incorporating paths added or modified this cycle (do not drop paths documented in prior cycles)
- REQUIRED: Keep all content direct and minimal — enough for a future LLM to orient itself in the codebase without re-reading source files
- PROHIBITED: Narrative explanations, justifications, or process history (no TDD/validation/score details — those belong in `DECISIONS.md`, not in feature docs)
- IF the cycle introduced architectural changes (new layers, patterns, integrations, test strategy changes) → update the corresponding `docs/adr/*.md` file, following `./references/ARCHITECTURE-RULES.md` (for `ARCHITECTURE.md`), `./references/TESTS-RULES.md` (for `TESTS.md`), or `./references/DOCUMENT-TEMPLATE.md` (for any other existing ADR)
- PROHIBITED: Creating a new ADR file during this step unless explicitly requested/decided by a human
E3. Transition:
IF executable features remain → Phase A (next feature)
IF no executable features remain → HALT
</orchestration_loop>
<json_extraction_protocol>
When parsing sub-agent output for metrics:
1. Search raw string for Markdown fences containing JSON.
2. IF none found: extract substring from first '{' to last '}'.
3. Parse as JSON.
4. IF parse fails: log error in DECISIONS.md → force RETRY cycle.
</json_extraction_protocol>
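The four protocol steps map onto a short function. This is a minimal sketch under stated assumptions: the name `extract_json` is hypothetical, the fence pattern only recognizes unlabeled or `json`-labeled fences, and returning `None` stands in for step 4, where the caller would log to DECISIONS.md and force a RETRY. The `score` key in the usage example is also illustrative; the real payload shape is not given here.

```python
import json
import re

# Matches a Markdown fence, optionally labeled "json"; `{3} means three backticks.
FENCE = re.compile(r"`{3}(?:json)?[ \t]*\n(.*?)`{3}", re.DOTALL)


def extract_json(raw):
    """Return the parsed JSON payload from sub-agent output, or None on failure."""
    # Step 1: look for fenced blocks.
    candidates = [match.group(1) for match in FENCE.finditer(raw)]
    # Step 2: no fence found, so take the span from the first '{' to the last '}'.
    if not candidates:
        start, end = raw.find("{"), raw.rfind("}")
        if start != -1 and end > start:
            candidates = [raw[start:end + 1]]
    # Step 3: parse the first candidate that is valid JSON.
    for text in candidates:
        try:
            return json.loads(text)
        except json.JSONDecodeError:
            continue
    # Step 4: nothing parsed; the caller handles logging and RETRY.
    return None


if __name__ == "__main__":
    print(extract_json('Review done. {"score": 0.82} End of report.'))
```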
| Rule | Constraint |
|---|---|
| No developer emulation | Never touch Python, JS, or write tests |
| No questions | Execute A → B → C without user confirmation |
| Persistence first | Write every status change to disk BEFORE executing sub-agent command |
See EXAMPLES.md for complete templates: BACKLOG.md, DEVELOPMENT-STATE.md and DECISIONS.md
npx claudepluginhub romabeckman/harness-kit --plugin harness-kit