Run independent implementation tracks in parallel with disciplined review, merge, and validation gates.
From workflow-orchestration: `npx claudepluginhub mikecubed/agent-orchestration --plugin workflow-orchestration`

This skill uses the workspace's default tool permissions.
Use this skill when a developer wants to implement a reviewed plan or task list with multiple parallel tracks without losing control of testing, integration, or review quality.
This is an execution skill, not a planning skill. Use it only after the feature already has accepted requirements, a plan, or an actionable task list.
Persistent team, squad, or fleet-style long-lived orchestration is out of scope for this skill. Use a separate orchestration layer if persistent coordination is needed.
Activate when the developer asks for things like:
Also activate when:
Before you start, identify:
If any of those inputs are missing, stop and get them first.
Use separate roles for:
Keep implementation and review separate whenever possible.
In Claude Code, spawn each role as a separate agent using the Agent tool. Pass the implementer a scoped prompt with exact task, file, and TDD constraints. Pass the reviewer only the diff and the review criteria. Keep implementation and review judgment separate. The coordinator may share a factual brief that includes task boundaries, files, validation commands, and known dependencies. Do not share proposed conclusions, review verdicts, or implementation rationale across roles.
If the active runtime offers a higher-cost orchestration mode such as a Fleet command or Claude Code agent teams, treat it as an explicit escalation path rather than the default execution mode.
Default to standard per-role agents first. Before escalating, explain why the extra coordination would help and ask the developer whether they want the higher-cost mode.
Escalate only when:
When team mode is approved:
Resolve the active model for each role using this priority chain:
1. Project config: look for the runtime-specific config file in the current project root:
   - `.copilot/models.yaml` (Copilot CLI)
   - `.claude/models.yaml` (Claude Code)
   These are plain YAML files (no markdown, no fenced blocks). Read the `implementer`, `reviewer`, and `scout` keys directly. If a key is absent, fall back to the baked-in default for that role; do not re-prompt for a missing key.
2. Session cache: if models were already confirmed earlier in this session, reuse them without asking again.
3. Baked-in defaults: if neither a config file nor a session cache exists, show the defaults below, ask the user to confirm or override them once, then cache the answer for the rest of the session.
The config files are plain YAML (not markdown). Create the file for the active runtime and set only the keys you want to override — absent keys fall back to the baked-in defaults. The keys for this skill are:
implementer: <model-name>
reviewer: <model-name>
scout: <model-name>
See docs/models-config-template.md in this plugin for ready-to-copy templates for both runtimes.
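As a minimal sketch, a Claude Code override file can be created from the shell like this (the model names simply echo the baked-in defaults below; substitute your own):

```shell
# Create a project-root override file for Claude Code.
# Only set the keys you want to override; absent keys fall back to defaults.
mkdir -p .claude
cat > .claude/models.yaml <<'EOF'
implementer: claude-opus-4.6
reviewer: claude-opus-4.6
scout: claude-haiku-4.5
EOF
```

For Copilot CLI the same shape goes in `.copilot/models.yaml` instead.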
| Runtime | Role | Default model |
|---|---|---|
| Copilot CLI | Implementer | claude-opus-4.6 |
| Copilot CLI | Reviewer | gpt-5.4 |
| Copilot CLI | Scout | claude-haiku-4.5 |
| Claude Code | Implementer | claude-opus-4.6 |
| Claude Code | Reviewer | claude-opus-4.6 |
| Claude Code | Scout | claude-haiku-4.5 |
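The config-then-default leg of the priority chain can be sketched in shell; the `resolve_model` helper below is illustrative, not part of the skill, and reads the plain-YAML config with a simple `sed` match:

```shell
# Resolve a role's model: project config first, then the baked-in default.
resolve_model() {
  role="$1"; default="$2"; config=".claude/models.yaml"
  if [ -f "$config" ]; then
    # Pull "role: value" out of the plain-YAML config, if the key is present.
    value=$(sed -n "s/^${role}:[[:space:]]*//p" "$config" | head -n 1)
    if [ -n "$value" ]; then echo "$value"; return; fi
  fi
  echo "$default"   # key absent or no config file: fall back, do not re-prompt
}

resolve_model reviewer gpt-5.4   # prints the configured value, or the default
```

A real implementation would also consult the session cache between these two steps.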
Before launching tracks:
If in doubt, serialize.
Each implementation track must:
Do not allow implementation-first drift just because work is parallel.
Default to 2-3 concurrent tracks unless the repository already has strong task isolation, independent validation paths, and a proven sandbox strategy such as dedicated worktrees.
More concurrency is only worth it when:
If the workflow uses worktrees, branches, or other sandboxes (in Claude Code, use the Agent tool with isolation: "worktree" — it auto-cleans if no changes are made):
Record each sandbox's state as one of: active, merged, abandoned, blocked. Use a compact artifact format before launching each track. For example:
Track: api-validation
Tasks: task-12, task-14
Files: src/api/validators.js, test/api/validators.test.js
Dependencies: task-11 done
Validation: npm test -- test/api/validators.test.js
Work surface: git worktree ../wt-api-validation
State: active
Persist the same fields in whatever task tracker, scratch file, or branch note the repository uses.
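If the repository has no tracker, a scratch file is enough; the sketch below persists the example artifact above (the `.agent/tracks/` path is an illustrative choice, not a requirement of the skill):

```shell
# Persist one track artifact as a scratch file before launching the track.
mkdir -p .agent/tracks
cat > .agent/tracks/api-validation.md <<'EOF'
Track: api-validation
Tasks: task-12, task-14
Files: src/api/validators.js, test/api/validators.test.js
Dependencies: task-11 done
Validation: npm test -- test/api/validators.test.js
Work surface: git worktree ../wt-api-validation
State: active
EOF
```

Update the `State:` line in place as the track moves through its lifecycle.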
Before launching tracks:
Before delegating to expensive implementers or reviewers, run one lightweight discovery pass using the scout model to produce a discovery brief. The scout gathers factual context — relevant files, task boundaries, validation commands, dependencies, and open questions — so that downstream roles do not repeat the same exploration.
Run the scout once per batch or session, not once per track. Implementers and reviewers inherit only the slice of the brief they need.
Use the discovery brief template from docs/workflow-artifact-templates.md:
Task summary: <one-paragraph description of the work>
Task shape: single-track | multi-track-batch | review-resolution-batch | large-diff-readiness
Relevant files: <path>, <path>
Task boundaries: <what is in scope and what is not>
Validation commands: <command>, <command>
Dependencies: <known dependencies or shared interfaces, if multi-track>
Comparison baseline: <branch, commit, or PR reference, if review or readiness>
Open questions: <questions requiring developer input, or none>
Skip reason: <if discovery was skipped, why>
Skip condition: Skip the scout when the task is already narrow and fully scoped — one file, one well-defined bug fix, one known test failure, or one already-triaged review comment. When skipped, record the skip reason in the brief.
Before launching any agents, verify:
- `git worktree add ../wt-{track} {branch}`
- Claude Code: Agent tool with `isolation: "worktree"`

Do not launch any agent until all worktrees exist and paths are confirmed.
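The pre-launch setup can be sketched as follows; the track names are illustrative, and the throwaway demo repo only exists to make the snippet self-contained:

```shell
set -e
cd "$(mktemp -d)"
git init -q demo && cd demo
git -c user.email=dev@example.com -c user.name=dev commit -q --allow-empty -m "init"

# One isolated worktree per track, created before any agent launches.
for track in api-validation ui-forms; do
  git worktree add "../wt-${track}" -b "track/${track}"
done
git worktree list   # confirm every path exists before delegating
```

In a real run you would execute the loop from the repository root and check the `git worktree list` output against the track artifacts.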
For each track:
create the track report from the template in docs/workflow-artifact-templates.md, initializing the known fields:
After launching tracks, the coordinator monitors each track through a bounded lifecycle of budget transitions. This policy prevents silent stalls and ensures partial work is never lost.
Budget transitions
The coordinator re-evaluates budget status after every delegation round. Do not wait for a track to go fully silent before checking.
After a track finishes:
Do not spend review budget on style-only nits.
Update the track report after review so it records the current state, validation outcome, unresolved issues, and next action before moving to revision or integration.
If the reviewer finds real issues:
A resend is bounded follow-up work. It is not a restart of the full task and not an invitation to broaden scope. Limit revision to at most two consecutive resend rounds per issue. If an issue survives two rounds, escalate to the developer rather than continuing the loop.
Apply these rules during every revision round to prevent unbounded churn:
When a convergence rule fires, record the trigger, the action taken, and the outcome in the track report's rescue history before moving to the next step.
When a resend or rescue occurs, update the track report's state, revision rounds, rescue history, unresolved issues, and next action so the final track gate has a durable record of what changed.
When tracks are ready:
After merge, update each track report to reflect the final track state (merged, abandoned, blocked, or retained for later work).
After all track work is integrated:
Run /workflow-orchestration:final-pr-readiness-gate on the stable integrated diff.

Before stopping, publish one durable batch summary that includes:
the fields from docs/workflow-artifact-templates.md:

- discovery-reuse: whether the discovery brief was reused by downstream tracks
- rescue-attempts: total rescue attempts across all tracks
- abandonment-events: tracks abandoned without resolution
- re-review-loops: per-track count of extra revision cycles beyond the initial review

"Durable" means written to a repository-appropriate sink using the template shape from docs/workflow-artifact-templates.md: for example, a PR description, a committed document, an issue comment, or a task tracker entry. In this repository, committed workflow artifacts live under docs/; other repositories may use a different durable sink. The batch summary MUST be produced; chat-only memory is not sufficient.
A track is not complete until:
docs/workflow-artifact-templates.md for the template).

At this gate (per track, after the track gate passes), write .agent/SESSION.md. Record:
- `current-task`: the overall batch task description
- `current-phase`: "track-[N]-merged" (substitute the track number or name)
- `next-action`: the next pending track, or "run integration gate" if all tracks are merged
- `workspace`: the integration target branch
- `last-updated`: current ISO-8601 datetime
- `## Decisions`: which tracks are merged, which are pending
- `## Files Touched`: files merged in this track
- `## Open Questions`: any open questions from the track review
- `## Blockers`: active blockers (empty if none)
- `## Failed Hypotheses`: (empty; not applicable for this skill)

If the write fails: log a warning and continue. Do not block track completion.
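A sketch of that per-track write, with every value illustrative for a hypothetical first track, including the warn-and-continue behavior on failure:

```shell
# Write the session file after a track gate passes; values are illustrative.
mkdir -p .agent
if ! cat > .agent/SESSION.md <<EOF
current-task: implement the reviewed api-validation task list
current-phase: track-1-merged
next-action: launch the ui-forms track
workspace: integration-main
last-updated: $(date -u +%Y-%m-%dT%H:%M:%SZ)

## Decisions
Track 1 (api-validation) merged; ui-forms pending.

## Files Touched
src/api/validators.js, test/api/validators.test.js

## Open Questions
None.

## Blockers
(none)

## Failed Hypotheses
(empty)
EOF
then
  # A failed write logs a warning but never blocks track completion.
  echo "warning: could not write .agent/SESSION.md" >&2
fi
```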
The batch is not complete until:
docs/workflow-artifact-templates.md for the template).

Before declaring the batch complete, confirm ALL of the following. Any failing item blocks the "batch complete" declaration.
Track merge gate (per track)
Integration gate (whole batch)
If any item is FAIL: report the failing item(s) by name, state what must be done to resolve each, and do not advance past the gate.
Before abandoning a stalled track, the coordinator must attempt at least one rescue pass: narrow scope, request a status update, and offer one bounded retry. Only abandon the track if rescue fails or the developer explicitly cancels. When stopping, record partial results, unresolved items, and the reason for stopping. Then reduce concurrency and continue serially with the remaining work.