Start the autonomous development loop -- spawn the agent team, begin the orchestration state machine, and drive work items through the full lifecycle from DISCOVER to DONE. Use when the user says 'run', 'start', 'begin', 'launch the loop', 'start the team', 'run the development loop', 'go', 'pick up where we left off', or 'kick it off'. This is the entry point for the entire agent-atelier system.
Install:

```
npx claudepluginhub ether-moon/agent-atelier --plugin agent-atelier
```

This skill uses the workspace's default tool permissions.
Starts the full autonomous development loop. Spawns the agent team, reads orchestration state, and drives work items through the development lifecycle.
Prerequisites: run /agent-atelier:init first, and ensure the behavior spec exists at docs/product/behavior-spec.md.

Allowed tools: Read, Write, Bash, Glob, Agent (for team spawning), CronCreate, CronDelete, CronList.
```
/agent-atelier:run                    # Start or resume the loop
/agent-atelier:run --mode IMPLEMENT   # Resume directly into the IMPLEMENT phase
```
On a clean start, the loop begins at DISCOVER and drives through to DONE. On cold resume after a crash, the loop reads persisted state and picks up where it left off -- stranded work is reclaimed, monitors are recreated, and the startup dashboard shows recovered state.
State files: .agent-atelier/loop-state.json, .agent-atelier/work-items.json, and .agent-atelier/watchdog-jobs.json. If .agent-atelier/.pending-tx.json exists, replay it first (see ../../references/recovery-protocol.md). `implementing` leases from a crashed runtime are reclaimed later by the startup resume sweep.

Create one flat team and spawn teammates. Full details in reference/team-lifecycle.md.
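The pending-transaction replay can be pictured as a minimal sketch. This is not the plugin's implementation; the transaction format (`writes` as a list of file/content pairs) is a hypothetical assumption, and only the file names come from the state-file layout above.

```python
import json
from pathlib import Path

def replay_pending_tx(state_dir: Path) -> bool:
    """If .pending-tx.json exists, apply its writes before any other
    startup work, then discard it. Returns True if a tx was replayed."""
    tx_path = state_dir / ".pending-tx.json"
    if not tx_path.exists():
        return False
    tx = json.loads(tx_path.read_text())
    # Hypothetical format: {"writes": [{"file": ..., "content": ...}]}
    for write in tx.get("writes", []):
        target = state_dir / write["file"]
        tmp = target.with_suffix(target.suffix + ".tmp")
        tmp.write_text(json.dumps(write["content"], indent=2))
        tmp.replace(target)  # atomic rename on POSIX
    tx_path.unlink()  # transaction fully applied; safe to discard
    return True
```

Writing through a temp file and renaming keeps each state file valid even if the replay itself crashes midway.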
Summary: Derive a deterministic team name from the repo root, clean up stale teams, then spawn the three always-on core teammates (State Manager, PM, Architect) from .claude/agents/ definitions. The Orchestrator role is played by the lead agent. Conditional specialists (Builders, VRM, reviewers) are spawned on-demand as work progresses.
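One way to derive a deterministic team name from the repo root, sketched under assumptions (the `atelier-` prefix, the 8-hex-digit suffix, and the hashing scheme are all illustrative choices, not the plugin's actual naming rule):

```python
import hashlib
from pathlib import Path

def team_name_for(repo_root: str) -> str:
    """Stable name: every cold start of the same repo reuses (and can
    clean up) the same team."""
    normalized = str(Path(repo_root).resolve())
    digest = hashlib.sha256(normalized.encode()).hexdigest()[:8]
    return f"atelier-{Path(normalized).name}-{digest}"
```

Hashing the resolved absolute path means two checkouts of the same project in different directories get distinct teams, while repeated runs in one checkout always collide on purpose.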
After the team is running:
- Run /agent-atelier:monitors spawn.
- Create a `*/2 * * * *` cron to invoke /agent-atelier:monitors check.
- Create a `*/15 * * * *` cron to invoke /agent-atelier:watchdog tick plus the resume sweep.
- Resume sweep: scan all WIs with status `implementing` whose owner is unreachable. For each, requeue with reason `cold-resume: owner session unavailable`. This reclaims stranded work from a previous crashed runtime without waiting for a watchdog tick.
- Present the startup dashboard only after the sweep completes, so the user sees recovered state.
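The resume sweep's requeue rule can be sketched as a small function. The work-item fields (`status`, `owner`, `requeue_reason`) are assumptions inferred from the description above, not a documented schema:

```python
def resume_sweep(work_items: list[dict], reachable: set[str]) -> list[dict]:
    """Requeue every implementing WI whose owner session is gone.
    `reachable` is the set of currently live teammate session ids."""
    requeued = []
    for wi in work_items:
        if wi["status"] == "implementing" and wi.get("owner") not in reachable:
            wi["status"] = "queued"
            wi["owner"] = None
            wi["requeue_reason"] = "cold-resume: owner session unavailable"
            requeued.append(wi)
    return requeued
```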
Drive work items through the phases recorded in the `mode` field of loop-state.json. Full phase details, transition rules, and the review findings schema are in reference/state-machine.md.
Phase summary:
| Phase | Actors | What Happens |
|---|---|---|
| DISCOVER | Orchestrator, PM | PM reviews behavior spec, identifies gaps |
| SPEC_DRAFT | PM, Architect | PM drafts verifiable behaviors |
| SPEC_HARDEN | PM, Architect | Mutual auditing until spec is stable |
| BUILD_PLAN | Architect | Decompose spec into work items (wi upsert) |
| IMPLEMENT | Builder(s) | Claim WIs, implement, produce candidates |
| VALIDATE | VRM | Validate candidate with evidence bundle |
| REVIEW_SYNTHESIS | QA, UX, PM | Independent review, PM synthesizes findings |
| AUTOFIX | Builder(s) | Fix review bugs, produce new candidate |
| DONE | Orchestrator | Cleanup team, report results, recommend next step |
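The happy path through the table above, plus the AUTOFIX loop back into VALIDATE, can be sketched as a transition map. This is a simplification for illustration; fast-track, escalation, and other transitions follow reference/state-machine.md:

```python
# Linear happy path; AUTOFIX loops back because a new candidate
# must be re-validated before review.
NEXT_PHASE = {
    "DISCOVER": "SPEC_DRAFT",
    "SPEC_DRAFT": "SPEC_HARDEN",
    "SPEC_HARDEN": "BUILD_PLAN",
    "BUILD_PLAN": "IMPLEMENT",
    "IMPLEMENT": "VALIDATE",
    "VALIDATE": "REVIEW_SYNTHESIS",
    "REVIEW_SYNTHESIS": "AUTOFIX",  # only when reviews found bugs
    "AUTOFIX": "VALIDATE",
}

def advance(mode: str, review_clean: bool = False) -> str:
    """Next phase for the current mode; a clean review synthesis
    skips AUTOFIX and goes straight to DONE."""
    if mode == "REVIEW_SYNTHESIS" and review_clean:
        return "DONE"
    return NEXT_PHASE.get(mode, "DONE")
```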
Key rules:
- Every work item must have `verify.length >= 1` and a non-null `complexity`.

Two concurrent monitoring mechanisms run alongside the state machine loop:
Monitor polling (every ~2 min): Invokes /agent-atelier:monitors check. Handles heartbeat warnings, gate resolutions, CI status events, and branch divergence alerts. Re-spawns dead monitors (escalates to user after 3 crashes). Silent when nothing to report.
Watchdog recovery (every ~15 min): Invokes /agent-atelier:watchdog tick for mechanical recovery (stale leases, expired candidates, budget enforcement), then runs an Orchestrator resume sweep (respawn missing teammates, dispatch Builders, requeue unreachable owners). Silent when nothing to recover.
CI monitor (on-demand): Spawned when entering VALIDATE with a CI run. Self-terminates on terminal CI state. Events picked up by monitor polling. On ci_status (success) → evaluate fast-track, then transition to IMPLEMENT or REVIEW_SYNTHESIS.
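The "re-spawn dead monitors, escalate after 3 crashes" policy from monitor polling can be sketched as a small supervisor. The class and method names are illustrative, not part of the plugin:

```python
from collections import Counter

class MonitorSupervisor:
    """Re-spawn a dead monitor until it has crashed MAX_CRASHES times,
    then escalate to the user instead of retrying."""
    MAX_CRASHES = 3

    def __init__(self):
        self.crashes = Counter()
        self.escalated = set()

    def on_monitor_dead(self, name: str) -> str:
        self.crashes[name] += 1
        if self.crashes[name] >= self.MAX_CRASHES:
            self.escalated.add(name)
            return "escalate"  # stop retrying; surface to the user
        return "respawn"
```

Counting per monitor name means one flaky monitor cannot exhaust the retry budget of the others.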
Human gates are resolved with `gate resolve`.

Exit codes:

| Code | Meaning |
|---|---|
| 0 | Loop completed -- all WIs done |
| 1 | Usage error |
| 2 | Loop interrupted -- user requested stop |
| 4 | Runtime failure |
Returns JSON to stdout on completion:
```json
{
  "completed": true,
  "work_items_done": 5,
  "work_items_total": 5,
  "human_gates_resolved": 2,
  "validation_runs": 7,
  "mode": "DONE",
  "recommended_next": "create_pr",
  "issues": []
}
```
recommended_next values: "run_validation" (missing evidence), "create_pr" (unmerged feature branch), "check_ci" (PR exists, CI unknown), null (everything clean).
issues -- array of strings describing validation gaps or warnings. Empty when clean.
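The mapping from end-of-loop repo state to `recommended_next` can be sketched as follows. The priority order (validation gaps first, then PR flow) is an assumption inferred from the value descriptions above:

```python
def recommended_next(has_evidence: bool, branch_merged: bool,
                     pr_exists: bool, ci_known: bool):
    """Pick the recommended_next value for the completion report."""
    if not has_evidence:
        return "run_validation"   # missing evidence trumps everything
    if not branch_merged and not pr_exists:
        return "create_pr"        # unmerged feature branch, no PR yet
    if pr_exists and not ci_known:
        return "check_ci"         # PR exists, CI status unknown
    return None                   # everything clean
```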
| Scenario | Recovery |
|---|---|
| Teammate crashes | Watchdog detects stale lease, requeues mechanically |
| Loop stuck | Budget checks flag the stall before it becomes a problem |
| WI fails 3x (same fingerprint) | Escalate to human review |
| User interrupts | Save state, requeue active work, stop monitors, cancel cron jobs, report status |
| Monitor crashes | Polling detects dead monitor, Orchestrator re-spawns |
| Monitor crashes 3+ times | Escalate to user instead of retrying |
| Rate limit stalls team | Next watchdog pulse re-runs recovery and resume sweep |
| Lead dies before cron exists | Cold resume via ../../references/recovery-protocol.md, then /run recreates infrastructure |
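The "fails 3x with the same fingerprint" escalation can be pictured with a minimal sketch. The fingerprinting scheme (whitespace-normalized error text hashed with the WI id) is a hypothetical choice; the source only says identical failures are fingerprinted:

```python
import hashlib

def failure_fingerprint(wi_id: str, error_text: str) -> str:
    """Collapse the same WI failing with the same normalized error
    into one identity."""
    normalized = " ".join(error_text.split()).lower()
    return hashlib.sha1(f"{wi_id}:{normalized}".encode()).hexdigest()[:12]

def record_failure(history: dict, wi_id: str, error_text: str) -> str:
    fp = failure_fingerprint(wi_id, error_text)
    history[fp] = history.get(fp, 0) + 1
    # Third identical failure: stop retrying, escalate to human review.
    return "escalate" if history[fp] >= 3 else "retry"
```

Keying on a fingerprint rather than the WI alone means a work item that fails three times for three different reasons keeps retrying, while the same crash repeating goes to a human.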
See also: ../../references/success-metrics-routing.md and ../../references/recovery-protocol.md.