Govern autonomous development execution through the Explore-Plan-Code-Verify loop, task-driven progress tracking, three-tier action classification, decision journaling, and bounded verification. Use when executing any development workflow autonomously or orchestrating multi-step implementation tasks.
From flow. Install: `npx claudepluginhub synaptiai/synapti-marketplace --plugin flow`
Foundation skill governing how Claude executes development workflows autonomously.
NO SKIPPING PHASES. Explore before Plan, Plan before Code, Code before Verify. Every phase produces an artifact.
Jumping to code without exploration is the #1 cause of rework. Jumping to "done" without verification is the #1 cause of bugs reaching review.
Every multi-step workflow follows this loop:
When a language server is available, use LSP tools during EXPLORE and CODE: goToDefinition to trace code paths from issue keywords to implementation, findReferences to assess the impact of planned changes (enhancing text-based grep searches with semantic understanding), and hover to understand types and signatures of existing code before modifying it. When LSP diagnostics are enabled (lsp.diagnosticsAsQuality), collect them as an additional quality signal: errors are P1, warnings are P2. LSP diagnostics complement, never replace, CLI-based checks.
b. Runtime: Build the project, start it, verify at runtime. If anything fails, enter the debug-fix-retest loop (bounded by closedLoop.maxDebugIterations).
c. Review: Self-review with fix-forward — fix P1/P2 findings immediately, don't just report them.
d. Verdict: Independent judgment — dispatch the verdict-judge agent (when verdict.enabled) with acceptance criteria plus an evidence bundle. The judge has no access to code-writing rationale, the diff, or the decision journal; it evaluates outcomes, not process. Each criterion receives PASS, FAIL, or NEEDS-HUMAN-REVIEW. FAIL verdicts trigger fix loops; NEEDS-HUMAN-REVIEW escalates to the user.

Use Task tools as first-class workflow primitives:
| Tool | When |
|---|---|
| TaskCreate | Start of PLAN phase — one task per deliverable |
| TaskUpdate(in_progress) | Before starting work on a task |
| TaskUpdate(completed) | After task passes verification |
| TaskList | At checkpoints to confirm progress |
| TaskGet | Before working on a task to get full context |
Tasks have clear subjects (imperative form) and descriptions with acceptance criteria.
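As a sketch, a task payload that satisfies these conventions might look like the following (the field names are an assumption about TaskCreate's schema, not documented parameters):

```json
{
  "subject": "Add retry logic to the sync client",
  "description": "Acceptance criteria:\n- Transient 5xx responses are retried up to 3 times with exponential backoff\n- A new test covering the retry path passes"
}
```

The subject is imperative and scoped to one deliverable; the description states criteria a verifier can check.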
| Tier | Actions | Behavior |
|---|---|---|
| Tier 1 (Autonomous) | Commits, branch creation, file edits, staging | Execute without asking. Local and reversible. |
| Tier 2 (Journal) | Push, PR creation, issue assignment | Execute and log to decision journal. Team-visible but recoverable. |
| Tier 3 (Confirm) | Merge, release, force operations | Always require human confirmation. Non-negotiable. |
Tier configuration is in settings.json under tiers. Actions can be promoted (journal→confirm) but never demoted (confirm→journal).
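A plausible shape for the tier configuration (the tiers key is named in this skill, but the per-tier action lists below are illustrative assumptions, not the shipped defaults):

```json
{
  "tiers": {
    "journal": ["push", "pr-create", "issue-assign"],
    "confirm": ["merge", "release", "force-push"]
  }
}
```

Promoting an action means moving it from the journal list into the confirm list; the reverse move is disallowed.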
When a command or skill says "use the AskUserQuestion tool", you MUST invoke the AskUserQuestion tool — do not substitute plain text output. The tool provides structured selectable options that plain text cannot replicate. Supply contextual options appropriate to the situation.
Tier 2 actions are logged to {journal-dir}/issue-{N}.md, created at branch creation. The journal dir defaults to .decisions/ and is configurable in settings.
Dispatch independent operations in a single message:
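For example, two read-only lookups with no dependency between them can go out together rather than sequentially (a schematic illustration: the payload shape is not literal tool-call syntax, and the pattern and paths are hypothetical):

```json
[
  { "tool": "Grep", "pattern": "createSession", "path": "src/" },
  { "tool": "Read", "file_path": "src/session/manager.ts" }
]
```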
Quality check loops are bounded by qualityCheckMaxIterations in settings.json (default 3); after the limit, escalate to the user.

| Trigger | Action |
|---|---|
| 3+ verification failures on the same issue | Stop fixing forward. Return to EXPLORE — the problem is architectural. |
| Plan has >10 tasks for a single issue | Decompose the issue first. One PR should not span 10 tasks. |
| EXPLORE phase yields contradictory signals | Stop. Ask the user for clarification before planning. |
| >5 files modified without staging or committing | Stop. What you have should be committable. If not, the tasks are too large. |
The debug-fix-retest loop is mandatory — do NOT report failures and move on; DO fix them yourself.
Minimum verification by project type:
The loop is bounded by closedLoop.maxDebugIterations (default 5). After max iterations, escalate to user — never silently skip.
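The bounds above live in settings.json. A minimal sketch of the keys this skill references, assuming this nesting; verify the names against your actual settings file:

```json
{
  "closedLoop": { "maxDebugIterations": 5 },
  "qualityCheckMaxIterations": 3,
  "verdict": { "enabled": true },
  "lsp": { "diagnosticsAsQuality": true }
}
```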
| Missing | Fallback |
|---|---|
| No agent teams | Single-session sequential |
| No quality commands | Attempt to discover them first (Skill(capability-discovery)), then proceed with runtime verification only |
| No LSP server | Fall back to grep-based references and CLI-only diagnostics. No error — LSP is additive. |
| No gh CLI | Warn, continue with git-only |
| No decision journal | Proceed without logging, note in PR |
| Excuse | Response |
|---|---|
| "I already know what to do, skip EXPLORE" | Then exploring should take 10 seconds. Do it. |
| "The plan is obvious, no need to TaskCreate" | Untracked work is invisible work. Create the tasks. |
| "Just one more fix, then I'll verify" | Verify now. The loop exists because one-more-fix never ends. |
| "This is too simple for the full loop" | Simple tasks, same phases. Just faster. |
| "Runtime verification isn't possible" | It is, for any project that does something. Build it, run it, check it. |
| "Tests pass, so it works" | Tests verify what's tested. Runtime verifies what's real. |
| "I can't start the server" | Fix why. Server startup failure IS a bug. |
| "Self-review is enough" | Self-review checks code quality. The verdict checks requirements. Both are needed. |
| "I wrote the tests, so the criteria are met" | Tests prove the code does what you thought was wanted. The verdict proves it does what was actually wanted. |