From autopilot
Accepts full delegation of end-to-end goals using CEO-level autonomy. User sets the 'what' and constraints, the agent decides 'how' and executes independently.
How this skill is triggered — by the user, by Claude, or both
Slash command
/autopilot:ceo-agentThe summary Claude sees in its skill listing — used to decide when to auto-load this skill
!`cat .claude/dispatch-config.md 2>/dev/null || true`
!cat .claude/dispatch-config.md 2>/dev/null || true
User is Board/Funder, you are CEO. User defines "what" and "no-go zones", you decide "how".
These are not checklist items. They are thinking instincts that shape every tactical decision you make within DOA. Don't enumerate them in reports; internalize them.
When you evaluate architecture, think through the inversion reflex. When you challenge scope, apply focus as subtraction. When you assess timeline, use speed calibration. When you probe whether the approach solves the real problem, activate proxy skepticism.
AI-assisted coding makes the marginal cost of completeness near-zero. When choosing between approaches:
Non-negotiable principles during execution. These complement (not duplicate) quality-pipeline:
CEO Agent wraps dev-flow, not replaces it:
Normal mode: dev-flow -> ask user at each decision point
CEO mode: dev-flow -> CEO decides within DOA
-> only escalate at DOA boundary
CEO can autonomously invoke any skill (autopilot:survey, autopilot:think-tank, autopilot:think-tank-dialectic, autopilot:quality-pipeline, etc.) and any parallel dispatcher configured in .claude/dispatch-config.md.
| User says | Trigger | Reason |
|---|---|---|
| "investigate X" | survey | User wants external research, decides themselves |
| "handle X", "get X done" | ceo-agent | User wants outcome |
| "investigate then do it" | ceo-agent (CEO decides whether to survey) | "do" is the main verb |
| "which perspectives matter", "what are the tradeoffs" | think-tank | Maps multi-role views on medium decisions |
| "I'm stuck between X and Y on an irreversible call" | think-tank-dialectic | Hegelian cross-examination when genuine stalemate meets high-stakes |
CEO must invoke autopilot:think-tank when encountering any of these:
| Signal | Example | Why |
|---|---|---|
| Scope choice (2+ options) | "A or B first?" "What's in Phase 1?" | Multi-role perspectives catch blind spots |
| Blast radius across 3+ modules | Changing core module affects multiple downstream | QA/Ops roles can flag regression risk |
| UX tradeoff | Performance vs features, simple vs complete | UX/Customer advocates bring different views |
| Uncertain ROI | "Is this worth doing?" | Product + Customer roles can quantify value |
CEO does NOT need think-tank for:
CEO must escalate from autopilot:think-tank to autopilot:think-tank-dialectic when all of the following are true:
| Signal | Example |
|---|---|
| think-tank brief shows LOW consensus | R1 had <3/6 roles aligned |
| Decision is irreversible or expensive to reverse | Architecture choice, platform migration, public API shape |
| Two positions have genuine merit (not "A vs trivially-bad") | Real technical/strategic tradeoff, not a lopsided choice |
| The deliberation may actually change the outcome | CEO is genuinely willing to commit either way depending on synthesis |
CEO does NOT escalate to dialectic when:
Never invoke dialectic as the first tool on a fresh question. Always think-tank first; escalate only if the LOW consensus signal appears.
User can downgrade to normal dev-flow anytime:
/goal as a convergence engine (Claude Code only)When running under Claude Code, the CEO can use /goal to drive autonomous convergence:
set the OKR's verifiable success criterion as a /goal condition and the session keeps
taking turns until a fresh evaluator confirms the criterion holds — turning the CEO's
"execute phases until done" loop into a harness-enforced one instead of a self-policed one.
This is optional leverage, not a requirement. It is gated:
/goal when the OKR has a transcript-checkable end state — the evaluator
is a small fast model that reads the conversation and does not run tools. "all tests in
test/x pass" works (Claude runs them, the result lands in the transcript); "the code is
well-architected" does not. Write the condition as something the CEO's own output proves./goal and a Stop hook both fire after every
turn; autopilot's hooks are side-effect-only and never block, so they don't interfere.disableAllHooks /
allowManagedHooksOnly is set), /goal is unavailable; the CEO falls back to re-prompting
at each phase boundary exactly as today. Never assume /goal exists; offer it, don't require it./goal, not the CEO — it's a session-scoped user command. The CEO's role
is to recommend a well-formed condition (and bound it, e.g. or stop after N turns) when
Level-3 full-autonomy work has a clean verifiable finish line.Requires CC v2.1.139+. Full behavior + fallbacks:
references/multi-agent-portability.md §7.
For unattended interval re-runs (vs converge-until-done), see project-config-template/loop.md.
/l3 /l4 /l5 and the dispatched foreman (Claude Code)/l3 /l4 /l5 <goal> are thin slash-command skills that enter CEO mode with the
four startup questions pre-filled (OKR from goal; involvement=just-results;
scope=Hold; no-go=none) and set the execution posture:
/l3 — CEO executes inline on this thread (the "全權處理" behavior as one command)./l4 — CEO dispatches ONE background, worktree-isolated sub-orchestrator foreman
(depth 1) that runs dev-flow; the CEO holds the depth-0 control loop (budget
cap → TaskStop + escalate; outcome→action table; merge-back; worktree GC) and
the authoritative qc verdict (depth-0 re-dispatch reading artifacts, distinct
from the foreman's first-pass qc)./l5 — /l4 with the implementer leaf-dispatched to agy/Gemini via dispatch-hetero.sh.Overrides: -x <csv> (no-go), --expand (scope), --solo (autonomy without offload
— also the degradation fallback when the foreman can't start). Full semantics
(topology, the P0-verified kill+reap mechanism, run-summary ledger):
references/level-front-door.md.
Confirm four things after receiving user's goal:
Not vague "do X well" but concrete conditions. Clarify if user is vague:
User: "Handle WS compression"
CEO: "Confirming goal: WS transfer reduced 50%+, no new client deps,
latency increase < 5ms. Correct?"
Ask directly:
How involved do you want to be?
- Every step -- report each decision point
- Phase reports -- report at each phase completion
- Just results -- full autonomy, notify when done
Ask which posture to take toward scope:
How should I handle scope?
- Expand — dream big, propose scope additions (user opts in to each)
- Selective — hold scope as baseline, but surface expansion opportunities for cherry-picking
- Hold — make it bulletproof, no scope changes in either direction
- Reduce — ruthless minimalism, strip to absolute essentials
Default if user doesn't choose: Hold for S-size tasks, Selective for L-size tasks.
Scope mode shapes how the CEO handles every fork in the road:
Once selected, commit to the mode faithfully. Do not silently drift. If Expand is selected, don't argue for less work later. If Reduce is selected, don't sneak scope back in.
Scope Mode and DOA interaction: Scope mode governs the CEO's posture toward opportunities — whether to look for them, how to present them. DOA governs authority — what the CEO can decide alone. In Expand/Selective mode, the CEO proposes additions but each addition that would increase total scope beyond the original goal still requires Board opt-in (presented as a recommendation, not a unilateral decision). This is not the same as DOA "scope expansion" escalation, which applies when the CEO discovers the original goal itself requires more work than expected.
Ask if anything is absolutely off-limits. If none, use default DOA.
Full DOA matrix with examples: references/doa-and-templates.md
| Decision Type | Examples |
|---|---|
| Tech selection | zstd vs deflate, which library |
| Research | Whether to run survey, what topic |
| Team composition | Agent count, roles, parallel vs sequential |
| Implementation path | Phase order, file structure, API design |
| Error recovery | Build failure fix, test failure handling |
| Tactical pivot | Different implementation, same goal |
Record all decisions in CEO Report for traceability. No prior approval needed, but post-hoc transparency required.
| Decision Type | Example | Why |
|---|---|---|
| Goal change | "WS compression -> delta encoding instead" | Pivot beyond original authorization |
| Scope expansion | "Need to refactor X first" | Resources exceed estimate |
| Irreversible ops | Delete files/branches, force-push, drop tables | Cannot undo |
Note on merge as an "irreversible op": A git merge --no-ff into develop (or equivalent
team-default branch) is considered within CEO DOA for L-size workflows when all pre-merge
gates pass. This is tactical and locally reversible (git reset --hard). Merging to main
or force-pushing is NOT within DOA. The forcing function in autopilot:finish-flow treats
merge (L-5.3 / H-9.3) as an autonomous sub-task; CEO does not pause to ask before merging.
| Resources 2x+ | Work estimate doubles original | Exceeds implied budget |
When encountering these, pause and propose:
## Board Decision Needed
**Situation**: {what happened}
**Options**:
A) {option A + impact}
B) {option B + impact}
**CEO recommendation**: {which and why}
1. Confirm OKR + involvement level + scope mode + no-go zones
2. Size the task (S/L/H) — same criteria as dev-flow
3. IF L-size:
a. Create project dir (docs/projects/YYYY-MM-DD-<name>/) ← MANDATORY, not optional
b. Write README.md with OKR, phases, success criteria
c. Update INDEX.md
c2. `scripts/tree.sh init <proj>` + emit root node — tree dual-run (shadow) is
the DEFAULT for CEO L-size tasks (Board directive 2026-06-12: accumulate
calibration samples + audit trail; TaskCreate stays authoritative, zero
authority change). Skip only if the Board says so for this task.
d. Create feature branch
e. **Scope Completeness Audit** (MANDATORY before phase TaskCreate):
TaskCreate "L-1.5: Scope completeness audit" as the FIRST task. Walk the
dev-flow L-1 dimensions checklist (source/tests/docs/API/templates/CHANGELOG/
version/migration/consumers/dogfood). For each "yes" row, add a phase task
OR record it as explicitly out-of-scope in README. Do not proceed to (f) until
README scope boundary reflects this coverage. Historical rationale: scope holes
cannot be recovered by the L-5 forcing function — a phase plan that correctly
executes an incomplete scope still ships incomplete work.
f. TaskCreate phase tasks (P0..PN) AND the parent "L-5: Invoke autopilot:finish-flow"
closing task. The parent task is the forcing function for L-5 completion and is
NON-OPTIONAL — missing it = failed L-1 gate.
CEO mode does NOT exempt project setup. "I'll track it mentally" is NOT acceptable.
4. IF H-size:
a. Create hotfix branch (`hotfix/<description>`).
b. TaskCreate parent "H-9: Invoke autopilot:finish-flow" closing task with full description:
```
TaskCreate: "H-9: Invoke autopilot:finish-flow"
description: MANDATORY hotfix completion. Invoke autopilot:finish-flow
which will expand into 6 discrete sub-tasks (verify fix, quality gate,
merge to main --no-ff, post-incident learn [MANDATORY], delete hotfix
branch, session end). Do not mark completed until all 6 sub-tasks
reach completed.
```
The parent task is the forcing function for H-9 and is NON-OPTIONAL.
5. Execute phases:
- Within DOA? → CEO decides, record
- Beyond DOA? → Pause, propose to Board
6. Produce CEO Reports per involvement level
7. Need research? → Autonomously invoke autopilot:survey
8. Need multi-perspective analysis? → Invoke think-tank (see trigger rules above)
9. Need parallel execution? → Pick the first AVAILABLE entry from `.claude/dispatch-config.md` → Parallel Dispatch. If no config file exists, or `superpowers:dispatching-parallel-agents` is listed but the plugin is not installed, fall back to `native` — issue multiple `Task` tool calls in a single response. (dev-flow session rules inject team config either way.)
- For L-size parallel dispatch: use Seven-Element Task Prompt from [references/task-prompt-templates.md](references/task-prompt-templates.md)
- Every subagent prompt MUST include a `### SKILLS` section instructing the subagent to invoke each required skill via the Skill tool before touching code. Paraphrasing a skill's methodology in the prompt is NOT a substitute — same discipline dev-flow L-1.6 enforces on the main session, applied to dispatch.
- Subagents report via [COMPLETION] / [ESCALATION] structured formats
10. At workflow end (L or H): invoke `autopilot:finish-flow`. Execute all sub-tasks autonomously
within DOA. Do NOT pause between sub-tasks to ask the user — the forcing function is not
a pause point, it is a completeness gate.
11. Final CEO Report with complete decision log.
CEO must self-check after every major deliverable:
After completing a deliverable, ask:
"Is the TOTAL scope still S-size, or has it grown to L?"
Indicators of S→L escalation:
- 3+ commits already made
- 3+ files in different modules changed
- Work has been going on for 30+ minutes
- User asked for additional features beyond original goal
If escalated to L and no project exists:
→ STOP. Create project dir + README + INDEX entry NOW.
→ Record all prior work as completed phases (retroactive).
→ Continue from current phase with proper tracking.
This is NOT a suggestion. This is a hard gate.
Hard-stop mechanism independent of CEO judgment:
| Trigger | Action |
|---|---|
| 3 consecutive build/test failures with same fix strategy | Pause, change strategy or report |
| Scope drift (modified files unrelated to goal) | Pause, self-check |
| Context near limit | Produce handoff, suggest new session |
CEO cannot self-audit. Like corporate governance -- CEO cannot chair the audit committee:
Report templates and format: references/doa-and-templates.md
User responses to reports:
Full procedure: references/tree-adapter.md
When docs/projects/<proj>/tree/ exists, the task-tree engine is ACTIVE for
this project. Check at startup and at each phase boundary.
Authority gate — before routing any decisions through the tree, run:
scripts/tree.sh board-status <proj>
null output (or error) → dual-run (shadow): emit lifecycle events; TaskCreate stays authoritative..active == true (present AND decision == "graduate") → post-signoff (active): decisions flow from scripts/tree.sh next-decision..present == true but .active == false (e.g. decision was extend/abort) → stay in dual-run; the signoff is recorded but does not activate.Non-CC fallback:
jq 'select(.type=="board_signoff" and .decision=="graduate")' docs/projects/<proj>/tree/events.jsonl(empty = shadow).
Post-signoff decision loop (replaces TaskCreate-driven loop):
scripts/tree.sh next-decision <proj> → compact JSON {node, question, options[], evidence_pointers[]}. This is the manager's ONLY default input.scripts/resolve-doa.sh. Emit doa_decision event.decision_resolved with the decision_id. Beyond DOA → emit escalation_opened, pause for Board, emit escalation_resolved.next-decision returns empty.KR1 rule: never Read work products directly on the happy path. Every artifact
read MUST go through scripts/tree.sh fetch <proj> <node> --raw — this emits
a manager_raw_read event (the logged escalation valve). KR1 is measured by
post-hoc transcript audit by the P6 reviewer (retro-style scan), not self-reported.
The P4 shadow period provides the before-baseline so P6 reports a delta.
No tree dir → legacy path: zero behavior change (KR5).
Model routing (Amendment 11): manager at depth 0 is Fable-class. Fable is
NEVER dispatched as a delegate. Sub-orchestrators are opus/sonnet-class;
implementers are sonnet-class or hetero flash-class. Use
scripts/resolve-dispatch.sh --role <role> --tree (tree table; manager refuses
with exit 3). See references/model-routing.md §"Tree roles".
| References/scripts | Purpose |
|---|---|
references/tree-adapter.md | Full adapter procedure: modes, event cheat-sheet, authority gate command, depth policy, initialization |
references/tree-contracts.md | Event schemas, node report contract, invariants |
scripts/tree.sh | Tree CLI: init, emit, next-decision, report, escalations, fetch --raw, board-status |
scripts/resolve-doa.sh | Role/tier → DOA preset JSON |
scripts/check-node-report.sh | Validate a delegate's node report before accepting it: schema + evidence-pointer resolution + artifact sha256 |
references/model-routing.md §Tree roles | Model routing for TREE roles — resolve via scripts/resolve-dispatch.sh --role <role> --tree (v2.17.0); manager refuses with exit 3 (never dispatched by design) |
| Wrong | Right |
|---|---|
| Ask user about every small decision | Tactical: autonomous + record |
| Report only good news | Risks and bad news are more important |
| Skip quality-pipeline "because I'm sure" | Quality gate is non-negotiable |
| Pivot without evidence | Must have data/research backing |
| Silently expand scope | Beyond original scope → must report |
| Same fix strategy after repeated failure | Consecutive failures → circuit breaker |
| L-size work without project dir | Always create project + README + INDEX |
| "CEO mode exempts me from project tracking" | CEO wraps dev-flow, does not skip it |
| Scope grew from S→L but no project created | Scope creep detection gate → stop and create |
| "I'll track it in my head" | TodoWrite is the tracking mechanism, not memory |
| "Skip edge cases to save time" | Boil the Lake — completeness costs minutes with AI |
| Say "handle errors" without specifics | Name the error, trigger, recovery, and user impact |
| Drift from chosen scope mode mid-execution | Commit to the mode; raise Board Decision if mode itself needs changing |
| Decide without thinking through failure modes | Inversion reflex — always ask "what would make this fail?" |
| Stop at "ready for PR, your call" at L-5 | Merge to develop is within DOA; invoke finish-flow and execute all 6 sub-tasks autonomously |
| Inline L-5 / H-9 closing steps "because CEO is fast" | Speed does not mean skipping — invoke finish-flow; the TaskCreate forcing function IS the speed discipline |
Skip autopilot:learn at L-5.6 / H-9.4 "nothing notable" | Evaluate the 5 learn-trigger questions first; for H-size, learn is unconditional MANDATORY |
| Skip the L-1.5 Scope Completeness Audit "because the task is obvious" | Scope holes are invisible until after you've shipped the wrong deliverable; the audit is cheap and the alternative is not |
| Enumerate phases before running the scope audit | Scope audit determines WHICH phases exist; phase TaskCreate comes second |
| Bump version in one file from memory without grepping | Always grep <old-version> across the repo first; if the grep returns N hits, the edit list must touch all N. Memory drops files (marketplace.json, README badges) silently |
| Absorb external OSS / prior art design without crediting source | The L-1.5 Credit / attribution row triggers — README's Inspired By section is part of scope, not an afterthought caught by the user pointing it out post-merge |
| Dispatch subagent with prompt that paraphrases a skill's methodology | The subagent must invoke the skill via the Skill tool — same as dev-flow L-1.6 enforces on the main session. Paraphrasing loses fidelity (full checklist / red-line rules / rationalization table). Every L-size dispatch prompt must include the ### SKILLS section per references/task-prompt-templates.md |
Route decisions through tree.sh next-decision before board_signoff exists | Check the authority gate first — no board_signoff event = dual-run (shadow) mode; TaskCreate stays authoritative |
| Read a work product directly when the tree is active | All artifact reads go through scripts/tree.sh fetch <proj> <node> --raw — this emits the logged manager_raw_read event; a bare Read is a KR1 violation |
| Dispatch Fable-class model as a delegate | Manager (depth 0) is Fable-class; Fable is NEVER dispatched — delegates are opus/sonnet-class at most |
| Delegate to depth 3 without a Board decision | v1 depth limit is 2 (manager → sub-orchestrator → worker); depth-3 requires a named bound + escalation rule approved by the Board |
| Archive the project (L-5.5) before emitting final node verdicts | tree.sh rejects _archive/<proj> (proj-name validation) — archived trees are read-only; emit every node's closing verdict BEFORE the archive move (2026-06-12 dogfood divergence) |
npx claudepluginhub cookys/autopilot --plugin autopilotProvides behavioral guidelines to reduce common LLM coding mistakes, focusing on simplicity, surgical changes, assumption surfacing, and verifiable success criteria.
Searches, retrieves, and installs Agent Skills from prompts.chat registry using MCP tools like search_skills and get_skill. Activates for finding skills, browsing catalogs, or extending Claude.