From autoworker
Auto-loop execution workflow with quality gates. Use when starting any non-trivial implementation task. Provides automatic task decomposition, code implementation, testing (L1-L4), and iterative quality gates until completion. Invoke with /autoworker.
npx claudepluginhub phj128/autoworker --plugin autoworker

This skill is limited to using the following tools:
> **Two paths**: Plan Mode (discussion) → /clear → Execution (implementation).
Executes tasks from TASK_N.md files or free-form descriptions, auto-generating missing scope, success criteria, and verification plans via /generate-tasks before implementation.
Executes implementation plans from cm-planning using modes like batch with checkpoints, subagent-per-task, or parallel dispatch. Audits technologies, installs missing skills, and ensures tests pass.
Orchestrates multi-phase projects by dispatching isolated phase-agents for planning, execution, verification, integration, and review. Tracks state and chains artifacts; use after approving specs with 2+ phases.
Two paths: Plan Mode (discussion) → /clear → Execution (implementation). /clear is the boundary. Different paths, completely different behaviors.
The hard standard for "tested": Actually execute commands and observe output. The following do NOT count as tested:
- grep confirming content exists — only confirms it was written, not that it works
- bash -n syntax check — only confirms no syntax errors, not that logic is correct
- Read viewing file content — only confirms the text is correct, not that execution results are correct

Counter-intuitive principle: The smaller the change, the easier it is to skip verification — but the verification cost is equally low, so there's no reason to skip it.
Special note: Instruction file refactoring (SKILL.md / prompt template structural changes) is not a "documentation task" — it is the highest-risk change type (silent failure, affects all subsequent sessions) and must be fully verified.
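The gap between an L1-style check and real execution can be seen in a minimal sketch (the demo file and its contents are hypothetical, not part of any real project):

```python
# Sketch: a file that passes the L1-style compile check but fails at runtime.
import os, py_compile, subprocess, sys, tempfile

path = os.path.join(tempfile.mkdtemp(), "demo.py")
with open(path, "w") as f:
    f.write("print(undefined_name)\n")   # syntactically valid, broken at runtime

py_compile.compile(path, doraise=True)   # "syntax check passed" -- proves nothing more
result = subprocess.run([sys.executable, path], capture_output=True, text=True)
print("runtime exit code:", result.returncode)   # non-zero: the NameError only shows on execution
```

Only the second command counts as "tested" under the standard above.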
Trigger: You receive a non-trivial task (not typo fixes or single-line additions) → enter Plan Mode.
After entering Plan Mode, immediately invoke autoworker:deep-plan.
autoworker:deep-plan ensures discussion depth through 5 structured phases:
| Phase | What | Depth Gate |
|---|---|---|
| 1. Motivation Exploration | Continuously ask why, challenge "is this really needed" | Motivation expressible in 1-3 clear sentences |
| 2. Assumption Challenge | List implicit assumptions, challenge each one | Each assumption has verification method or flagged risk |
| 3. Solution Derivation | Derive solution from motivation, compare alternatives, 4-question review | User makes explicit choice with reasoning |
| 4. Acceptance Criteria | Discuss quantitative/behavioral metrics separately | Each metric can become an L4 test case |
| 5. Plan Output | Consolidate into plan file (fixed format for subtask-init extraction) | 95% confidence self-check passes |
Forbidden: Skipping deep-plan and jumping straight to a solution. Plan depth determines the quality ceiling of the execution chain.
→ After autoworker:deep-plan completes, call ExitPlanMode. Then /clear to enter execution session.
Trigger: After /clear or new session, you see the plan produced by Plan Mode (injected via system context).
Key insight: This plan is the product of thorough discussion with the user in the previous session. Goals, scope, success criteria, and verification methods have already been confirmed. No need to ask confirmation questions. The first action is to invoke autoworker:subtask-init to create subtask.md from the plan and start the execution chain. Do not investigate before creating the subtask — investigation is part of assumption verification inside autoworker:subtask-init, not a prerequisite.

Verified failure mode: Claude sees the plan → wants to "investigate the current state first" → finishes investigating and starts coding directly → subtask.md is never created → no execution chain constraints → no gate-check quality gate.
Core idea: The execution chain is not linear — it's a self-iteration loop. Write code → test → check → find gaps → update plan → write more code → test again... until quality meets the bar, then deliver to user. Each step is enforced by skill chaining, leaving no room to skip steps.
Execution chain pseudo-code:
```
autoworker:subtask-init
    Pause old active subtask → write acceptance criteria + status: active
    → autoworker:subtask-plan

autoworker:subtask-plan
    Multi-subtask positioning (active) → silent failure analysis → acceptance coverage check
    → autoworker:dispatch

while autoworker:dispatch (multi-subtask positioning):  # re-read active subtask each time
    match state:
        has incomplete Phase → autoworker:code → autoworker:checkpoint → continue
        has untested layer   → autoworker:test → autoworker:checkpoint → continue
        all tests complete   → autoworker:gate-check (acceptance traceability → PASS/FAIL) → continue
        Gate = FAIL          → autoworker:subtask-update → continue
        Gate = PASS          → status: completed → output completion report → break
```
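The loop above can be sketched as plain Python. Skill names are mirrored for readability; this is an illustrative model, not the actual skill implementation:

```python
# Illustrative model of the dispatch loop; gate_check is a stub standing in
# for the real acceptance-traceability + self-assessment skill.
def gate_check(subtask):
    return "PASS"  # stub: the real skill may also return "FAIL"

def dispatch(subtask):
    while True:  # re-read active subtask state each iteration
        phase = next((p for p, done in subtask["phases"].items() if not done), None)
        if phase:                                  # incomplete Phase -> code + checkpoint
            subtask["phases"][phase] = True
            continue
        layer = next((l for l, run in subtask["tests"].items() if not run), None)
        if layer:                                  # untested layer -> test + checkpoint
            subtask["tests"][layer] = True
            continue
        if gate_check(subtask) == "PASS":          # gate PASS -> completed, break
            subtask["status"] = "completed"
            return subtask
        # gate FAIL would route to subtask-update, then continue the loop

done = dispatch({"phases": {"Phase 1": False},
                 "tests": {"L1": False, "L4": False},
                 "status": "active"})
print(done["status"])  # completed
```

Note the single routing point: every branch returns control to the loop head, so no step can be skipped.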
| Skill | Responsibility | Chains To |
|---|---|---|
autoworker:subtask-init | Persist goals + acceptance criteria + assumptions, pause old active, run assumption verification | → autoworker:subtask-plan |
autoworker:subtask-plan | Silent failure analysis + traceability table + L1-L4 verification plan + coverage check + solution self-check | → autoworker:dispatch |
autoworker:dispatch | Multi-subtask positioning (active), read checkbox state, route (sole routing point) | → dynamic |
autoworker:code | Implement one Phase of code | → autoworker:checkpoint |
autoworker:test | Execute one test layer | → autoworker:checkpoint |
autoworker:checkpoint | Record keeping (check off Phase / write test results) | → autoworker:dispatch |
autoworker:gate-check | Acceptance criteria traceability + confidence self-assessment + supplementary verification + self-check. Sets completed on PASS | → autoworker:dispatch |
autoworker:subtask-update | Add/correct subtask items | → autoworker:dispatch |
Hard rules:
- Record keeping only via autoworker:checkpoint (invoke the skill, do not manually edit the subtask).
- Without an autoworker:gate-check PASS, you cannot claim "done".

| Layer | What It Verifies | Example |
|---|---|---|
| L1 Build | Compilation/type check passes | pnpm build, bash -n *.sh, python -m py_compile |
| L2 Unit | Individual function/module logic is correct | Specific function call + expected output |
| L3 Chain | Multi-module collaboration, correct data flow | Feed downstream with actual upstream output, no hand-written simplified data |
| L4 End-to-End | Complete user path, from input to final effect | Simulate actual user operation path, no skipping steps |
L4 is mandatory. L2/L3 can be skipped, but you must state the justification.
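As an L3 sketch: the downstream module consumes the actual upstream output, never hand-written simplified data (parse and render here are hypothetical stand-ins for two real modules):

```python
# L3 chain-test sketch: feed downstream with actual upstream output.
def parse(raw: str) -> list[list[str]]:
    return [line.split(",") for line in raw.splitlines()]

def render(rows: list[list[str]]) -> str:
    return "\n".join(" | ".join(r) for r in rows)

rows = parse("a,b\nc,d")              # actual upstream output, not simplified data
assert render(rows) == "a | b\nc | d" # downstream consumes it directly
print("L3 chain test passed")
```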
Trigger: User says "change to X", "also need to consider Y", "wrong direction", etc.
Do NOT immediately Edit/Write. First confirm:
High-frequency failure mode: User says "change to Y" during execution → immediately Edit → misunderstanding → rework. Root cause: Treating user feedback as explicit instruction rather than the start of a discussion.
Only change what the user asked for. No opportunistic refactoring, adding comments, renaming variables, or reformatting. Every line in the diff must map to a user requirement.
No empty try/catch: every catch must have explicit recovery logic and specific error types. For Python: no bare except Exception:.
No masking missing values with defaults: Do not use dict.get(key, None), getattr(obj, attr, None) to avoid errors. Only two legal patterns:
- dict[key] / obj.attr (should raise if missing)
- if key in dict / hasattr check before access (clear intent)

Same approach fails twice consecutively → stop retrying, enter diagnostic mode.
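A minimal sketch of the two legal patterns, with a hypothetical require helper for the membership-check form:

```python
# require() is a hypothetical helper illustrating pattern 2; not part of autoworker.
def require(cfg: dict, key: str):
    # Pattern 2: explicit membership check before access (clear intent)
    if key not in cfg:
        raise KeyError(f"missing required key {key!r}")
    return cfg[key]

config = {"host": "localhost"}
host = config["host"]                 # Pattern 1: raises KeyError if missing
# Forbidden: config.get("port", None) would silently mask the missing key
try:
    require(config, "port")
except KeyError as e:
    print("failed loudly:", e)        # the error surfaces instead of propagating None
```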
"Same approach" test: If modifications are based on the same unverified assumption, it's the same approach even if the code differs.
Diagnostic three steps:
Counter-intuitive behavior confirmed by diagnosis → write to findings.md.
After modifying SKILL.md or prompt templates, you must re-run affected workflows. Instruction files don't throw errors — only actual execution can verify the effect.
Approach: Delete old output → re-run affected workflow → grep to confirm changes took effect.
L4 for instruction files: Simulate a fresh session walking the actual user path. Reading file content and thinking "looks right" ≠ verified.
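Under the assumption of a toy template-rendering workflow (the paths and the render step are stand-ins for your actual workflow), the delete → re-run → confirm loop looks like:

```python
# Hypothetical re-verification of an instruction-file change.
import os, re, tempfile

d = tempfile.mkdtemp()
template = os.path.join(d, "template.txt")   # stand-in "instruction file"
out = os.path.join(d, "out.txt")

with open(template, "w") as f:
    f.write("say: hello NAME\n")             # the edited instruction file

if os.path.exists(out):                      # 1. delete old output
    os.remove(out)
with open(template) as f, open(out, "w") as g:
    g.write(f.read().replace("NAME", "world"))  # 2. re-run the affected workflow
with open(out) as f:                         # 3. grep-style confirm the change took effect
    assert re.search(r"hello world", f.read())
print("change took effect")
```

Reading template.txt and judging "looks right" would have skipped steps 2 and 3 entirely.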
/clear wipes all conversation context. Disk files (subtask, progress, plan) survive, but discussion conclusions that exist only in context are permanently lost.
Claude feels context is too long → suggests /clear → user does it → new session remembers nothing → previous plan discussion completely wasted. Root cause: discussion conclusions were only in context, never written to files.
When starting a new session after /clear:
Navigation rule: When exploring directories, check CLAUDE.md index first, then Glob/Grep. CLAUDE.md is a semantic index with higher information density than a file listing.
Layered CLAUDE.md: Project-level has structure tree + module index; subdirectory-level has file list + purpose.
File tracking:
| File | Purpose |
|---|---|
subtask_*.md | Per-task work document (created by autoworker:subtask-init from plan) |
task_plan.md | Project-level plan (big picture) |
progress.md | Project-level progress tracking |
findings.md | Discoveries and counter-intuitive behaviors |
Archive structure (claude_docs/):
| Subdirectory | Content |
|---|---|
subtask/ | Archived subtasks that passed gate-check |
debug_log/ | Debug archives |
reference/ | Research notes, architecture analysis, non-subtask documents |
Preventing stuck agents: Prompt must contain three elements (all required):
Progressive search: Glob/Grep → Explore quick → medium → very thorough. Do not start with very thorough.
Cross-directory: Explicitly state project root path and working directory path in the prompt.
Long output analysis: Short (< 50 lines) — read directly; long logs — use a sub-agent to analyze and return conclusions.
Consult these files for detailed examples and methodology when needed:
| File | When to Read |
|---|---|
references/verification_system.md | When designing verification plans or assumption checks |
references/debug_methodology.md | When hitting 2 consecutive failures and entering diagnostic mode |
references/file_conventions.md | When setting up project file structure or archiving subtasks |
references/proxy_metrics.md | When designing acceptance metrics or proxy indicators |
Mandatory first step after /clear or new session (before any investigation, reading code, or writing code):
Core principle: The user's current message intent > stale file state on disk.
Mandatory flow after /clear (in order, no skipping):
1. First determine the user's current message intent (highest priority):
a. User message contains an explicit new execution task (has plan, has specific requirements, "Implement X")
→ autoworker:subtask-init (create new subtask.md, start new execution chain)
→ Old subtask files don't affect the new task (they're work documents for different tasks)
b. User has no new task (empty message, pure question, says "continue", etc.)
→ continue to step 2
2. Glob subtask_*.md
→ Exists and has in-progress task → autoworker:dispatch (resume execution chain)
→ Does not exist → normal conversation
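The routing above can be sketched as follows (illustrative only, not the real skill logic):

```python
# Illustrative routing for the post-/clear startup flow.
import glob

def startup_route(has_new_task: bool) -> str:
    if has_new_task:                    # 1a: an explicit new task always wins
        return "autoworker:subtask-init"
    if glob.glob("subtask_*.md"):       # 2: resume an in-progress execution chain
        return "autoworker:dispatch"    #    (real flow also checks the task is in progress)
    return "normal conversation"

print(startup_route(True))
```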
Forbidden: Skipping this flow to jump straight into investigation or coding.
Verified failure modes: