Orchestrates multi-session projects by implementing one feature per cycle from feature-list.json through TDD pipeline with quality gates and code review.
Execute multi-session software projects by implementing one feature per cycle. Each cycle follows a strict pipeline: Orient → Gate → Plan → TDD → Quality → ST Acceptance → Inline Check → Persist.
Announce at start: "I'm using the long-task-work skill. Let me orient myself."
Core principle: Each sub-step has its own skill. Follow the orchestration order exactly.
You MUST create a TodoWrite task for each step and complete them in order:
Load config values if applicable — activate the project environment per long-task-guide.md; if the project uses a file-based config (e.g., .env), ensure it is sourced so required env vars are set before running checks
Read task-progress.md ## Current State section — progress stats, last completed feature, next feature up
Read feature-list.json — note constraints[], assumptions[], required_configs[], feature statuses
Read long-task-guide.md — project-specific workflow guidance
Read env-guide.md (if it exists) — note service names, ports, and health check URLs; required if the target feature has service dependencies
Determine service dependencies: A feature has service dependencies if ANY of the following are true:
- required_configs[] entries include connection-string keys (key contains URL, URI, DSN, CONNECTION, HOST, or PORT — e.g., DATABASE_URL, REDIS_HOST)
- dependencies[] include a feature whose title references database setup, schema migration, or service initialization
- The design section ({design_section}) specifies external service interactions (DB queries, HTTP calls to own services, message queue operations)

Record the determination (yes/no + which services) in task-progress.md under the current feature heading. This determination drives Bootstrap Step 2 and Config Gate Step 3.
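The connection-string key rule above can be sketched as a small predicate. This is a hypothetical helper for illustration only — the determination is made by the agent, not by project code:

```python
import re

# Keys containing any of these tokens are treated as connection strings,
# per the rule above (keys are assumed to be conventional UPPER_SNAKE_CASE).
CONN_KEY = re.compile(r"URL|URI|DSN|CONNECTION|HOST|PORT")

def service_dependency_keys(required_configs):
    """Return the config keys that look like connection strings."""
    return [c["key"] for c in required_configs if CONN_KEY.search(c["key"])]

print(service_dependency_keys(
    [{"key": "DATABASE_URL"}, {"key": "LOG_LEVEL"}, {"key": "REDIS_HOST"}]
))  # → ['DATABASE_URL', 'REDIS_HOST']
```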
Read design doc Section 1 (docs/plans/*-design.md) — project overview and architecture snapshot for global context
Read design doc §13 (Codebase Conventions & Constraints, if it exists) — note second-/third-party library constraints (§13.1), prohibited APIs (§13.2), static analysis tools (§13.4), naming conventions (§13.5), error handling pattern (§13.6), commit conventions (§13.8). These are binding for all new code.
Run git log --oneline -10 — recent commit context
Pick next "status": "failing" feature by priority, then by array position in features[] (first eligible wins) — skip features with "deprecated": true
Dependency satisfaction check: After selecting a candidate feature, verify that ALL feature IDs in its dependencies[] have "status": "passing" in feature-list.json. If any dependency is still "failing":
- Fall back to the next "failing" feature (by priority + dependency order) whose dependencies are all satisfied
- If none exists, AskUserQuestion: "All remaining features have unsatisfied dependencies. Circular or over-constrained dependency graph detected." → let the user choose which feature to force-start (override dependency check)
- Record the decision in task-progress.md
- If the target feature has "ui": true and a UCD document exists (docs/plans/*-ucd.md), read the UCD style guide — reference style tokens, component prompts, and page prompts to ensure the frontend implementation matches the approved visual style
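The selection and dependency rules above can be sketched as follows. This assumes a lower "priority" number means higher priority (the document does not state the direction) and mirrors the feature-list.json shape only loosely:

```python
# Sketch: pick the first eligible "failing" feature — priority first, then
# array position — skipping deprecated features and any whose dependencies
# are not all "passing".
def pick_next_feature(features):
    passing = {f["id"] for f in features if f["status"] == "passing"}
    candidates = [f for f in features
                  if f["status"] == "failing" and not f.get("deprecated", False)]
    candidates.sort(key=lambda f: (f["priority"], features.index(f)))
    for f in candidates:
        if all(dep in passing for dep in f.get("dependencies", [])):
            return f["id"]
    return None  # dependency graph blocked → AskUserQuestion fallback

features = [
    {"id": 1, "status": "passing", "priority": 1},
    {"id": 2, "status": "failing", "priority": 2, "dependencies": [3]},
    {"id": 3, "status": "failing", "priority": 1, "dependencies": [1]},
]
print(pick_next_feature(features))  # → 3
```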
Document Lookup Protocol (used by Steps 5, 10, and 11):
When you need the design section or SRS requirement for a feature, do NOT grep for the feature title. Instead:
Design document (docs/plans/*-design.md):
- Locate the Section 4 subsection headings (of the form ### 4.N Feature:)
- Identify which ### 4.N subsection corresponds to the target feature by matching the feature title or FR-ID
- Read from ### 4.N through the line before ### 4.(N+1) (or the end of Section 4) — this includes Overview, Class Diagram, Sequence Diagram, Flow Diagram, and Design Decisions
- Store as {design_section} for use in Plan (Step 5) and ST Acceptance (Step 9)

SRS document (docs/plans/*-srs.md):

- Locate the ### FR-xxx subsection matching the target feature
- Store as {srs_section} for use in Plan

UCD document (docs/plans/*-ucd.md, only for "ui": true features):
Why this matters: Grep returns isolated matching lines without surrounding context. Design sections contain class diagrams, sequence diagrams, flow diagrams, and design rationale that span dozens of lines — all of which are needed for correct implementation and inline compliance checking.
- If init.sh / init.ps1 exists and the environment is not ready: run it once; record in task-progress.md if the script was executed
- Read long-task-guide.md and verify the test/coverage/mutation commands are correct for the tech stack; use these directly throughout the cycle (no wrapper scripts)
- Open env-guide.md → locate "Verify Services Running" health checks
- If healthy, record in task-progress.md; proceed
- Otherwise, run env-guide.md "Start All Services" with output capture:
[start command] > /tmp/svc-<slug>-start.log 2>&1 &
sleep 3
head -30 /tmp/svc-<slug>-start.log
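After starting services, the health checks from env-guide.md "Verify Services Running" can be polled rather than checked once. A hedged sketch — the actual check command comes from env-guide.md and is passed in here as a placeholder argument:

```shell
# Retry a health-check command until it succeeds or attempts run out.
wait_healthy() {
  check_cmd="$1"; attempts="${2:-10}"
  for i in $(seq 1 "$attempts"); do
    if eval "$check_cmd" > /dev/null 2>&1; then
      echo "healthy after $i attempt(s)"; return 0
    fi
    sleep 1
  done
  echo "still unhealthy after $attempts attempts" >&2; return 1
}

wait_healthy "true" 3
```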
- If startup fails, consult env-guide.md; escalate via AskUserQuestion if unresolvable
- Record the outcome in task-progress.md
- If the feature has no service dependencies: skip the service checks (commands come from long-task-guide.md → run the test command directly)

python scripts/check_configs.py feature-list.json --feature <id>
<id> = the feature ID selected in Step 1. The generated check_configs.py loads config values using the project's native format automatically.
If configs are missing — prompt for text input and save to the project config:
For an env-type config, use AskUserQuestion to ask the user to type the value — do NOT provide predefined option buttons. Frame the question with the config's name, description, and check_hint so the user knows what to provide.
Example: "Please provide OPENAI_API_KEY (OpenAI API key for LLM integration). Hint: Get it from https://platform.openai.com/api-keys"

- For a file-type config, ask the user to provide the file path or create the file manually.
- Follow the Config Management section in long-task-guide.md for the exact save method (e.g., append to .env, set in application.properties, export as system env var).
- Re-run: python scripts/check_configs.py feature-list.json --feature <id>
- Ensure the config file is listed in .gitignore if not already present
- For each env-type config whose key matches a connection-string pattern (DATABASE_URL, REDIS_URL, etc.): run the corresponding health check from env-guide.md "Verify Services Running"

The Config Gate is non-negotiable for features with external dependencies. If configs are missing:
- Use AskUserQuestion to request the values from the user
- Never stub or mock required_configs[] entries with connection-string keys (URL, HOST, PORT, DSN, URI, CONNECTION, ENDPOINT) — these are later enforced by check_real_tests.py --require-for-deps

REQUIRED SUB-SKILL: Invoke long-task:long-task-feature-design and follow it exactly.
The Feature Design skill dispatches a SubAgent to produce the detailed design document. The main Agent does NOT read design/SRS/UCD document sections or write the design document — the SubAgent handles everything in its own fresh context and returns a structured summary.
For category: "bugfix" features: feature-design is condensed. The SubAgent focuses on: (1) root cause documentation, (2) targeted fix approach, (3) regression test inventory. Full diagrams are skipped unless the bug directly touches those surfaces.
Context to carry forward (paths only — SubAgent reads contents itself):
- quality_gates and tech_stack (compact JSON)
- docs/plans/*-ats.md (if it exists) — SubAgent uses the ATS mapping to align Test Inventory categories
- Output path: docs/features/YYYY-MM-DD-<feature-name>.md

Output: docs/features/YYYY-MM-DD-<feature-name>.md (written by SubAgent) — feature detailed design document containing interface contracts, algorithm pseudocode, diagrams, test inventory, and TDD task decomposition.
Contract deviation handling: If SubAgent returns BLOCKED with an issue containing "Contract deviation":
- Surface the deviation to the user via AskUserQuestion
- If the affected feature is already "passing", warn the user they may need re-verification

Ambiguity clarification handling: If the Feature Design SubAgent returns CLARIFY:
- Record in task-progress.md: "SRS gap identified during Feature Design for #{id} — user directed to long-task-increment"
- Direct the user to file increment-request.json to update the SRS before continuing with this feature

REQUIRED SUB-SKILL: Invoke long-task:long-task-tdd and follow it exactly.
Context to carry forward:
- quality_gates and tech_stack from feature-list.json
- {srs_section} from the Document Lookup Protocol — TDD Red uses this as specification input alongside the Feature Design Test Inventory; verification_steps are optional supplementary input
- {design_section} from the Document Lookup Protocol — architectural constraints and interface contracts
- Test/coverage/mutation commands from long-task-guide.md — use these directly (no wrapper scripts)

REQUIRED SUB-SKILL: Invoke long-task:long-task-quality and follow it exactly.
The Quality skill dispatches a SubAgent to execute all 4 gates (Real Test → Coverage → Mutation → Verify). The main Agent does NOT read coverage reports, mutation output, or test runner output — the SubAgent handles everything in its own fresh context and returns a structured summary.
Context to carry forward (minimal — SubAgent reads files itself):
- quality_gates thresholds (compact JSON)
- tech_stack (compact JSON)

REQUIRED SUB-SKILL: Invoke long-task:long-task-feature-st and follow it exactly.
Execute black-box acceptance testing for the feature after TDD and quality gates pass. The skill dispatches a SubAgent that reads SRS/Design/UCD/ATS documents in its own fresh context, generates ISO/IEC/IEEE 29119 compliant test case documents, executes test cases, and manages service lifecycle. The main Agent does NOT read document sections, test case content, or execution output — only the structured summary.
Context to carry forward (paths only — SubAgent reads file contents itself):
- quality_gates and tech_stack (compact JSON)
- st_case_template_path and st_case_example_path from the feature-list.json root (if set)

Output: docs/test-cases/feature-{id}-{slug}.md (written by SubAgent)
Hard Gate:
- If ST cases fail: fix and re-dispatch; escalate via AskUserQuestion only if the issue genuinely requires human manual testing

Run these mechanical checks directly — no SubAgent dispatch needed.
Read the feature design document (docs/features/YYYY-MM-DD-<feature-name>.md) produced in Step 4.
a) Interface contract verification (P2 equivalent): Read §3 Interface Contract table from the feature design doc. For each PUBLIC method listed, grep the implementation files to confirm the method exists with matching signature (name, parameters, return type). Flag missing or mismatched methods.
b) Test Inventory ↔ test file cross-check (T2 equivalent): Read §7 Test Inventory from the feature design doc. For each test row, confirm the corresponding test function exists in the test file:
grep -q "{test_function_name}" {test_file}
If any test function is not found, search for similar names and fix the ST document traceability matrix reference.
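The cross-check in (b) can be sketched as a loop over the inventory's function names. File and function names below are illustrative placeholders, not from the document:

```shell
# Check each Test Inventory function name exists in the test file.
test_file=$(mktemp)
printf 'def test_login_success(): pass\n' > "$test_file"

missing=0
for fn in test_login_success test_login_bad_password; do
  grep -q "$fn" "$test_file" || { echo "MISSING: $fn"; missing=$((missing + 1)); }
done
echo "$missing test function(s) missing"
rm -f "$test_file"
```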
c) Design dependency versions (D3 equivalent): If §3 or §5 specifies third-party library versions, spot-check that requirements.txt / package.json / pom.xml matches. Flag mismatches.
d) UCD spot check (U1 equivalent, ui:true only): Grep CSS/style files for hardcoded color hex values not in UCD palette tokens.
e) ST document integrity:
Confirm validate_st_cases.py already passed in Feature-ST (Step 9).
No re-validation needed — Feature-ST Step 5b + Step 6 already cover T1.
f) Codebase convention spot-check (advisory, non-blocking — skip if Design §13 absent): Spot-check 2-3 new/modified files against Design doc §13:
- Record observations in task-progress.md. Not a blocking gate — Design doc / framework conventions override scanner observations.

If all checks pass → proceed to Persist. If any check fails → fix inline, re-verify. No SubAgent dispatch.
Record in task-progress.md:
- Inline Check: PASS (P2: N/N methods verified, T2: N/N tests found, D3: OK)
Commit format: If Design §13.8 documents commit conventions, follow that format. Otherwise use the defaults below. For category: "bugfix" features: use commit prefix "fix:" instead of "feat:". Format: fix: <feature title without the "Fix: " prefix> (#<fixed_feature_id>)
git rev-parse --short HEAD
Store this value as {commit_sha} — it is used in the next two steps.

Update RELEASE_NOTES.md (Keep a Changelog format).
For category: "bugfix" features: add the entry under ### Fixed (not ### Added): - [<bug_severity>] <title without "Fix: "> (fixes #<fixed_feature_id>) — <root_cause one-line>
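A hypothetical RELEASE_NOTES.md fragment showing both entry shapes; the titles, severity, and IDs below are invented for illustration:

```markdown
## [Unreleased]

### Added
- User login with session tokens (#12)

### Fixed
- [high] Session expiry off-by-one (fixes #12) — TTL compared in ms instead of s
```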
Update task-progress.md:

- Update the ## Current State header: progress count (X/Y passing), last completed feature (#id title, date), next feature (#id title)
- Append a feature entry:

### Feature #id: Title — PASS
- Completed: YYYY-MM-DD
- TDD: green ✓
- Quality Gates: N% line, N% branch, N% mutation
- Feature-ST: N cases, all PASS
- Inline Check: PASS
- Git: {commit_sha} feat: title
#### Risks ← include only if any risks were reported
- ⚠ [Mutant] file:line — reason
- ⚠ [Coverage] metric N% — thin margin / uncovered boundary
- ⚠ [Dependency] lib==ver — known patch / breaking change pending
- {commit_sha} must be the actual captured value — never a placeholder. This ensures task-progress.md and feature-list.json carry the same verified SHA.
- Collect any ### Risks tables; merge into a single list; append as #### Risks bullets only if the list is non-empty
- Set "status": "passing" in feature-list.json
- Set "st_case_path", "st_case_count", and "git_sha": "{commit_sha}" on the feature object in feature-list.json
- Run: python scripts/validate_features.py feature-list.json
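The persist update can be sketched as follows. Field names come from the text above; the surrounding file layout and helper are assumed for illustration:

```python
import json

# Mark the feature passing and stamp the real captured commit SHA
# (never a placeholder) plus the ST artifacts on the feature object.
def persist_feature(path, feature_id, commit_sha, st_case_path, st_case_count):
    with open(path) as f:
        data = json.load(f)
    for feat in data["features"]:
        if feat["id"] == feature_id:
            feat["status"] = "passing"
            feat["st_case_path"] = st_case_path
            feat["st_case_count"] = st_case_count
            feat["git_sha"] = commit_sha
            break
    with open(path, "w") as f:
        json.dump(data, f, indent=2)
```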
- Verify docs/report/feature-{id}-{slug}-report.md exists on disk.

This step is non-negotiable. Every feature. No exceptions.
Generate a per-feature development report at docs/report/feature-{id}-{slug}-report.md.
Data sources (use in-context data where available; read feature design doc §4 if SRS AC text is needed for Section B — one targeted read is acceptable):
- feature-list.json (id, title, category, priority, wave, srs_trace, dependencies, ui)
- docs/features/YYYY-MM-DD-<feature-name>.md (§3 Interface Contract, §4 SRS Requirement, §7 Test Inventory)

Steps:

- mkdir -p docs/report
- Fill the template (docs/templates/feature-report-template.md) with session data
- Write docs/report/feature-{id}-{slug}-report.md
- Set "report_path" on the feature object in feature-list.json

Report sections (see template for full structure):
A. Basic Info — Feature metadata, completion date, git SHA.
B. Requirements Consistency Briefing (需求一致性简报) — Read §4 (SRS Requirement) from the feature design doc (docs/features/YYYY-MM-DD-<feature-name>.md). For each srs_trace requirement ID:
C. Quality Gates — Line coverage, branch coverage, mutation score vs thresholds.
D. Real Test Execution Summary (真实测试内容) — From Feature-ST return:
- If check_real_tests.py was run: marker count, mock warnings, skip patterns

E. Risk Assessment with Mitigations (风险与解决办法) — For each risk:
F. Inline Compliance Check — P2, T2, D3, U1 status.
G. Feature-ST Summary — Total cases, pass rate, category breakdown, visual assessment (ui:true).
H. Files Changed — git diff --name-only of the feature commit.
I. Dependencies — Dependency feature IDs with current status.
git add feature-list.json task-progress.md RELEASE_NOTES.md docs/report/feature-{id}-{slug}-report.md
git commit -m "chore: update progress — feature #{id} passing"
If retro_authorized is true in feature-list.json:
- Read skills/long-task-retrospective/prompts/reflection-prompt.md
- Gather inputs: the task-progress.md entry, any AskUserQuestion exchanges where the user corrected skill output
- Dispatch via Agent(run_in_background=true) — do NOT wait for completion

If retro_authorized is absent or false → skip entirely (no output, no dispatch).
(after long-task-feature-st) Announce:

Feature #<id> (<title>) — DONE
Next: Feature #<next_id> (<next_title>)
All active features passing — next session begins System Testing.
The auto-loop script (scripts/auto_loop.py) handles multi-feature automation externally — each invocation is a fresh context.
On any bug: follow references/systematic-debugging.md; trace root cause, never guess-and-fix.

Generate docs/report/feature-{id}-{slug}-report.md in Step 11a; set report_path in feature-list.json; include the report file in the progress commit. Every feature, no exceptions.

| Rationalization | Correct Action |
|---|---|
| "I'll mock that config later" | Run Config Gate. Real configs needed. |
| "This feature is trivial, skip test cases" | Invoke long-task-feature-st. Every feature. |
| "This feature is trivial, skip TDD" | Invoke long-task-tdd. Every feature. |
| "Tests pass, mark it done" | Invoke long-task-quality first. |
| "Coverage looks close enough" | Thresholds are hard gates. Run the tool. |
| "Let me just try this quick fix" | Systematic debugging first. |
| "I'll generate examples during Worker" | Examples are post-ST via long-task-finalize. |
| "I'll update release notes at the end" | Update after every commit. |
| "Mutation score is probably OK" | Run mutation tests and read the report. |
| "The UI looks correct to me" | Run automated detection + EXPECT/REJECT. |
| "ST test case failed but the code is fine" | No bypass. AI must fix code and re-dispatch — no retry limit. If test spec is wrong, use long-task-increment to modify. Only escalate if issue genuinely requires human manual testing. |
| "Port is busy, let me kill manually" | Use env-guide.md "Stop All Services" (port fallback) to kill it, then restart via env-guide.md Start — update env-guide.md if the command needed correction. |
| "Environment is down, skip ST cases" | BLOCKED, not skipped. Fix environment or ask user. |
| "This deprecated feature still needs work" | Skip it. Deprecated features are excluded. |
| "Backend isn't ready but I'll mock it for now" | Dependency check exists for a reason. Develop backend features first. |
| "I'll skip the dependency check this once" | Never skip. Reorder features so deps are satisfied. |
| "The report can wait / I'll generate it later" | Step 11a is mandatory. Generate the report now — before the final git commit. |
| "The SRS is ambiguous but I'll just assume..." | SubAgent should flag CLARIFY. Assumptions on critical paths (Interface Contract, Test Inventory expected results, cross-feature contracts) cause late-stage rework. Only low-impact ambiguities may be assumed. |
Follow the systematic debugging process — never guess-and-fix:
(see references/systematic-debugging.md for the detailed process)

Called by: using-long-task (when feature-list.json exists) or long-task-init (Step 16)

Invokes (in strict order):
- long-task:long-task-tdd (Steps 5-7) — TDD Red-Green-Refactor
- long-task:long-task-quality (Step 8) — Coverage + Mutation
- long-task:long-task-feature-st (Step 9) — Black-Box Feature Acceptance Testing (ISO/IEC/IEEE 29119, self-managed lifecycle)
Reads/Writes: feature-list.json, task-progress.md (including ## Current State), RELEASE_NOTES.md
Read on-demand (via Read tool, NOT Skill tool): references/systematic-debugging.md