Skill

gate-check

Final quality gate before reporting task completion (Gate 3). Fills in the confidence assessment, runs supplementary verification for items below 95%, and completes the self-check checklist. Writes the PASS/FAIL result to subtask.md, then calls autoworker:dispatch for routing.

npx claudepluginhub phj128/autoworker --plugin autoworker

Tool Access

This skill uses the workspace's default tool permissions.



Stats

Stars: 13

Forks: 3

Last Commit: Mar 24, 2026


autoworker:gate-check — Pre-Completion Self-Check (Gate 3)

Trigger: Called by autoworker:dispatch when all tests are complete. Pure assessment skill — does not make routing decisions.

Execution Flow

1. Pre-Check

Glob `subtask_*.md` (exclude subtask_template.md) →
  0 found → stop, prompt to create subtask
  1 found → use directly (backward compatible)
  multiple → grep `status:` to filter:
    - Files without status field treated as active (backward compatible)
    - Exactly 1 active → use it
    - 0 active → list all files + status, prompt user to choose
    - >1 active → report anomaly
→ Read → check "Test Results" section

Test results section is empty → FAIL, prompt to complete tests and call autoworker:checkpoint first
Has test results → continue
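The file-selection rules above can be sketched in Python. This is a minimal illustration, not the skill's actual implementation: the function names and the outcome labels (`none`, `use`, `choose`, `anomaly`) are hypothetical, and it assumes the status field is a line of the form `status: active`.

```python
import re
from pathlib import Path

# Assumption: the status field is a line such as "status: active".
STATUS_RE = re.compile(r"^status:\s*(\S+)", re.MULTILINE)


def read_status(text: str) -> str:
    """Return the first status value; a file without the field counts as active."""
    match = STATUS_RE.search(text)
    return match.group(1) if match else "active"


def select_subtask(directory: Path) -> tuple[str, list[Path]]:
    """Apply the pre-check rules and return (outcome, candidate files)."""
    files = sorted(
        p for p in directory.glob("subtask_*.md") if p.name != "subtask_template.md"
    )
    if not files:
        return "none", []        # stop, prompt to create a subtask
    if len(files) == 1:
        return "use", files      # single file is used directly, whatever its status
    active = [
        p for p in files if read_status(p.read_text(encoding="utf-8")) == "active"
    ]
    if len(active) == 1:
        return "use", active
    if not active:
        return "choose", files   # list all files and let the user choose
    return "anomaly", active     # more than one active subtask
```

On a `use` outcome the caller would then read the single returned file and inspect its "Test Results" section.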

1.5. Acceptance Criteria Traceability

Read subtask's "Acceptance Criteria" table, check whether each metric was measured in L1-L4 test results:

Metric not measured → that change point's confidence < 95% (Step 3 will trigger supplementary verification)
All metrics have corresponding test results → continue
No acceptance criteria table (legacy subtask format) → skip this step
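As a rough sketch, the traceability check reduces to a set difference between the metrics named in the acceptance criteria and the metrics that appear in the test results. The function name and the use of `None` to represent a legacy subtask without an acceptance criteria table are assumptions made for illustration.

```python
from typing import Optional


def untraced_metrics(criteria: Optional[list[str]], measured: set[str]) -> list[str]:
    """Return acceptance metrics that have no corresponding test result.

    criteria is None for a legacy subtask without an acceptance criteria
    table (hypothetical convention); the step is skipped in that case.
    """
    if criteria is None:
        return []
    return [metric for metric in criteria if metric not in measured]
```

Any metric returned here marks its change point as below 95%, which Step 3 then picks up for supplementary verification.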

2. Fill Confidence Assessment Table

In subtask.md's "Confidence Assessment" section, fill in for each change point:

| Change point | Test level | Confidence | Verification method | Unverified/Risk |
| --- | --- | --- | --- | --- |

Confidence inference basis:

Has L4 pass + meaningful output → 95%+
Has L2 but no L4 → 70-85%
Only L1 → 50-70%
Untested → 30%
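The inference rules above can be written as a small lookup. This is a sketch under stated assumptions: the function names are hypothetical, confidence is returned as a (low, high) percentage range, and two cases the rules do not cover (only L3 passed, or L4 passed without meaningful output) are placed in the 70-85% band purely as a guess.

```python
def infer_confidence(
    levels_passed: set[str], meaningful_output: bool = True
) -> tuple[int, int]:
    """Map the test levels that passed to a (low, high) confidence range in percent."""
    if not levels_passed:
        return (30, 30)    # untested
    if "L4" in levels_passed and meaningful_output:
        return (95, 100)   # L4 pass + meaningful output
    if levels_passed & {"L2", "L3", "L4"}:
        # "Has L2 but no L4"; L3-only and L4-without-output land here by assumption.
        return (70, 85)
    return (50, 70)        # only L1


def needs_supplementary(confidence: tuple[int, int]) -> bool:
    """An item needs supplementary verification when its range starts below 95%."""
    return confidence[0] < 95
```

For example, `infer_confidence({"L1", "L2"})` returns `(70, 85)`, so that change point goes on to Step 3.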

3. Supplementary Verification for < 95% Items

When the table above has < 95% items:

Design supplementary verification commands (specific, executable)
Execute immediately
Fill in results
Update confidence

Record in subtask.md's "< 95% Supplementary Verification" table.

Boundary for "requires user confirmation" — only scenarios depending on human senses qualify:

"Selector may have changed" → launch browser and check — NOT "requires user confirmation"
"API might not work" → curl it — NOT "requires user confirmation"
"Config might be wrong" → write a script to load and check — NOT "requires user confirmation"
UI appearance, interaction feel → legitimate "requires user confirmation"

When all items are >= 95%, write "All >= 95%, no supplementary verification needed".
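"Specific, executable" means a supplementary check should be a command that can be run and whose output can be pasted as evidence. A minimal sketch of such a runner follows. The function name is hypothetical, and treating exit code 0 as "verified" is an assumption; how far confidence is raised afterwards remains a judgment the skill leaves to the assessor.

```python
import subprocess


def run_verification(command: list[str], timeout: float = 60.0) -> tuple[bool, str]:
    """Run one verification command and return (passed, captured output).

    Assumption: exit code 0 means the check passed. The captured output
    is what would be recorded in the supplementary verification table.
    """
    try:
        completed = subprocess.run(
            command, capture_output=True, text=True, timeout=timeout
        )
    except subprocess.TimeoutExpired:
        return False, f"timed out after {timeout} seconds"
    output = (completed.stdout + completed.stderr).strip()
    return completed.returncode == 0, output
```

The "API might not work" case would pass a `curl` command list here instead of deferring to the user.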

4. Gate 3 Self-Check Checklist

In subtask.md's "Gate 3 Self-Check" section, check each item and provide evidence:

- [ ] Verification depth: Test results section has L2+ records
  - Evidence:
- [ ] L4 = user path: L4 operation path = actual user usage path
  - Evidence: <state what L4 did, compare with how user would use it>
- [ ] Supplementary verification complete: All < 95% items had supplementary verification executed
  - Evidence: <how many items supplemented, or "all >= 95%">
- [ ] Instruction file tested: If SKILL.md / config files were changed, they were actually trigger-tested
  - Evidence: <how verified, or "no instruction files changed">
- [ ] Coverage complete: Every modified file has a corresponding verification item
  - Evidence: <how many files changed, how many verified>

Can't write evidence = didn't do it = can't check the box = FAIL.
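The rule that missing evidence means FAIL can be sketched as follows. `CheckItem` and `evaluate_checklist` are hypothetical names used only for illustration. The sketch checks only that evidence is non-empty; whether that evidence is actually traceable still has to be judged by the assessor.

```python
from dataclasses import dataclass


@dataclass
class CheckItem:
    name: str
    evidence: str = ""


def evaluate_checklist(items: list[CheckItem]) -> tuple[str, list[str]]:
    """Return ("PASS" | "FAIL", names of items that lack evidence)."""
    missing = [item.name for item in items if not item.evidence.strip()]
    return ("FAIL" if missing else "PASS", missing)
```

A single item with blank evidence is enough to fail the whole checklist.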

5. Chain Integrity Check

Answer each question (any No → FAIL):

Is the chain fully traced? From config/code change → to user-perceivable effect, is there a verification point at every link?
Any "should be fine" links? Anything you feel "obviously works" but haven't actually run?
Does verification path = user path? Is your verification method consistent with how users actually use it?

6. Write Result and Invoke autoworker:dispatch

PASS (all self-checks pass + all >= 95%):

Edit subtask.md:

Append `Gate result: PASS` at the end of the "Progress Log" section
Change `status: active` to `status: completed` (if the status field exists)

Output:

Gate 3 PASS
- Confidence: all >= 95%
- Self-check: 5/5 passed
- Chain: complete
→ Invoking autoworker:dispatch

FAIL loop limit: If the gate has already failed twice in a row (count the consecutive `Gate result: FAIL` entries in the "Progress Log"), then on the third FAIL do not invoke autoworker:subtask-update. Instead, output a complete failure report to the user and let them decide the next steps.

FAIL (any self-check fails, or any item below 95% cannot be resolved autonomously):

Edit subtask.md, append at end of "Progress Log" section:

Gate result: FAIL

Output:

Gate 3 FAIL
- Failed items: <specifics>
- Needs additional work: <specifics>
→ Invoking autoworker:dispatch

Both cases always invoke autoworker:dispatch. No routing decisions — autoworker:dispatch reads Gate result and decides next step.
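The result line and the FAIL loop limit can be sketched together. This assumes the "Progress Log" is available as plain text with one entry per line, and the function names are hypothetical. The sketch appends the FAIL line even on the third failure, which the skill text does not state explicitly.

```python
import re

# The fixed format that autoworker:dispatch reads.
GATE_RE = re.compile(r"^Gate result: (PASS|FAIL)$", re.MULTILINE)


def consecutive_fails(progress_log: str) -> int:
    """Count the trailing run of "Gate result: FAIL" entries."""
    count = 0
    for result in reversed(GATE_RE.findall(progress_log)):
        if result != "FAIL":
            break
        count += 1
    return count


def record_gate_result(progress_log: str, passed: bool) -> tuple[str, bool]:
    """Append the gate result; return (updated log, escalate to user)."""
    prior_fails = consecutive_fails(progress_log)
    lines = progress_log.splitlines()
    lines.append("Gate result: PASS" if passed else "Gate result: FAIL")
    # Two earlier consecutive FAILs make this the third: report to the user.
    escalate = (not passed) and prior_fails >= 2
    return "\n".join(lines) + "\n", escalate
```

When `escalate` is true, the caller would output the full failure report instead of entering another update loop.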

7. Chain: Immediately Invoke autoworker:dispatch

After outputting the result, immediately invoke autoworker:dispatch. Do not wait for user instructions and do not do anything else.

Important Notes

gate-check is pure assessment: Only assesses + writes results, does not make routing decisions
Evidence is a hard constraint: Each self-check item must have traceable evidence — "I think it's fine" doesn't count
Gate result format is fixed: Must be `Gate result: PASS` or `Gate result: FAIL` — autoworker:dispatch reads this exact format
Don't ask user "should I supplement?": Autonomously executable supplementary verification goes directly through autoworker:dispatch → autoworker:subtask-update loop