Run human acceptance testing on completed phase work, presenting CHECKPOINT prompts one at a time for the user to verify each piece of completed work.
/plugin marketplace add yidakee/vibe-better-with-claude-code-vbw
/plugin install vbw@vbw-marketplace
Usage: /vbw:verify [phase-number] [--resume]

Working directory:
!`pwd`
Plugin root:
!`VBW_CACHE_ROOT="${CLAUDE_CONFIG_DIR:-$HOME/.claude}/plugins/cache/vbw-marketplace/vbw"; R=""; if [ -n "${CLAUDE_PLUGIN_ROOT:-}" ] && [ -f "${CLAUDE_PLUGIN_ROOT}/scripts/hook-wrapper.sh" ]; then R="${CLAUDE_PLUGIN_ROOT}"; fi; if [ -z "$R" ] && [ -f "${VBW_CACHE_ROOT}/local/scripts/hook-wrapper.sh" ]; then R="${VBW_CACHE_ROOT}/local"; fi; if [ -z "$R" ]; then V=$(find "${VBW_CACHE_ROOT}" -maxdepth 1 -mindepth 1 2>/dev/null | awk -F/ '{print $NF}' | grep -E '^[0-9]+(\.[0-9]+)*$' | sort -t. -k1,1n -k2,2n -k3,3n | tail -1); [ -n "$V" ] && [ -f "${VBW_CACHE_ROOT}/${V}/scripts/hook-wrapper.sh" ] && R="${VBW_CACHE_ROOT}/${V}"; fi; if [ -z "$R" ]; then L=$(find "${VBW_CACHE_ROOT}" -maxdepth 1 -mindepth 1 2>/dev/null | awk -F/ '{print $NF}' | sort | tail -1); [ -n "$L" ] && [ -f "${VBW_CACHE_ROOT}/${L}/scripts/hook-wrapper.sh" ] && R="${VBW_CACHE_ROOT}/${L}"; fi; if [ -z "$R" ]; then for f in /tmp/.vbw-plugin-root-link-*/scripts/hook-wrapper.sh; do [ -f "$f" ] && R="${f%/scripts/hook-wrapper.sh}" && break; done; fi; if [ -z "$R" ]; then D=$(ps axww -o args= 2>/dev/null | grep -v grep | grep -oE -- "--plugin-dir [^ ]+" | head -1); D="${D#--plugin-dir }"; [ -n "$D" ] && [ -f "$D/scripts/hook-wrapper.sh" ] && R="$D"; fi; if [ -z "$R" ] || [ ! -d "$R" ]; then echo "VBW: plugin root resolution failed" >&2; exit 1; fi; SESSION_KEY="${CLAUDE_SESSION_ID:-default}"; LINK="/tmp/.vbw-plugin-root-link-${SESSION_KEY}"; REAL_R=$(cd "$R" 2>/dev/null && pwd -P) || REAL_R="$R"; rm -f "$LINK"; ln -s "$REAL_R" "$LINK" 2>/dev/null || { echo "VBW: plugin root link failed" >&2; exit 1; }; STAMP="/tmp/.vbw-phase-detect-stamp-${SESSION_KEY}.txt"; : > "$STAMP"; bash "$LINK/scripts/phase-detect.sh" > "/tmp/.vbw-phase-detect-${SESSION_KEY}.txt" 2>/dev/null || echo "phase_detect_error=true" > "/tmp/.vbw-phase-detect-${SESSION_KEY}.txt"; echo "$LINK"`
Current state:
!`head -40 .vbw-planning/STATE.md 2>/dev/null || echo "No state found"`
Config: Pre-injected by SessionStart hook.
Phase directories:
!`ls .vbw-planning/phases/ 2>/dev/null || echo "No phases directory"`
Phase state:
!`SESSION_KEY="${CLAUDE_SESSION_ID:-default}"; L="/tmp/.vbw-plugin-root-link-${SESSION_KEY}"; P="/tmp/.vbw-phase-detect-${SESSION_KEY}.txt"; S="/tmp/.vbw-phase-detect-stamp-${SESSION_KEY}.txt"; PD=""; [ -f "$P" ] && PD=$(cat "$P"); if [ -z "$PD" ] || [ "$PD" = "phase_detect_error=true" ] || [ -L "$L" ]; then i=0; while [ ! -L "$L" ] && [ $i -lt 20 ]; do sleep 0.1; i=$((i+1)); done; S_M=0; P_M=0; [ -f "$S" ] && S_M=$(stat -c %Y "$S" 2>/dev/null || stat -f %m "$S" 2>/dev/null || echo 0); [ -f "$P" ] && P_M=$(stat -c %Y "$P" 2>/dev/null || stat -f %m "$P" 2>/dev/null || echo 0); if [ -L "$L" ] && [ -f "$L/scripts/phase-detect.sh" ] && { [ -z "$PD" ] || [ "$PD" = "phase_detect_error=true" ] || [ "$P_M" -lt "$S_M" ]; }; then PD=$(bash "$L/scripts/phase-detect.sh" 2>/dev/null) || PD=""; fi; fi; if [ -n "$(printf '%s' "$PD" | tr -d '[:space:]')" ] && [ "$PD" != "phase_detect_error=true" ]; then printf '%s' "$PD"; else echo "phase_detect_error=true"; fi`
Compaction suggestion:
!`L="/tmp/.vbw-plugin-root-link-${CLAUDE_SESSION_ID:-default}"; i=0; while [ ! -L "$L" ] && [ $i -lt 20 ]; do sleep 0.1; i=$((i+1)); done; bash "$L/scripts/suggest-compact.sh" verify 2>/dev/null || true`
Pre-computed verify context (PLAN/SUMMARY aggregation):
!`SESSION_KEY="${CLAUDE_SESSION_ID:-default}"; L="/tmp/.vbw-plugin-root-link-${SESSION_KEY}"; P="/tmp/.vbw-phase-detect-${SESSION_KEY}.txt"; S="/tmp/.vbw-phase-detect-stamp-${SESSION_KEY}.txt"; PD=""; [ -f "$P" ] && PD=$(cat "$P"); if [ -z "$PD" ] || [ "$PD" = "phase_detect_error=true" ] || [ -L "$L" ]; then i=0; while [ ! -L "$L" ] && [ $i -lt 20 ]; do sleep 0.1; i=$((i+1)); done; S_M=0; P_M=0; [ -f "$S" ] && S_M=$(stat -c %Y "$S" 2>/dev/null || stat -f %m "$S" 2>/dev/null || echo 0); [ -f "$P" ] && P_M=$(stat -c %Y "$P" 2>/dev/null || stat -f %m "$P" 2>/dev/null || echo 0); if [ -L "$L" ] && [ -f "$L/scripts/phase-detect.sh" ] && { [ -z "$PD" ] || [ "$PD" = "phase_detect_error=true" ] || [ "$P_M" -lt "$S_M" ]; }; then PD=$(bash "$L/scripts/phase-detect.sh" 2>/dev/null) || PD=""; fi; fi; if [ -z "$(printf '%s' "$PD" | tr -d '[:space:]')" ] || [ "$PD" = "phase_detect_error=true" ]; then echo "verify_context=unavailable"; else SLUG=$(printf '%s' "$PD" | grep '^next_phase_slug=' | head -1 | cut -d= -f2); FU_SLUG=$(printf '%s' "$PD" | grep '^first_unverified_slug=' | head -1 | cut -d= -f2); TARGET="${FU_SLUG:-$SLUG}"; PDIR=".vbw-planning/phases/$TARGET"; if [ -n "$TARGET" ] && [ -d "$PDIR" ] && [ -f "$L/scripts/compile-verify-context.sh" ]; then echo "verify_target_slug=$TARGET"; bash "$L/scripts/compile-verify-context.sh" "$PDIR" 2>/dev/null || echo "verify_context_error=true"; else echo "verify_context=unavailable"; fi; fi`
Pre-computed UAT resume metadata:
!`SESSION_KEY="${CLAUDE_SESSION_ID:-default}"; L="/tmp/.vbw-plugin-root-link-${SESSION_KEY}"; P="/tmp/.vbw-phase-detect-${SESSION_KEY}.txt"; S="/tmp/.vbw-phase-detect-stamp-${SESSION_KEY}.txt"; PD=""; [ -f "$P" ] && PD=$(cat "$P"); if [ -z "$PD" ] || [ "$PD" = "phase_detect_error=true" ] || [ -L "$L" ]; then i=0; while [ ! -L "$L" ] && [ $i -lt 20 ]; do sleep 0.1; i=$((i+1)); done; S_M=0; P_M=0; [ -f "$S" ] && S_M=$(stat -c %Y "$S" 2>/dev/null || stat -f %m "$S" 2>/dev/null || echo 0); [ -f "$P" ] && P_M=$(stat -c %Y "$P" 2>/dev/null || stat -f %m "$P" 2>/dev/null || echo 0); if [ -L "$L" ] && [ -f "$L/scripts/phase-detect.sh" ] && { [ -z "$PD" ] || [ "$PD" = "phase_detect_error=true" ] || [ "$P_M" -lt "$S_M" ]; }; then PD=$(bash "$L/scripts/phase-detect.sh" 2>/dev/null) || PD=""; fi; fi; if [ -z "$(printf '%s' "$PD" | tr -d '[:space:]')" ] || [ "$PD" = "phase_detect_error=true" ]; then echo "uat_resume=unavailable"; else SLUG=$(printf '%s' "$PD" | grep '^next_phase_slug=' | head -1 | cut -d= -f2); FU_SLUG=$(printf '%s' "$PD" | grep '^first_unverified_slug=' | head -1 | cut -d= -f2); TARGET="${FU_SLUG:-$SLUG}"; PDIR=".vbw-planning/phases/$TARGET"; if [ -n "$TARGET" ] && [ -d "$PDIR" ] && [ -f "$L/scripts/extract-uat-resume.sh" ]; then echo "uat_resume_target_slug=$TARGET"; bash "$L/scripts/extract-uat-resume.sh" "$PDIR" 2>/dev/null || echo "uat_resume=error"; else echo "uat_resume=unavailable"; fi; fi`
If phase-detect reports misnamed_plans=true, normalize all phase directories before proceeding:
NORM_SCRIPT="/tmp/.vbw-plugin-root-link-${CLAUDE_SESSION_ID:-default}/scripts/normalize-plan-filenames.sh"
if [ -f "$NORM_SCRIPT" ]; then
for pdir in .vbw-planning/phases/*/; do
[ -d "$pdir" ] && bash "$NORM_SCRIPT" "$pdir"
done
fi
Display: "⚠ Renamed misnamed plan files to {NN}-PLAN.md convention."
Then re-run phase-detect.sh to refresh state (filenames changed):
bash "/tmp/.vbw-plugin-root-link-${CLAUDE_SESSION_ID:-default}/scripts/phase-detect.sh" > "/tmp/.vbw-phase-detect-${CLAUDE_SESSION_ID:-default}.txt"
Use the refreshed phase-detect output for all subsequent guard checks and steps. Also regenerate the pre-computed verify context and UAT resume metadata for the target phase after auto-detection (Step 1).

Step 1: Resolve the target phase using next_phase and next_phase_slug.
- If next_phase_state=needs_reverification: use next_phase directly; this is the phase that just completed remediation and needs re-verification.
- If first_unverified_phase is set: use that phase directly; this is the first fully-built phase without a terminal UAT (it has a *-SUMMARY.md but no canonical *-UAT.md; exclude *-SOURCE-UAT.md copies).
- If the user passed an explicit phase number ("/vbw:verify {NN}"): scan .vbw-planning/phases/ for the matching phase directory.
- If misnamed_plans=true: re-run compile-verify-context.sh and extract-uat-resume.sh for the resolved target phase dir, since the pre-computed blocks used stale filenames:
PDIR=".vbw-planning/phases/{target-slug}"
bash "/tmp/.vbw-plugin-root-link-${CLAUDE_SESSION_ID:-default}/scripts/compile-verify-context.sh" "$PDIR"
bash "/tmp/.vbw-plugin-root-link-${CLAUDE_SESSION_ID:-default}/scripts/extract-uat-resume.sh" "$PDIR"
Use the refreshed output in place of the pre-computed blocks from Context; do not re-read *-SUMMARY.md or *-PLAN.md files.
- If the user-specified phase differs from verify_target_slug: ignore the pre-computed context (it was generated for the auto-detected phase). Read PLAN/SUMMARY files from the user-specified phase directory instead.

Step 2: If next_phase_state=needs_reverification (from Context above):
- Run prepare-reverification.sh {phase-dir} to archive the old UAT and reset the remediation stage.
- If skipped=already_archived, display: "UAT already archived. Starting fresh re-verification."
- Otherwise display: "Archived previous UAT → {round_file}. Starting fresh re-verification."

Step 3: Check the pre-computed UAT resume metadata:
- uat_resume=none: no existing UAT session; proceed to Step 4 (generate tests).
- uat_resume=all_done uat_completed=N uat_total=N: all tests already have results; display the summary, STOP.
- uat_resume=<test-id> uat_completed=N uat_total=N: resume at <test-id>. Display: "Resuming UAT session -- {completed}/{total} tests done." Read the UAT.md once to load checkpoint text, then jump to the CHECKPOINT loop at the resume point.

Step 4: For each plan in the pre-computed verify context block:
- Use its what_was_built, files_modified, and must_haves data. Do NOT read SUMMARY.md or PLAN.md files.
- Number tests P{plan}-T{NN} (e.g., P01-T1, P01-T2, P02-T1).

UAT tests must be things only a human can judge. Good examples:
NEVER generate tests that ask the user to run automated checks. These belong in the QA phase, not UAT:
If a plan only contains backend/test/script changes with no user-facing behavior, generate a scenario that asks the human to verify the effect is visible (e.g., "confirm the migration preview no longer shows phantom entries") rather than asking them to run the tests themselves.
Write the initial {phase}-UAT.md in the phase directory using the templates/UAT.md format:
This is a conversational loop. Present ONE test, then STOP and wait for the user to respond. Do NOT present multiple tests at once. Do NOT skip ahead. Do NOT end the session after presenting a test.
For the FIRST test without a result, display a CHECKPOINT followed by AskUserQuestion:
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
CHECKPOINT {NN}/{total} — {plan-id}: {plan-title}
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
{scenario description}
Then immediately use AskUserQuestion:
question: "Expected: {expected result}"
header: "UAT"
multiSelect: false
options:
- label: "Pass"
description: "Behavior matches expected result"
- label: "Skip"
description: "Cannot test right now — skip this checkpoint"
The tool automatically provides a freeform "Other" option for the user to describe issues.
After response: process (Step 5), persist (Step 7), then present the NEXT test. Repeat until all tests are done, then go to Step 8.
Map the AskUserQuestion response:
"Pass" selected: Record as passed. However, if the user's response also mentions a separate bug/issue (e.g., "Pass, but I noticed X is broken"), record the test as passed AND capture the separate observation as a discovered issue (see Step 6a).
"Skip" selected: Record as skipped. However, if the user selected "Skip" but also typed additional text describing a bug/issue (e.g., the response body contains "but the sidebar is broken" alongside the Skip selection), record the test as skipped AND capture the additional text as a discovered issue (see Step 6a). The additional text is the response content beyond the option selection itself.
Freeform text (via "Other"): Apply case-insensitive matching in this order after normalization.
Normalization (required first):
- Normalize curly apostrophes to straight apostrophes (can’t == can't).

Word-boundary rule: Match intent keywords as whole words only — a keyword matches when it is surrounded by whitespace, punctuation, or string boundaries (equivalent to regex \b). Examples: "pass" matches in "pass, but..." and "Pass." but NOT in "passport"; "works" matches in "it works" but NOT in "worksmanship"; "good" matches in "looks good" but NOT in "goodness".
Idiomatic-positive exceptions: These should count as pass-intent (not issues): not bad, can't complain, cant complain, cannot complain.
Negation guard (expanded scope): Before classifying as pass-intent, detect negation in the same clause even when not immediately adjacent. If a negation term appears up to a few words before pass-intent (or in patterns like "I don't think it works"), treat as issue (Step 6), unless the text matches an idiomatic-positive exception above. Negation terms: not, don't, doesn't, didn't, isn't, wasn't, no, never, neither, nor, hardly, barely, cannot, can't, won't, wouldn't, shouldn't. Examples: "not good, still broken" → issue; "I don't think it works" → issue; "it works" → pass.
Observation extraction guard: Only create a discovered issue when text after a separator includes a defect/issue signal (e.g., broken, bug, error, wrong, missing, not working, fails, crash, exception, regression, problem). If trailing text is neutral/positive only (e.g., "pass: looks great"), do NOT create a discovered issue.
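The normalization, word-boundary, and negation rules above can be sketched as a small shell classifier. This is a minimal illustration, not the full implementation: the keyword lists are abbreviated, `classify` is a hypothetical helper name, and the `\b`/`\w` escapes assume GNU grep.

```shell
# Minimal sketch of the freeform-intent rules (abbreviated keyword lists;
# `classify` is hypothetical; \b and \w require GNU grep ERE extensions).
classify() {
  # Normalization: lowercase, straighten curly apostrophes
  t=$(printf '%s' "$1" | tr 'A-Z' 'a-z' | sed "s/’/'/g")
  # Idiomatic-positive exceptions count as pass-intent outright
  if printf '%s' "$t" | grep -qE "\b(not bad|can'?t complain|cannot complain)\b"; then
    echo pass; return
  fi
  # Negation guard: a negation term up to a few words before a
  # pass-intent keyword turns the whole response into an issue
  if printf '%s' "$t" | grep -qE "\b(not|no|never|don'?t|doesn'?t|didn'?t|isn'?t|can'?t|won'?t)\b(\W+\w+){0,3}\W*\b(pass|works?|good|fine|ok)\b"; then
    echo issue; return
  fi
  # Whole-word pass-intent match: "pass" must not match "passport"
  if printf '%s' "$t" | grep -qE "\b(pass|works?|good|fine|ok)\b"; then
    echo pass; return
  fi
  if printf '%s' "$t" | grep -qE "\bskip\b"; then
    echo skip; return
  fi
  echo issue
}
classify "Pass, but I noticed something"   # pass
classify "I don't think it works"          # issue
classify "passport shows up twice"         # issue
```

Note the evaluation order mirrors the rules: exceptions first, then the negation guard, then plain pass-intent, so "not bad" classifies as pass while "not good" does not.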
Dual-intent tie-break (pass + skip in one response):
Evaluate in this order:
Step 6 (record an issue): The user's response text IS the issue description. Infer severity from keywords (never ask the user):
| Keywords | Severity |
|---|---|
| crash, broken, error, doesn't work, fails, exception | critical |
| wrong, incorrect, missing, not working, bug | major |
| minor, cosmetic, nitpick, small, typo, polish | minor |
| (no keyword match) | major |
Record: description, inferred severity.
Display:
Issue recorded (severity: {level}). Final next-step routing shown at UAT summary.
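The severity table above can be sketched as a first-match-wins lookup. This is an illustrative sketch only (`infer_severity` is a hypothetical helper; the row order matters, and the no-match default is major):

```shell
# Sketch of the severity-inference table: rows checked top to bottom,
# whole-word case-insensitive match, default "major" when nothing hits.
infer_severity() {
  t=$(printf '%s' "$1" | tr 'A-Z' 'a-z')
  if printf '%s' "$t" | grep -qE "\b(crash|broken|error|doesn'?t work|fails|exception)\b"; then
    echo critical
  elif printf '%s' "$t" | grep -qE "\b(wrong|incorrect|missing|not working|bug)\b"; then
    echo major
  elif printf '%s' "$t" | grep -qE "\b(minor|cosmetic|nitpick|small|typo|polish)\b"; then
    echo minor
  else
    echo major   # no keyword match
  fi
}
infer_severity "the sidebar is broken"    # critical
infer_severity "small typo in the label"  # minor
```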
Step 6a (discovered issues): When a user passes or skips a test but also mentions a separate bug, issue, or observation unrelated to the test's expected behavior, capture it as a discovered issue.
Assign a discovered-issue ID: D{NN} (D01, D02, ...) — sequential across the UAT session. On resumed sessions: scan existing D{NN} entries in the UAT.md to find the highest existing number, then continue from max+1 (e.g., if D01 and D02 exist, the next discovered issue is D03).
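The max+1 rule can be sketched in shell; `next_discovered_id` is a hypothetical helper, and it assumes discovered issues appear as `### D{NN}` headings as in the entry template below.

```shell
# Sketch of the max+1 rule for discovered-issue IDs: scan existing
# "### D{NN}" headings in the UAT file and emit the next sequential ID.
next_discovered_id() {
  # Highest existing number; strip leading zeros so the shell
  # arithmetic doesn't parse "08" as invalid octal
  max=$(grep -oE '^### D[0-9]+' "$1" 2>/dev/null \
        | grep -oE '[0-9]+' | sort -n | tail -1 | sed 's/^0*//')
  printf 'D%02d\n' $(( ${max:-0} + 1 ))
}
```

On a fresh session (no UAT file or no D entries) this yields D01; with D01 and D02 present it yields D03.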
Infer severity using the same keyword table from Step 6. Infer category from context:
Append a new test entry to the UAT.md ## Tests section:
### D{NN}: {short-title}
- **Plan:** (discovered during {test-id})
- **Scenario:** User observation during UAT
- **Expected:** (not applicable — discovered issue)
- **Result:** issue
- **Issue:**
- Description: {observation text}
- Severity: {inferred severity}
Increment total_tests and issues in frontmatter. This ensures discovered issues flow into UAT remediation alongside test failures.
Display:
Discovered issue D{NN} recorded (severity: {level}).
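The frontmatter counter increment can be sketched as a one-pass awk rewrite; `bump_counts` is a hypothetical helper, and it assumes each counter sits on its own "key: N" frontmatter line.

```shell
# Sketch: bump the total_tests and issues counters in the UAT.md
# frontmatter in place, leaving all other lines untouched.
bump_counts() {
  awk '/^(total_tests|issues): [0-9]+$/ { sub(/[0-9]+$/, $2 + 1) } { print }' \
    "$1" > "$1.tmp" && mv "$1.tmp" "$1"
}
```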
Step 7 (persist): Update {phase}-UAT.md with the result for this test, then display progress: ✓ {completed}/{total} tests.

Step 8 (summary): Once all tests are done, update the {phase}-UAT.md frontmatter: status (complete or issues_found), completed date, final counts. Then display:

━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
Phase {NN}: {name} — UAT Complete
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
Result: {✓ PASS | ✗ ISSUES FOUND}
Passed: {N}
Skipped: {N}
Issues: {N}
Report: {path to UAT.md}
Discovered Issues in summary: If any discovered issues (D{NN} entries) were recorded during the session, list them after the result box so the user sees them at a glance:
Discovered Issues:
⚠ D01: {short-title} (severity: {level})
⚠ D02: {short-title} (severity: {level})
These are already recorded in the UAT.md and will flow into remediation alongside test failures. If no discovered issues: omit the section.
Route next steps by the highest recorded severity:
- critical or major: Suggest /vbw:vibe to continue UAT remediation directly from {phase}-UAT.md.
- minor: Suggest /vbw:fix to address recorded issues.

Planning artifact boundary commit (conditional):
PG_SCRIPT="/tmp/.vbw-plugin-root-link-${CLAUDE_SESSION_ID:-default}/scripts/planning-git.sh"
if [ -f "$PG_SCRIPT" ]; then
bash "$PG_SCRIPT" commit-boundary "verify phase {NN}" .vbw-planning/config.json
else
echo "VBW: planning-git.sh unavailable; skipping planning git boundary commit" >&2
fi
- planning_tracking=commit: commits .vbw-planning/ + CLAUDE.md when changed (includes the UAT report).
- planning_tracking=manual|ignore: no-op.
- auto_push=always: the push happens inside the boundary commit command when an upstream exists.

Run bash "/tmp/.vbw-plugin-root-link-${CLAUDE_SESSION_ID:-default}/scripts/suggest-next.sh" verify {result} {phase} and display the output.
/verify: Comprehensive verification with parallel test agents. Use when verifying implementations or validating changes.