Help us improve
Share bugs, ideas, or general feedback.
From backpressured
Verifies a feature works by running the real app, not by trusting green tests. Use when a backpressured loop reaches the before-done gate (Phase 3) and needs to confirm new behavior end-to-end — curling live API endpoints and driving a real browser via Playwright — including the cheap unhappy paths, before the task is called done.
npx claudepluginhub lucasfcosta/backpressured --plugin backpressuredHow this skill is triggered — by the user, by Claude, or both
Slash command
/backpressured:manual-testingThe summary Claude sees in its skill listing — used to decide when to auto-load this skill
**You are the machine that uses the feature the way a client or a person will, so a human isn't the first to find out it doesn't actually work.** Automated green is necessary, not sufficient: tests exercise the code, not the *running system*. Wiring, integration, env, and the error paths break in ways a passing unit test never sees. This is the "run it for real" gate.
Guides manual testing of app features with checklists, edge case identification, cross-browser and mobile verification, and structured QA workflows before deployment.
Verifies implemented features against specs or test plans via manual QA on live apps. Accepts Figma mockups, PRDs, acceptance criteria; generates test plans from specs if none provided.
Visually verify implemented features work correctly before marking complete. Use when testing UI changes, verifying web features, or checking user flows work in the browser.
Share bugs, ideas, or general feedback.
You are the machine that uses the feature the way a client or a person will, so a human isn't the first to find out it doesn't actually work. Automated green is necessary, not sufficient: tests exercise the code, not the running system. Wiring, integration, env, and the error paths break in ways a passing unit test never sees. This is the "run it for real" gate.
Core principle: you only pass this on what you personally observed in the running app. Not "tests pass, so it works." Not "I curled the happy path." You booted it, you drove it, you saw each acceptance criterion — and the cheap failure cases — behave. Evidence, not inference.
backpressured loop's Phase 3, after automated checks are green, before the task is done. Always run it here — once before done, every time, even when the suite (including integration and end-to-end tests) is green. Automated tests are never a substitute (see Boundaries). The only thing that removes this gate is an explicit manual-testing opt-out in BACKPRESSURE.md's Skip section. It's slower than tests, so don't run it every iteration — mid-loop, run it only when something warrants it; but the before-done gate is not optional.Not for: replacing automated tests or per-iteration checks (this is the end gate, not every patch). Gate by what exists: an API → curl it; a UI → drive the browser; genuinely nothing runnable (or an explicit BACKPRESSURE.md opt-out) → skip with a note saying why. A green test suite is not a reason to skip.
Discover how to run it yourself — package.json scripts, Makefile, docker-compose, README, .env.example (or use project-specific run skills if they exist). Then actually boot it: start it in the background, tail the logs until it's genuinely healthy ("listening on :PORT", no startup errors), and note the URLs.
Set up the minimum real dependencies to exercise the feature: bring up the DB, run migrations, seed at least one real row, and obtain a valid auth token and a real id if the feature needs them. If it genuinely can't boot (missing secret, broken dep), report BLOCKED — do not fake it or skip to "tests pass."
Exercise every acceptance criterion against the live app, not the test suite.
curl -i (status line visible) each endpoint. Check the body shape, headers, and content the consumer depends on — not just a 2xx. (CSV is actually CSV with the right Content-Type, not an empty body or a stack trace.)The failure cases are where manual testing earns its keep over automated tests — and they're trivial by hand. Exercise the ones in the acceptance criteria and the obvious ones around them: error responses (400 / 401 / 403 / 404), empty or invalid input, missing fields, the permission-denied case. Confirm the app returns the status and the user-facing behavior the feature promises (the inline error renders; the toast shows), not a 500 or a silent no-op.
Two different things — don't conflate them:
disabled toggles, the inline error text renders, the toast appears. Capture screenshots / DOM observations of these as evidence regardless of whether a design reference exists — this is functional verification, not styling."Tested manually, works" is not a passing check. Record the actual curl commands and their status/bodies, what you clicked in the browser, and what the screenshots showed. Passing = you personally observed, in the running app, every acceptance criterion and the cheap unhappy paths behaving correctly. Anything wrong is unresolved work: fix it and re-run, or stop and name the blocker — never wave it through or call it done off green tests.
reviewer subagent: the reviewer reads the diff; you run it.| Rationalization | Reality |
|---|---|
| "Tests pass, so it works" | Tests exercise the code, not the running system. Wiring, env, and integration fail outside them. Run it. |
| "I curled the happy path, ship it" | You didn't drive the UI or hit a single error path. The happy path is the case you already knew worked. |
| "Manual testing is slow, skip it this once" | It's the before-done gate, not an optional extra. Run it every time, unless BACKPRESSURE.md's Skip section opts out — an ad-hoc "just this once" is not an opt-out. |
| "Integration/E2E tests already drive the app, so this is covered" | They run automated, against the code, and share automated tests' blind spots. This gate still runs unless BACKPRESSURE.md explicitly opts out of manual testing. |
| "The disabled-while-submitting state is too fast to see" | Throttle the network and observe it. "Too fast to see" is not "verified". |
| "No design reference, so I can't check the UI" | You can still confirm it renders and works. Take sanity screenshots; look for obvious breakage. |
| "I tested it manually, it's fine" | With no recorded commands/responses/screenshots, that's an assertion, not a check. |
BACKPRESSURE.md's Skip section).