Use when changes span multiple components, services, or layers and unit/integration tests alone cannot confirm the full flow works correctly
E2e verification confirms that a complete user-visible flow works across all touched components. Unit and integration tests verify parts; e2e verification confirms they connect correctly.
Core principle: If your changes cross a service boundary, data layer, or user-facing interface, verify the full path — not just the individual pieces.
digraph decide {
"Changes touch multiple components?" [shape=diamond];
"Service boundaries changed?" [shape=diamond];
"Data flow altered?" [shape=diamond];
"Skip e2e" [shape=box];
"Run e2e verification" [shape=box];
"Changes touch multiple components?" -> "Run e2e verification" [label="yes"];
"Changes touch multiple components?" -> "Service boundaries changed?" [label="no"];
"Service boundaries changed?" -> "Run e2e verification" [label="yes"];
"Service boundaries changed?" -> "Data flow altered?" [label="no"];
"Data flow altered?" -> "Run e2e verification" [label="yes"];
"Data flow altered?" -> "Skip e2e" [label="no"];
}
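The flowchart reduces to a single rule: any "yes" answer leads to e2e verification. A minimal sketch of that rule in Python, where the function name `needs_e2e` and its parameter names are illustrative assumptions and not part of the skill itself:

```python
def needs_e2e(touches_multiple_components: bool,
              service_boundaries_changed: bool,
              data_flow_altered: bool) -> bool:
    """Mirror the decision flowchart: any 'yes' answer means run e2e."""
    # Hypothetical helper; the three flags correspond to the three diamonds.
    return (touches_multiple_components
            or service_boundaries_changed
            or data_flow_altered)


# A change confined to one component, with no boundary or data-flow change.
print(needs_e2e(False, False, False))
# A single-component change that alters a service boundary.
print(needs_e2e(False, True, False))
```

Only the case where all three questions are answered "no" ends at "Skip e2e".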
Needs e2e:
Skip e2e:
Map the complete path your change affects, from entry point to final side effect.
Entry point → [Component A] → [Component B] → ... → Final side effect
Example:
POST /api/order → cart service → inventory check → payment API → order DB → confirmation email
What to trace:
For EACH step in the flow, state the concrete expected outcome.
| Step | Action | Pass Criteria |
|---|---|---|
| 1 | POST /api/order with valid cart | 200, order ID returned |
| 2 | Check inventory | Stock decremented by ordered quantity |
| 3 | Payment API call | Charge created, transaction ID stored |
| 4 | DB state | Order record with status=confirmed, correct amounts |
| 5 | Email | Confirmation sent with correct order details |
Bad criteria: "Check that it works" / "Verify the response" / "Make sure email sends"
Good criteria: "Response status 200 with JSON containing order_id string" / "Email contains order #X and total $Y"
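One way to keep criteria concrete is to write each one as a check that returns specific failures. The sketch below encodes the two "good criteria" above; the function names, the `order_id` field layout, and the email wording format are assumptions chosen for illustration, not a real API.

```python
def check_order_response(status: int, body: dict) -> list[str]:
    """Return a list of failures; an empty list means the step passed."""
    failures = []
    if status != 200:
        failures.append(f"expected status 200, got {status}")
    order_id = body.get("order_id")
    if not isinstance(order_id, str) or not order_id:
        failures.append("expected JSON body with a non-empty string order_id")
    return failures


def check_confirmation_email(text: str, order_id: str, total: str) -> list[str]:
    """Check that the email mentions the order number and the total."""
    failures = []
    if f"#{order_id}" not in text:
        failures.append(f"email does not mention order #{order_id}")
    if f"${total}" not in text:
        failures.append(f"email does not mention total ${total}")
    return failures


# Hypothetical sample data standing in for a real response and email.
print(check_order_response(200, {"order_id": "A1"}))
print(check_confirmation_email("Order #A1 confirmed. Total: $19.99", "A1", "19.99"))
```

Each failure message names the expected value, so a failing step reports what was wrong instead of a bare "it didn't work".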
Verify bottom-up: infrastructure → data → API → flow → side effects.
Stop at the first failure. Fix it before proceeding — downstream steps depend on upstream ones.
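The bottom-up, stop-at-first-failure order can be sketched as a small runner. This is an illustrative sketch under assumptions: `StepResult`, `run_bottom_up`, and the stubbed checks are hypothetical names, and real checks would query services, the database, or logs instead of returning fixed values.

```python
from typing import Callable, NamedTuple


class StepResult(NamedTuple):
    name: str
    status: str    # "PASS", "FAIL", or "NOT RUN"
    evidence: str


def run_bottom_up(
    steps: list[tuple[str, Callable[[], tuple[bool, str]]]]
) -> list[StepResult]:
    """Run checks in the given order and stop executing after the first failure."""
    results = []
    failed = False
    for name, check in steps:
        if failed:
            # Downstream steps depend on upstream ones, so do not run them.
            results.append(StepResult(name, "NOT RUN", "blocked by earlier failure"))
            continue
        ok, evidence = check()
        results.append(StepResult(name, "PASS" if ok else "FAIL", evidence))
        failed = not ok
    return results


# Stubbed checks, ordered infrastructure -> data -> API -> flow.
steps = [
    ("Infrastructure", lambda: (True, "services responding")),
    ("DB migration", lambda: (True, "new column exists")),
    ("API contract", lambda: (False, "field name mismatch")),
    ("Full flow", lambda: (True, "never reached in this run")),
]
for result in run_bottom_up(steps):
    print(result.name, result.status, result.evidence)
```

Recording skipped steps as "NOT RUN" rather than omitting them keeps the later report honest about coverage.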
## E2E Verification: [flow name]
### Scope
[What was changed and why e2e is needed]
### Results
| Step | Status | Evidence |
|------|--------|----------|
| Infrastructure | PASS | Services responding on expected ports |
| DB migration | PASS | New column exists, default values correct |
| API contract | FAIL | Payment service returns `txn_id`, order service expects `transaction_id` |
### Blocking Issue
[Description of first failure, root cause if known]
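If results are collected programmatically, the template above can be filled in by a short helper. The sketch below assumes results arrive as `(step, status, evidence)` tuples; `render_report` and the sample values are hypothetical and only mirror the template's layout.

```python
from typing import Optional


def render_report(flow_name: str,
                  scope: str,
                  results: list[tuple[str, str, str]],
                  blocking_issue: Optional[str] = None) -> str:
    """Render results in the same layout as the report template."""
    lines = [
        f"## E2E Verification: {flow_name}",
        "### Scope",
        scope,
        "### Results",
        "| Step | Status | Evidence |",
        "|------|--------|----------|",
    ]
    lines += [f"| {step} | {status} | {evidence} |"
              for step, status, evidence in results]
    if blocking_issue:
        # Only include this section when a step actually failed.
        lines += ["### Blocking Issue", blocking_issue]
    return "\n".join(lines)


print(render_report(
    "order checkout",
    "Changed payment response handling; crosses a service boundary.",
    [("Infrastructure", "PASS", "Services responding on expected ports"),
     ("API contract", "FAIL", "Field name mismatch between services")],
    blocking_issue="API contract check failed; root cause not yet known.",
))
```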
Not every step is directly verifiable from your environment. Be explicit about what you CAN and CANNOT check.
| Can verify | How |
|---|---|
| API responses | curl, httpie, test scripts |
| DB state | SQL queries, ORM console |
| Log output | grep logs, structured log queries |
| File output | Read generated files |
| Local email | Mailhog, Mailtrap, or similar |
| Cannot verify (delegate) | Who |
|---|---|
| Production email delivery | Human or monitoring |
| Browser UI rendering | Human or Playwright/Cypress |
| Mobile app behavior | Human or mobile test framework |
| Third-party webhook reception | Third-party dashboard or logs |
State what you verified and what you couldn't. Never claim "e2e verified" when steps were skipped.
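The rule about never overclaiming can be made mechanical. In this sketch, `verification_claim` is a hypothetical helper that only produces the phrase "e2e verified" when nothing was left unchecked; the step names and delegates in the example are taken from the tables above.

```python
def verification_claim(verified: list[str], unverified: dict[str, str]) -> str:
    """Summarise coverage without overclaiming.

    `unverified` maps each unchecked step to whoever should check it.
    """
    if not verified:
        return "not verified: no steps were checked"
    checked = ", ".join(verified)
    if not unverified:
        return f"e2e verified: {checked}"
    delegated = "; ".join(f"{step} (delegate to: {who})"
                          for step, who in unverified.items())
    return f"partially verified: {checked}. Not verified: {delegated}"


print(verification_claim(
    ["API responses", "DB state"],
    {"Production email delivery": "human or monitoring"},
))
```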
Checking only the happy path. The happy path usually works. It's the error paths and edge cases that break across service boundaries.
Vague pass criteria. "It works" is not a criterion. State the specific expected output, status code, DB state, or side effect.
Testing in the wrong order. If the DB migration is broken, API tests will fail with confusing errors. Go bottom-up.
Skipping the report. If you don't report what was and wasn't verified, your partner can't judge coverage.
Over-verifying. A typo fix doesn't need e2e. Use the decision flowchart. Unnecessary e2e wastes time and teaches you to skip it when it matters.