Forces architectural reassessment after 2+ failed fixes, whack-a-mole symptoms, or cascading errors. Inventories attempts, finds shared systems, researches via Perplexity. Proactive for infra config.
```shell
npx claudepluginhub anombyte93/atlas-session-lifecycle
```

This skill uses the workspace's default tool permissions.
**Core principle:** If you've tried 2+ fixes for the same class of problem, you're probably patching symptoms. Stop. Zoom out. Research the architecture.
```dot
digraph stepback {
  "Fix attempt failed?" [shape=diamond];
  "Is this the 2nd+ attempt?" [shape=diamond];
  "Same root system?" [shape=diamond];
  "Keep debugging" [shape=box];
  "STOP — Run /stepback" [shape=box, style=bold];
  "Fix attempt failed?" -> "Is this the 2nd+ attempt?" [label="yes"];
  "Fix attempt failed?" -> "Keep debugging" [label="no, first try"];
  "Is this the 2nd+ attempt?" -> "Same root system?" [label="yes"];
  "Is this the 2nd+ attempt?" -> "Keep debugging" [label="no"];
  "Same root system?" -> "STOP — Run /stepback" [label="yes"];
  "Same root system?" -> "Keep debugging" [label="different systems"];
}
```
Symptoms that demand /stepback:
Proactive triggers (don't wait for failure): changes to `next.config.js`, `vercel.json`, `Dockerfile`, or CI/CD configuration.

List every fix attempt so far in this session. For each, record the fix, the symptom it targeted, the result, and the assumption behind it.
Present this to the user as a table:
| # | Fix | Symptom | Result | Assumption |
|---|-----|---------|--------|------------|
| 1 | Added route to middleware whitelist | 405 on webhook | Still 405 | Middleware was blocking |
| 2 | Replaced parseBody with manual HMAC | 500 crash | Still 500 | parseBody incompatible |
| 3 | ... | ... | ... | ... |
Ask: "What system do ALL of these symptoms share?"
Don't look at each symptom individually. Look for what they all share.
Spawn a research agent (Perplexity) with THREE queries:
This is mandatory. No skipping research because "I think I know."
Instead of testing your specific broken feature, test whether the ENTIRE CLASS of features works:
Before: "Does /api/revalidate work?"
After: "Does ANY API route work on production?"
If the broader test fails, you've found a systemic issue — fix THAT, not the individual symptom.
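As a hedged sketch of this class-level test (the route paths and classification thresholds here are illustrative placeholders, not from the original session), you can probe a whole family of routes and ask whether the failure is systemic:

```python
# Sketch: probe a whole class of routes, not just the broken one.
# Base URL and route paths are hypothetical placeholders.
import urllib.request
import urllib.error


def probe(base_url, paths, timeout=5):
    """Return {path: HTTP status} for each route (0 on connection failure)."""
    results = {}
    for path in paths:
        try:
            with urllib.request.urlopen(base_url + path, timeout=timeout) as resp:
                results[path] = resp.status
        except urllib.error.HTTPError as err:
            results[path] = err.code  # server answered, but with an error status
        except OSError:
            results[path] = 0  # no answer at all
    return results


def classify(results):
    """'systemic' if every route fails, 'isolated' if only some, else 'healthy'."""
    failing = [p for p, status in results.items() if status == 0 or status >= 400]
    if not failing:
        return "healthy"
    return "systemic" if len(failing) == len(results) else "isolated"
```

If `classify` returns `"systemic"`, stop debugging the single endpoint and investigate the layer all routes share (build output, middleware, deployment pipeline).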
Show the user:
**Symptom-level fix:** [what you've been doing]
**Architecture-level fix:** [what research suggests]
**Effort comparison:** [patch N symptoms vs fix 1 root cause]
**Recommendation:** [which approach and why]
Let the user decide. Don't assume redesign is always better — sometimes patching IS correct.
| Signal | What It Means |
|---|---|
| "Let me just try one more thing" | You're guessing, not diagnosing |
| "This fix should definitely work" | You said that last time |
| "It works locally" | Local ≠ production. Check the deployment pipeline |
| "I'll add this to the whitelist/allowlist" | You're growing a list instead of questioning the filter |
| "The code is correct, something else is wrong" | Check if the code is even RUNNING |
| 3+ commits with "fix:" in a row | Pattern detected — step back |
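The last signal in the table can be checked mechanically. A minimal sketch, assuming conventional-commit subjects and the 3-commit threshold from the table (the helper names are hypothetical; feed it the output of `git log --format=%s`):

```python
# Sketch: detect a streak of consecutive "fix"-prefixed commits.
# Commit subjects come newest-first, e.g. from: git log --format=%s -n 10
import subprocess


def fix_streak(subjects):
    """Length of the leading run of commits whose subject starts with 'fix'."""
    streak = 0
    for subject in subjects:
        if subject.lower().startswith("fix"):
            streak += 1
        else:
            break
    return streak


def should_step_back(repo_dir=".", threshold=3):
    """True if the most recent `threshold`+ commits are all fix commits."""
    out = subprocess.run(
        ["git", "log", "--format=%s", "-n", "10"],
        cwd=repo_dir, capture_output=True, text=True, check=True,
    ).stdout
    return fix_streak(out.splitlines()) >= threshold
```

A streak at or above the threshold is exactly the "pattern detected" signal: stop committing fixes and run /stepback.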
When running /stepback, pretend you're a new consultant reviewing this system for the first time. Ask:
The answer is almost always in the DELTA between standard and custom.
Session 2026-02-13 — Atlas Website webhook
5 fix attempts over 2 hours: middleware whitelist, parseBody replacement, CSP headers, dead component cleanup, deployment protection check. All were real issues but none were THE issue.
The outsider test revealed: outputFileTracingRoot in next.config.js was set to a hardcoded local path (/home/anombyte/Atlas/Atlas_Website). On Vercel's build server, this path doesn't exist, so Next.js silently generated ZERO serverless functions. No API routes worked — not just the webhook.
One line removed. Everything worked.
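For reference, the failure mode might have looked like this in `next.config.js` (a reconstruction for illustration, not the original file):

```javascript
// next.config.js — reconstruction of the failure mode, not the original file.
const path = require('path');

module.exports = {
  // BROKEN: hardcoded local path. On Vercel's build server this directory
  // doesn't exist, so file tracing silently emits zero serverless functions.
  // outputFileTracingRoot: '/home/anombyte/Atlas/Atlas_Website',

  // Portable form, if the option is genuinely needed (e.g. in a monorepo):
  outputFileTracingRoot: path.join(__dirname),
};
```

In the session above the correct fix was simply deleting the line; the portable form is shown only for cases where the option is actually required.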
The 5 "fixes" were real improvements (cleaner middleware, better CSP, removed dead code) — but they would have taken 10 minutes as planned cleanup, not 2 hours of confused debugging.