From swing-skills
Conducts devil's advocate stress-testing on code, architecture, PRs, and decisions to surface hidden flaws via structured adversarial analysis. For high-stakes reviews only.
npx claudepluginhub thestack-ai/swing-skills
Structured Devil's Advocate analysis that surfaces hidden flaws, edge cases, and blind spots.
If the subject under review is unclear or too broad, ask one clarifying question before proceeding. Do not review a vague target. Examples of ambiguous input that should trigger a clarification question:
One question. Get the answer. Then proceed.
Before any criticism, articulate:
This ensures the subsequent critique is intellectually honest, not reflexive opposition.
Apply three independent attack vectors simultaneously:
**Logical Soundness.** Scope: Does the REASONING hold? Examine premises, conclusions, logical flow.
**Edge Case.** Scope: Does it SURVIVE reality? Test against real-world conditions.
**Structural Integrity.** Scope: Is the STRUCTURE sound? Examine architecture and design.
Classify every finding:
| Severity | Symbol | Meaning | Action Required |
|---|---|---|---|
| Critical | 🔴 | Will cause production issues, security vulnerabilities, or data loss | Must fix before merge/deploy |
| Major | 🟠 | Significant risk, performance issue, or maintainability problem | Should fix, blocking for merge |
| Minor | 🟡 | Code smell, style issue, or small optimization opportunity | Consider fixing, non-blocking |
| Note | 💡 | Observation, alternative approach, or future consideration | Informational only |
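For tooling that consumes these reviews, the severity table can be modelled as a small enum. The following is a minimal sketch, not part of the skill itself: the names `Severity`, `Finding`, and `summarize` are hypothetical, and treating Major as blocking follows the "blocking for merge" wording in the table.

```python
from collections import Counter
from dataclasses import dataclass
from enum import Enum


class Severity(Enum):
    # (symbol, blocking) pairs taken from the severity table.
    CRITICAL = ("🔴", True)   # must fix before merge/deploy
    MAJOR = ("🟠", True)      # should fix, blocking for merge
    MINOR = ("🟡", False)     # consider fixing, non-blocking
    NOTE = ("💡", False)      # informational only

    @property
    def symbol(self) -> str:
        return self.value[0]

    @property
    def blocking(self) -> bool:
        return self.value[1]


@dataclass(frozen=True)
class Finding:
    # Hypothetical minimal record; a real finding carries more fields.
    severity: Severity
    title: str


def summarize(findings):
    """Return a Markdown table counting findings per severity."""
    counts = Counter(f.severity for f in findings)
    rows = ["| Severity | Count |", "|----------|-------|"]
    for sev in Severity:  # iterates in declaration order
        rows.append(f"| {sev.symbol} {sev.name.title()} | {counts[sev]} |")
    return "\n".join(rows)
```

Calling `summarize` on a list of findings yields one row per severity, including severities with a count of zero.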
For each Critical and Major finding, provide:
## Adversarial Review: [Subject]
### Steel-Man
> [Why this approach makes sense — strongest justification]
### Findings
#### 🔴 Critical: [Title]
**Vector:** [Logical Soundness / Edge Case / Structural Integrity]
**What:** [Description]
**Impact:** [Concrete consequence]
**Fix:** [Proposed solution]
**Trade-off:** [Cost of the fix]
#### 🟠 Major: [Title]
...
#### 🟡 Minor: [Title]
...
#### 💡 Note: [Title]
...
### Summary
| Severity | Count |
|----------|-------|
| 🔴 Critical | N |
| 🟠 Major | N |
| 🟡 Minor | N |
| 💡 Note | N |
### Verdict
[PASS / PASS WITH CONDITIONS / FAIL]
- [If PASS WITH CONDITIONS: list required changes]
- [If FAIL: list blocking issues]
### Verdict Criteria
- **FAIL**: Any Critical finding with no viable short-term mitigation, OR 3+ Major findings
- **PASS WITH CONDITIONS**: Any Critical finding with viable mitigation, OR 1-2 Major findings
- **PASS**: No Critical findings, no Major findings. Minor and Notes only.
These thresholds ensure consistent verdicts across invocations.
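The verdict criteria above can be sketched as a small function. This is an illustrative reading, not the skill's own implementation: the function name `verdict` and its parameters are hypothetical, and checking FAIL before PASS WITH CONDITIONS is an assumption for inputs that match both rules (for example, one mitigable Critical plus three Major findings).

```python
def verdict(critical_mitigable, major_count):
    """Map findings to a verdict string.

    critical_mitigable: one bool per Critical finding, True when a viable
        short-term mitigation exists (hypothetical input shape).
    major_count: number of Major findings.
    Minor findings and Notes never affect the verdict.
    """
    # FAIL: any Critical without a viable mitigation, or 3+ Major findings.
    if any(not ok for ok in critical_mitigable) or major_count >= 3:
        return "FAIL"
    # PASS WITH CONDITIONS: any remaining Critical, or 1-2 Major findings.
    if critical_mitigable or major_count >= 1:
        return "PASS WITH CONDITIONS"
    # PASS: no Critical and no Major findings.
    return "PASS"
```

For a review with one Critical finding that has a viable fix and one Major finding, `verdict([True], 1)` returns `"PASS WITH CONDITIONS"`.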
### Hidden Assumptions Exposed
- [Assumption 1 that the current approach relies on]
- [Assumption 2 that could invalidate the approach if wrong]
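If findings are produced programmatically, a single entry of the template above could be rendered as follows. This is a sketch under assumptions: `FindingDetail` and `render_finding` are hypothetical names, and rejecting unknown severities or vectors is a design choice made here, not something the template prescribes.

```python
from dataclasses import dataclass

# Allowed values copied from the template and the severity table.
VECTORS = ("Logical Soundness", "Edge Case", "Structural Integrity")
SYMBOLS = {"Critical": "🔴", "Major": "🟠", "Minor": "🟡", "Note": "💡"}


@dataclass
class FindingDetail:
    severity: str
    title: str
    vector: str
    what: str
    impact: str
    fix: str
    trade_off: str


def render_finding(f: FindingDetail) -> str:
    """Render one finding in the template's Markdown layout."""
    if f.severity not in SYMBOLS:
        raise ValueError(f"unknown severity: {f.severity!r}")
    if f.vector not in VECTORS:
        raise ValueError(f"unknown vector: {f.vector!r}")
    return "\n".join([
        f"#### {SYMBOLS[f.severity]} {f.severity}: {f.title}",
        f"**Vector:** {f.vector}",
        f"**What:** {f.what}",
        f"**Impact:** {f.impact}",
        f"**Fix:** {f.fix}",
        f"**Trade-off:** {f.trade_off}",
    ])
```

Validating the vector up front keeps every finding tied to one of the three attack vectors instead of a free-form label.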
## Adversarial Review: User Auth Module
### Steel-Man
> It works.
### Findings
#### 🟡 Minor: Variable naming
**Vector:** Structural Integrity
**What:** Some variables could be named better.
**Impact:** Readability.
**Fix:** Rename them.
**Trade-off:** Time.
#### 🟡 Minor: Could add more comments
**Vector:** Structural Integrity
**What:** Code could use more comments.
**Impact:** Future developers might be confused.
**Fix:** Add comments.
**Trade-off:** None.
#### 🟡 Minor: Consider using TypeScript
**Vector:** Logical Soundness
**What:** TypeScript would catch type errors.
**Impact:** Fewer runtime bugs.
**Fix:** Migrate to TypeScript.
**Trade-off:** Migration effort.
### Verdict: PASS
Why this is bad:
## Adversarial Review: User Auth Module
### Steel-Man
> JWT-based stateless auth was chosen to avoid session storage overhead and
> enable horizontal scaling. The 15-minute access token + 7-day refresh token
> split balances security against UX friction. Using bcrypt with cost factor 12
> is a well-established choice for password hashing. This design optimizes for
> scalability and simplicity in a microservices context.
### Findings
#### 🔴 Critical: No refresh token rotation enables silent session hijacking
**Vector:** Edge Case
**What:** Refresh tokens are long-lived (7 days) and not rotated on use.
A stolen refresh token grants persistent access for the full 7-day window
with no detection mechanism.
**Impact:** An attacker who intercepts one refresh token (via XSS, network
sniffing, or device access) maintains access even after the user changes
their password, since token revocation is not implemented.
**Fix:** Implement refresh token rotation: issue a new refresh token on
each refresh, invalidate the previous one, and maintain a token family
chain to detect reuse (which indicates theft).
**Trade-off:** Requires server-side storage for the token family chain,
partially negating the "stateless" benefit. Adds ~50ms per refresh request.
#### 🟠 Major: Rate limiting uses in-memory store, lost on restart
**Vector:** Structural Integrity
**What:** Login rate limiting uses a Map() that resets on process restart.
**Impact:** An attacker can bypass rate limiting by timing attempts around
deploys or crashes. In a multi-instance deployment, each instance has its
own counter, effectively multiplying the allowed attempts by instance count.
**Fix:** Move rate limit state to Redis with TTL-based expiry.
**Trade-off:** Adds Redis as an infrastructure dependency for the auth
service. ~2ms latency per rate limit check.
### Verdict: PASS WITH CONDITIONS
- Must implement refresh token rotation before production deploy
- Should migrate rate limiting to shared store before scaling to >1 instance
Why this is good:
When reviewing code (files, PRs, diffs):
When reviewing architecture/design decisions:
When reviewing pull requests:
Related skills: swing-research, swing-options. Pipeline: swing-research → swing-review