Adversarial review - spawn agents that attack your plan/code
Spawns adversarial agents that attack your plan or code to find flaws.
/plugin marketplace add cyberbloke9/pmp-gywd
/plugin install pmp-gywd@pmp-gywd

Arguments: [plan-path|code-path|'current'] [--mode light|standard|aggressive]

<objective>
Current AI cooperates with you. That's the problem. Cooperation without conflict produces groupthink and blind spots.
This command spawns agents with different adversarial roles: Critic, Devil's Advocate, Red Team, Chaos, and Skeptic.
Solutions that survive this gauntlet are actually robust.
</objective>
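The argument hint near the top of this command, `[plan-path|code-path|'current'] [--mode light|standard|aggressive]`, could be parsed along the following lines. This is a minimal sketch, not the plugin's actual parser: the `parse_challenge_args` helper is hypothetical, and defaulting the target to `current` and the mode to `standard` are assumptions.

```python
import argparse

MODES = ("light", "standard", "aggressive")


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="challenge")
    # Optional positional target: a plan path, a code path, or the literal 'current'.
    # Assumption: 'current' is used when no target is given.
    parser.add_argument("target", nargs="?", default="current")
    # Assumption: 'standard' is the default mode; the source does not say.
    parser.add_argument("--mode", choices=MODES, default="standard")
    return parser


def parse_challenge_args(argv: list[str]) -> argparse.Namespace:
    """Parse a list of command arguments into a namespace with target and mode."""
    return build_parser().parse_args(argv)


if __name__ == "__main__":
    args = parse_challenge_args(["src/services/payment/", "--mode", "aggressive"])
    print(args.target, args.mode)
```

Restricting `--mode` with `choices` means an unknown mode fails at parse time rather than silently running with the wrong agent set.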
<philosophy>
"Strong opinions, weakly held" requires someone to challenge those opinions.

The best human teams have:
AI teams should too. </philosophy>
<agents>
## Adversarial Agent Roles

### Critic Agent
Mission: Find logical flaws, gaps, and weaknesses.
Looks for:
Output style:
## Critic Report
### Critical Issues
1. **Task 3 has no rollback plan**
- If database migration fails, system is in inconsistent state
- Recommendation: Add compensating transaction
### Concerns
1. **Caching strategy assumes low write frequency**
- Evidence: Cache TTL is 1 hour
- Risk: Stale data if writes increase
- Question: What's the expected write pattern?
### Minor Issues
1. **No rate limiting on public endpoint**
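The Critic's recommendation to add a compensating transaction can be illustrated with a small sketch. It assumes each step is paired with an undo action; the `run_with_compensation` helper and the step names are hypothetical and are not part of GYWD.

```python
from typing import Callable

# (name, action, compensation): the compensation undoes a completed action.
Step = tuple[str, Callable[[], None], Callable[[], None]]


def run_with_compensation(steps: list[Step]) -> list[str]:
    """Run steps in order; on failure, undo completed steps in reverse order."""
    log: list[str] = []
    completed: list[Step] = []
    for step in steps:
        name, action, _ = step
        try:
            action()
        except Exception as exc:
            log.append(f"failed:{name}:{exc}")
            for done_name, _, compensate in reversed(completed):
                compensate()
                log.append(f"compensated:{done_name}")
            return log
        completed.append(step)
        log.append(f"done:{name}")
    return log


def demo() -> tuple[list[str], dict]:
    """Simulate a migration whose second step fails after the first succeeded."""
    state = {"schema_version": 1}

    def bump_schema() -> None:
        state["schema_version"] = 2

    def revert_schema() -> None:
        state["schema_version"] = 1

    def copy_rows() -> None:
        raise RuntimeError("disk full")

    steps: list[Step] = [
        ("bump_schema", bump_schema, revert_schema),
        ("copy_rows", copy_rows, lambda: None),
    ]
    return run_with_compensation(steps), state
```

In `demo`, the failed copy triggers `revert_schema`, so the system returns to schema version 1 instead of staying in an inconsistent state.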
### Devil's Advocate Agent
Mission: Argue for the alternatives you rejected.
Takes the opposite position:
Output style:
## Devil's Advocate Report
### Alternative: Use PostgreSQL Instead of MongoDB
You chose MongoDB for "flexibility." But consider:
**Arguments for PostgreSQL:**
1. Your data IS relational (users → orders → items)
2. ACID transactions would prevent the race condition in Task 4
3. Team has more PostgreSQL experience
4. JSON columns provide flexibility without sacrificing joins
**Counter to your reasoning:**
- "Schema flexibility" → You'll end up with implicit schema anyway
- "Easier scaling" → You're not at scale that requires sharding
**Verdict:** This decision should be revisited.
Confidence: 72%
### Red Team Agent
Mission: Try to break the implementation.
Attack vectors:
Output style:
## Red Team Report
### Attack: SQL Injection via Order ID
**Vector:** GET /api/orders/{id}
**Payload:** `1; DROP TABLE orders;--`
**Result:** Query executed without sanitization
**Severity:** CRITICAL
### Attack: Rate Limit Bypass
**Vector:** Rotate IP addresses
**Result:** Rate limiting is per-IP only
**Severity:** HIGH
### Attack: Memory Exhaustion
**Vector:** Upload 10GB file to /api/upload
**Result:** No file size limit, OOM crash
**Severity:** HIGH
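As an illustration of how the SQL injection finding in the sample report might be addressed, here is a minimal sketch using Python's built-in `sqlite3` module. The `orders` table layout, the `make_demo_db` helper, and the `fetch_order` function are assumptions made for the example; they do not describe any real codebase.

```python
import sqlite3
from typing import Optional


def make_demo_db() -> sqlite3.Connection:
    """Create an in-memory database with one hypothetical order."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, item TEXT)")
    conn.execute("INSERT INTO orders (id, item) VALUES (?, ?)", (1, "widget"))
    conn.commit()
    return conn


def fetch_order(conn: sqlite3.Connection, order_id: str) -> Optional[tuple]:
    """Look up an order by ID without ever splicing input into the SQL text."""
    # Reject anything that is not a plain ASCII integer before querying.
    if not (order_id.isascii() and order_id.isdigit()):
        return None
    # The ? placeholder binds the value separately from the statement,
    # so the input is treated as data and never parsed as SQL.
    cursor = conn.execute(
        "SELECT id, item FROM orders WHERE id = ?", (int(order_id),)
    )
    return cursor.fetchone()


if __name__ == "__main__":
    db = make_demo_db()
    print(fetch_order(db, "1"))
    print(fetch_order(db, "1; DROP TABLE orders;--"))
```

Input validation and parameter binding are layered here on purpose: either one alone stops the payload from the report, and together they cover each other's gaps.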
### Chaos Agent
Mission: Generate edge cases and weird scenarios.
Generates:
Output style:
## Chaos Report
### Edge Cases for User Registration
| Input | Expected | Potential Issue |
|-------|----------|-----------------|
| Email: "" | Reject | What error message? |
| Email: "a@b" | ? | Technically valid TLD |
| Name: "" | ? | Zero-width space only |
| Name: "Robert'); DROP TABLE users;--" | Sanitize | SQL injection |
| Password: 1000 chars | ? | Length limit needed |
| Timezone: "Mars/Olympus" | ? | Invalid timezone handling |
| DOB: "2099-01-01" | ? | Future date validation |
| DOB: "1800-01-01" | ? | Suspiciously old |
### Concurrency Scenarios
| Scenario | Risk |
|----------|------|
| Same user registers twice simultaneously | Duplicate accounts? |
| Email verification during password reset | Token confusion? |
| Session delete during active request | Auth state corruption? |
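Several rows of the edge-case table above translate directly into validation checks. The sketch below covers the invisible-name, long-password, and date-of-birth rows only. The limits (`MAX_PASSWORD_LENGTH`, `OLDEST_PLAUSIBLE_YEAR`) and the `validate_registration` function are hypothetical choices for illustration, not requirements stated by the report.

```python
import unicodedata
from datetime import date

MAX_PASSWORD_LENGTH = 128      # hypothetical limit
OLDEST_PLAUSIBLE_YEAR = 1900   # hypothetical cutoff


def visible_text(value: str) -> str:
    """Drop invisible format characters (such as U+200B) and trim whitespace."""
    kept = (ch for ch in value if unicodedata.category(ch) != "Cf")
    return "".join(kept).strip()


def validate_registration(
    name: str, password: str, dob: date, today: date
) -> list[str]:
    """Return a list of human-readable problems; an empty list means valid."""
    errors: list[str] = []
    if not visible_text(name):
        errors.append("name is empty or invisible")
    if len(password) > MAX_PASSWORD_LENGTH:
        errors.append("password too long")
    if dob > today:
        errors.append("date of birth is in the future")
    elif dob.year < OLDEST_PLAUSIBLE_YEAR:
        errors.append("date of birth is implausibly old")
    return errors
```

Passing `today` explicitly keeps the function deterministic, which makes each table row easy to turn into a repeatable test case.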
### Skeptic Agent
Mission: Question the requirements and assumptions.
Challenges:
Output style:
## Skeptic Report
### Questioning: Real-time Notifications Feature
**Assumption challenged:** "Users need instant notifications"
**Questions:**
1. What data shows users want real-time vs. batched?
2. What's the cost of WebSocket infrastructure?
3. Would email digest achieve 80% of the value at 20% cost?
**Evidence gap:**
- No user research cited in requirements
- No A/B test proposed
- No success metrics defined
**Recommendation:** Validate assumption before building.
Suggest: User interviews or fake-door test first.
</agents>
<modes>
## Challenge Modes
Parse target:
Load context:
Spawn agents in parallel:
Task: Critic Agent analyzing plan...
Task: Devil's Advocate Agent analyzing decisions...
Task: Red Team Agent analyzing security...
Task: Chaos Agent generating edge cases...
Collect and synthesize:
Generate challenge report:
## Challenge Report: Phase 3 Plan
**Mode:** Standard
**Agents:** 4
**Issues Found:** 12
### Consensus Issues (All agents agree)
- No rollback strategy for migration
### Contested Issues (Agents disagree)
- Database choice (Critic: ok, Devil's Advocate: reconsider)
### By Severity
- 🔴 Critical: 2
- 🟠 High: 4
- 🟡 Medium: 4
- 🟢 Low: 2
### Recommended Actions
1. Add migration rollback plan
2. Revisit database decision with team
3. Add rate limiting before deploy
Offer resolution paths:
What would you like to do?
1. Address critical issues now
2. Add issues to plan as tasks
3. Dismiss with documented reason
4. Run aggressive mode for deeper analysis
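The "spawn agents in parallel" and "collect and synthesize" steps above can be sketched with `asyncio`. This is an illustration under stated assumptions, not GYWD's implementation: `run_agent` is a stand-in that returns canned findings instead of calling a real agent, the `Finding` fields are hypothetical, and "consensus" is simplified to "every agent raised the same issue text". Contested issues, where agents disagree on a verdict, are not modeled.

```python
import asyncio
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Finding:
    agent: str
    issue: str
    severity: str  # "critical", "high", "medium", or "low"


async def run_agent(name: str, canned: list[tuple[str, str]]) -> list[Finding]:
    # Stand-in for a real agent call; it only wraps canned (issue, severity) pairs.
    await asyncio.sleep(0)
    return [Finding(name, issue, severity) for issue, severity in canned]


async def challenge(agents: dict[str, list[tuple[str, str]]]) -> dict:
    # Spawn every agent concurrently and wait for all of them to finish.
    batches = await asyncio.gather(
        *(run_agent(name, canned) for name, canned in agents.items())
    )
    findings = [finding for batch in batches for finding in batch]

    # Track which agents raised each issue to detect consensus.
    raised_by: dict[str, set[str]] = {}
    for finding in findings:
        raised_by.setdefault(finding.issue, set()).add(finding.agent)
    consensus = sorted(
        issue for issue, who in raised_by.items() if len(who) == len(agents)
    )

    # Count each distinct issue once; the last reported severity wins.
    severity_of = {finding.issue: finding.severity for finding in findings}
    return {
        "issues_found": len(severity_of),
        "consensus": consensus,
        "by_severity": dict(Counter(severity_of.values())),
    }


if __name__ == "__main__":
    demo_agents = {
        "critic": [("No rollback strategy for migration", "critical")],
        "red_team": [("No rollback strategy for migration", "critical")],
    }
    print(asyncio.run(challenge(demo_agents)))
```

Deduplicating by issue before counting keeps the "Issues Found" total from being inflated when several agents report the same problem.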
<output_files>
Creates .planning/challenges/ with:
.planning/challenges/
├── {date}-{target}-challenge.md # Full report
├── unresolved.md # Tracked issues
└── dismissed.md # Dismissed with reasons
Unresolved issues integrate with /gywd:consider-issues.
</output_files>
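The output layout described above could be produced along these lines. This is a sketch under assumptions: the source gives the file name pattern `{date}-{target}-challenge.md` but not the date format or how the target is turned into a file name, so the ISO date and the slug rule here are guesses, and `write_challenge_report` is a hypothetical helper.

```python
from datetime import date
from pathlib import Path


def write_challenge_report(root: Path, target: str, report: str, today: date) -> Path:
    """Write the full report and make sure the tracking files exist."""
    challenges = root / ".planning" / "challenges"
    challenges.mkdir(parents=True, exist_ok=True)

    # Assumption: ISO date plus a slug where non-alphanumerics become hyphens.
    slug = "".join(ch if ch.isalnum() else "-" for ch in target).strip("-")
    report_path = challenges / f"{today.isoformat()}-{slug}-challenge.md"
    report_path.write_text(report, encoding="utf-8")

    # Create the tracking files only if missing, so earlier entries survive.
    for name in ("unresolved.md", "dismissed.md"):
        (challenges / name).touch(exist_ok=True)
    return report_path
```

Using `touch(exist_ok=True)` rather than rewriting the tracking files means repeated challenge runs accumulate history instead of erasing it.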
<integration>
/gywd:challenge .planning/phases/03-payment/03-01-PLAN.md
# Review issues
# Decide which to address
/gywd:execute-plan .planning/phases/03-payment/03-01-PLAN.md
/gywd:plan-phase 3 --with-challenge
# Plan created AND challenged in one step
/gywd:challenge src/services/payment/
# Adversarial code review
/gywd:challenge --watch
# Challenge every commit automatically
</integration>
<success_criteria>