From test-writing
Performs adversarial review of PHPUnit unit test consensus: independently scans tests/source code first, then challenges weak findings, resurrects withdrawn ones, and uncovers missed violations.
npx claudepluginhub shopwarelabs/ai-coding-tools --plugin test-writing
How this skill is triggered: by the user, by Claude, or both
Slash command
/test-writing:phpunit-unit-test-adversarial-reviewing
This skill is limited to the following tools:
The summary Claude sees in its skill listing — used to decide when to auto-load this skill
Stress-tests reviewer consensus by forming independent judgment before exposure to findings, then challenging weak consensus, resurrecting premature withdrawals, and discovering missed violations.
Stress-tests reviewer consensus by forming independent judgment before exposure to findings, then challenging weak consensus, resurrecting premature withdrawals, and discovering missed violations.
The adversarial reviewer operates on a different cognitive model than the standard reviewer. Where the reviewer applies rules systematically group-by-group, the adversary:
Input: Consensus package (required) + optional pre-formed impressions from team idle time.
Output: Structured challenges report per output-format.md.
Skip condition: If impressions input is provided (pre-formed by the adversary during idle time in team context), skip this phase entirely and proceed to Phase 2.
Read each assigned test file and its source class (from #[CoversClass]). Do NOT use MCP rule tools (list_rules, get_rules) in this phase.
Load intuitive-scan-guidance.md for heuristic lenses, then for each file:
Output per file:
impressions:
  - file_path: tests/unit/Path/To/ClassTest.php
    concerns:
      - area: "brief description of concern"
        severity: high | medium | low
Parse the consensus package provided as input:
consensus_findings, withdrawn_findings, and debate_transcript per file
The consensus package follows the format defined in the team-reviewing skill's red-team-context.md.
Load comparison-strategies.md. For each file, contrast Phase 1 impressions against Phase 2 consensus:
Intuition-consensus gaps — Phase 1 concerns that no reviewer raised. These are the highest-value candidates for new findings. For each unmatched concern, note which area of the code it targets.
Weak consensus findings — for each consensus finding, apply the "would this survive harder pushback?" test:
Premature withdrawals — for each withdrawn finding, check:
Assumption excavation — for each consensus finding, state the unstated premise:
Output: prioritized list of candidate challenges, resurrections, and new findings — not yet evidence-backed.
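The intuition-consensus gap check above can be read as a set difference. The following is a minimal sketch, not the skill's actual procedure: it assumes each consensus finding has been tagged with a hypothetical `area` label that can be compared with the Phase 1 `area` strings, and it assumes gaps are ordered by severity. In practice the matching is a judgment call rather than string equality.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Concern:
    area: str      # Phase 1 impression area
    severity: str  # "high" | "medium" | "low"


def intuition_consensus_gaps(concerns, consensus_areas):
    """Return Phase 1 concerns whose area no reviewer raised.

    `consensus_areas` is a hypothetical set of area labels derived from the
    consensus findings; the skill does not define this field.
    """
    order = {"high": 0, "medium": 1, "low": 2}
    unmatched = [c for c in concerns if c.area not in consensus_areas]
    # Assumption: list higher-severity gaps first.
    return sorted(unmatched, key=lambda c: order[c.severity])


concerns = [
    Concern("exception path untested", "medium"),
    Concern("mock returns hide real logic", "high"),
    Concern("data provider naming", "low"),
]
gaps = intuition_consensus_gaps(concerns, {"data provider naming"})
```

Here `gaps` holds the two concerns that no reviewer raised, with the high-severity one first.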
For each candidate from Phase 3 (starting with highest-priority):
mcp__plugin_test-writing_test-rules__list_rules(test_type=unit, test_category={category}) to discover applicable rules in the area of concern
mcp__plugin_test-writing_test-rules__get_rules(ids={relevant rule IDs}) to load detection algorithms
Promotion gate: promote a candidate to a formal challenge ONLY if a detection algorithm substantiates it. Drop candidates where the evidence doesn't hold up. This is the filter against contrarianism: intuition proposes, evidence disposes.
Endorsement: consensus findings that Phase 1 intuition independently confirmed AND that have strong detection algorithm support get endorsed. Endorsements are part of the output — they strengthen findings in the final report.
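The promotion gate and the endorsement rule are both simple conjunctions, which a short sketch can make explicit. This is an illustration in Python under stated assumptions: the `Candidate` type and its `substantiated` flag are hypothetical stand-ins for the outcome of checking a candidate against a loaded detection algorithm, and are not part of the skill.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    rule_id: str
    substantiated: bool  # hypothetical: did a detection algorithm support it?


def apply_promotion_gate(candidates):
    """Keep only candidates that a detection algorithm substantiates."""
    return [c for c in candidates if c.substantiated]


def should_endorse(confirmed_by_intuition, strong_algorithm_support):
    """Endorse a consensus finding only when both conditions hold."""
    return confirmed_by_intuition and strong_algorithm_support


candidates = [Candidate("CONV-004", True), Candidate("DESIGN-005", False)]
promoted = apply_promotion_gate(candidates)
```

Only `CONV-004` is promoted; the unsubstantiated candidate is dropped rather than reported as a weaker challenge.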
Only applicable when reviewing multiple files. Compare patterns across all assigned files:
For each rule_id that appears in any file's consensus, check if the same pattern exists in other files:
Compare treatment of similar code patterns:
Cross-file inconsistencies use the same promotion gate as Phase 4 — cite the detection algorithm.
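One way to picture the cross-file comparison is as a lookup of rules flagged somewhere but not everywhere the pattern occurs. The sketch below is hypothetical: `pattern_present_by_file` stands for the adversary's own observation of where a pattern appears, which the skill does not define as a data structure. The results are still only candidates until a detection algorithm is cited.

```python
def cross_file_candidates(flagged_by_file, pattern_present_by_file):
    """List (rule_id, file) pairs where a pattern exists but was not flagged.

    flagged_by_file: file path -> set of rule_ids in that file's consensus.
    pattern_present_by_file: file path -> set of rule_ids whose pattern was
        observed in that file (hypothetical input).
    """
    # Every rule_id that appears in any file's consensus.
    all_flagged = set().union(*flagged_by_file.values())
    result = []
    for path in sorted(pattern_present_by_file):
        present = pattern_present_by_file[path]
        missing = (present & all_flagged) - flagged_by_file.get(path, set())
        result.extend((rule_id, path) for rule_id in sorted(missing))
    return result


flagged = {"ATest.php": {"CONV-004"}, "BTest.php": set()}
present = {"ATest.php": {"CONV-004"}, "BTest.php": {"CONV-004", "UNIT-003"}}
inconsistent = cross_file_candidates(flagged, present)
```

`UNIT-003` is ignored here because no file's consensus contains it, so it is not a cross-file inconsistency.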
Load output-format.md. Assemble the structured output:
CHALLENGES_RAISED if any challenges, resurrections, new findings, or cross-file inconsistencies
NO_CHALLENGES if only endorsements
FAILED if input validation or processing failed
status: CHALLENGES_RAISED | NO_CHALLENGES | FAILED
files:
  - file_path: tests/unit/Path/To/ClassTest.php
    challenges_to_consensus:
      - rule_id: CONV-004
        consensus_was: UNANIMOUS | MAJORITY
        challenge: "Detection algorithm requires X but..."
        verdict_sought: overturn | weaken
    resurrections:
      - rule_id: DESIGN-005
        originally_reported_by: reviewer-1
        resurrection_argument: "The concession was premature because..."
        code_evidence: "ClassTest.php:72 — ..."
    new_findings:
      - rule_id: ISOLATION-002
        enforce: must-fix
        location: ClassTest.php:88
        summary: "Description"
        current: |
          # code
        suggested: |
          # fix
        detection_algorithm_citation: "ISOLATION-002 specifies..."
    endorsements:
      - rule_id: UNIT-003
        reason: "Strong finding, correctly applied"
    cross_file_inconsistencies:
      - rule_id: CONV-004
        this_file_status: accepted
        other_file: tests/unit/Other/ClassTest.php
        other_file_status: flagged
        inconsistency: "Same pattern, divergent treatment"
reason: null # explanation if FAILED
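The status rules can be sketched as a small function over the per-file sections of the report. This is a minimal illustration in Python, assuming each file entry is a dictionary keyed by the section names shown in the output format. Returning `NO_CHALLENGES` when a report has no entries at all is an assumption of this sketch, since the rules only name the endorsements-only case.

```python
CHALLENGE_SECTIONS = (
    "challenges_to_consensus",
    "resurrections",
    "new_findings",
    "cross_file_inconsistencies",
)


def determine_status(files, failed=False):
    """Derive the report status from the per-file sections."""
    if failed:
        return "FAILED"
    for entry in files:
        # Any non-empty challenge-type section raises the status.
        if any(entry.get(section) for section in CHALLENGE_SECTIONS):
            return "CHALLENGES_RAISED"
    # Assumption: endorsements only, or nothing at all, means no challenges.
    return "NO_CHALLENGES"


only_endorsed = [{"endorsements": [{"rule_id": "UNIT-003"}]}]
with_resurrection = [{"resurrections": [{"rule_id": "DESIGN-005"}]}]
```

With these inputs, `determine_status(only_endorsed)` yields `NO_CHALLENGES` and `determine_status(with_resurrection)` yields `CHALLENGES_RAISED`.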
If the test file or source class cannot be read:
If mcp__plugin_test-writing_test-rules__list_rules or mcp__plugin_test-writing_test-rules__get_rules are unavailable:
If Phase 4 drops all candidates (none substantiated by detection algorithms):