Adversarial evaluation protocol for finding flaws in tests, prompts, and code reviews. Activates when validating test quality, auditing prompts, reviewing architecture, or running convergence loops. Enforces multi-round evaluation with 3-round max, mutation testing concepts, and structured issue tracking.
A protocol for systematically finding flaws in work products (tests, prompts, code reviews, architecture documents) through structured adversarial review. The core insight: the person who writes code is poorly positioned to find its flaws. A separate evaluator with an adversarial mindset catches what the author misses.
Work products have predictable blind spots:
| Work Product | Common Blind Spot |
|---|---|
| Tests | Testing the mock instead of the code. Asserting `is not None` instead of specific values. |
| Prompts | Missing negative instructions (what NOT to do). Contradictory rules. |
| Code reviews | Surface-level feedback. Missing security issues. Approving because "it works." |
| Architecture | Missing failure modes. Ignoring edge cases. Optimistic assumptions. |
A single pass by the author typically catches 60-70% of issues. Adversarial evaluation catches the rest.
Agent A: Creates the work product (tests, prompt, code)
Agent B: Evaluates the work product adversarially (DIFFERENT session)
Agent A/C: Fixes the issues Agent B found
Agent B/D: Re-evaluates ONLY the fixes
Repeat until converged or 3 rounds reached
Agent A produces the work product and states explicit claims about it (for example, "every error path is tested"). These claims become the target for adversarial evaluation.
Agent B's ONLY job is to find flaws. Agent B works in a separate session from Agent A and must report findings in this format:
# Adversarial Eval: <Subject> - Round N
## Issues Found
### 1. [CRITICAL] Test `test_create_user` tests the mock, not the code
File: tests/test_users.py:42
Issue: The test mocks `UserService.create` and asserts the mock was called.
If `UserService.create` is deleted, this test still passes.
Impact: Zero test coverage for user creation logic.
Fix: Remove the mock. Call the real `create_user()` with test inputs.
Assert specific properties of the returned user object.
### 2. [HIGH] Missing error case for duplicate email
File: tests/test_users.py
Issue: No test for creating a user with an email that already exists.
Impact: Duplicate email handling is untested. Could silently overwrite users.
Fix: Add test_create_user_with_duplicate_email_raises_conflict_error.
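A fix for the two issues above might look like the following sketch. `UserService`, `ConflictError`, and both tests are hypothetical stand-ins, not code from the project under review:

```python
# Hypothetical fix: exercise a real service (no mock) and assert specific
# values, plus a dedicated test for the duplicate-email error case.
class ConflictError(Exception):
    pass

class UserService:
    def __init__(self):
        self._users = {}

    def create_user(self, email, role="member"):
        if email in self._users:
            raise ConflictError(email)
        user = {"email": email, "role": role}
        self._users[email] = user
        return user

def test_create_user_returns_specific_fields():
    service = UserService()
    user = service.create_user("test@example.com", role="admin")
    assert user["email"] == "test@example.com"  # specific value, not `is not None`
    assert user["role"] == "admin"

def test_create_user_with_duplicate_email_raises_conflict_error():
    service = UserService()
    service.create_user("test@example.com")
    try:
        service.create_user("test@example.com")
        assert False, "expected ConflictError"
    except ConflictError:
        pass  # duplicate email is rejected, not silently overwritten
```

Note that both tests call the real `create_user` and would fail if it were deleted, which is exactly what the deletion check below demands.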
Fix ONLY the issues Agent B found.
Re-evaluate ONLY the fixes.
Converged:
- All issues from round N are resolved
- Round N+1 found 0 new issues
Not converged:
- Repeat steps 3-4
Max rounds reached (3):
- Escalate to user with remaining issues list
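The loop and its termination conditions can be sketched in Python. `evaluate` and `fix` are placeholder callables standing in for the evaluator and author agents, not real APIs:

```python
# Sketch of the convergence loop: evaluate, fix, re-evaluate,
# until no new issues or the 3-round cap is hit.
MAX_ROUNDS = 3

def run_adversarial_eval(subject, evaluate, fix):
    # Round 1: Agent B evaluates the initial work product.
    issues = evaluate(subject, round_no=1)
    for round_no in range(2, MAX_ROUNDS + 1):
        if not issues:
            return "converged", []          # previous round found nothing new
        fix(issues)                          # Agent A/C fixes reported issues
        issues = evaluate(subject, round_no) # Agent B/D re-checks only the fixes
    if issues:
        return "escalated", issues           # max rounds reached, issues remain
    return "converged", []
```

Escalation hands the remaining issues list to the user rather than looping forever.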
For EACH test in the codebase, verify these six properties:
If I delete the implementation being tested, does this test fail?
If the test still passes after deleting the implementation, it is testing a mock, not real code. This is the most common and most dangerous test anti-pattern.
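The deletion check can be performed mechanically: stub out the implementation and confirm the test now fails. A minimal sketch, where `create_user` and its test are hypothetical:

```python
# Deletion check: "delete" the implementation by replacing it with a stub
# that raises, then see whether the test notices.
def create_user(email):
    return {"email": email}

def test_create_user():
    user = create_user("a@example.com")
    assert user["email"] == "a@example.com"

test_create_user()  # passes against the real implementation

def _deleted(*args, **kwargs):
    raise NotImplementedError("implementation deleted")

real_impl, create_user = create_user, _deleted  # simulate deletion
try:
    test_create_user()
    survives_deletion = True   # bad: the test never touched real code
except NotImplementedError:
    survives_deletion = False  # good: the test exercises the implementation
finally:
    create_user = real_impl    # restore

print("survives deletion:", survives_deletion)  # prints: survives deletion: False
```

A mock-based test would print `True` here, because the mock keeps answering even after the real code is gone.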
If the function returns garbage data, does this test catch it?
A test that asserts `is not None` or `is True` will pass for almost any output. Tests must assert specific, meaningful values.
# Fails the WRONG OUTPUT check -- passes for any non-None return
result = service.get_items(org_id)
assert result is not None
# Passes the WRONG OUTPUT check -- catches garbage data
result = service.get_items(org_id)
assert len(result) == 3
assert result[0].name == "Test Item"
assert result[0].status == "ACTIVE"
Is the test exercising real code, or just verifying mock setup?
# FAILS: tests the mock, not the classifier
classifier = MagicMock()
classifier.classify.return_value = Result(score=0.9)
result = classifier.classify("input")
assert result.score == 0.9 # Of course it is -- you told it to return 0.9
# PASSES: tests the real classifier
classifier = Classifier()
result = classifier.classify("multi-step complex query")
assert result.score > 0.7
assert result.needs_planning is True
Does the test assert specific, meaningful values -- not just existence?
# VAGUE: passes for almost anything
assert response.status_code == 200
assert response.json() is not None
# SPECIFIC: verifies actual behavior
assert response.status_code == 200
assert response.json()["user"]["email"] == "test@example.com"
assert response.json()["user"]["role"] == "admin"
assert "password" not in response.json()["user"]
Are error conditions, boundary values, and invalid inputs tested?
Every function needs tests for:
- Invalid inputs (malformed data, wrong types)
- Boundary values (empty, zero, minimum, maximum)
- Error conditions (expected exceptions or error codes)
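A sketch of edge-case coverage for a hypothetical `parse_age` helper, with one test per category: happy path, boundaries, and invalid inputs:

```python
# Hypothetical helper: parses and range-checks an age value.
def parse_age(value):
    age = int(value)  # raises ValueError on malformed input
    if not 0 <= age <= 150:
        raise ValueError(f"age out of range: {age}")
    return age

def test_parse_age_valid():
    assert parse_age("42") == 42

def test_parse_age_boundaries():
    assert parse_age("0") == 0      # lower bound
    assert parse_age("150") == 150  # upper bound

def test_parse_age_invalid_input():
    for bad in ["", "abc", "-1", "151"]:
        try:
            parse_age(bad)
            assert False, f"expected ValueError for {bad!r}"
        except ValueError:
            pass  # malformed and out-of-range inputs are rejected
```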
Does this test depend on other tests running first?
Tests must not share mutable state. Each test should set up its own preconditions and clean up after itself. Tests should be runnable in any order, individually or in parallel.
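A sketch of order-independent tests: each test builds a fresh service rather than sharing a module-level instance. `UserService` is a hypothetical in-memory implementation, and `fresh_service` is a plain helper (with pytest, it would be a `@pytest.fixture`):

```python
# Each test gets its own UserService, so no test observes another's state.
class UserService:
    def __init__(self):
        self.users = {}

    def create_user(self, email):
        self.users[email] = {"email": email}
        return self.users[email]

def fresh_service():
    return UserService()  # new instance per test: no shared mutable state

def test_create_user():
    service = fresh_service()
    service.create_user("a@example.com")
    assert "a@example.com" in service.users

def test_user_count_starts_at_zero():
    service = fresh_service()
    assert len(service.users) == 0  # holds regardless of test order
```

Because neither test mutates shared state, they pass in any order, individually, or in parallel.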
Mutation testing is the gold standard for test quality validation. The idea: introduce small deliberate defects (mutants) into the implementation and verify that at least one test fails for each mutant; a surviving mutant reveals a coverage gap.
| Mutation | Example | What It Tests |
|---|---|---|
| Negate condition | `if x > 0` becomes `if x <= 0` | Branch coverage |
| Remove statement | Delete a function call | Side effect detection |
| Change operator | `+` becomes `-`, `==` becomes `!=` | Arithmetic/comparison logic |
| Return default | Return `None`/`[]`/`0` instead of computed value | Output verification |
| Swap arguments | `func(a, b)` becomes `func(b, a)` | Parameter order sensitivity |
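The table above can be exercised with a minimal, hand-rolled mutation check. Real projects would use a dedicated tool such as mutmut; `is_positive` and both tests below are hypothetical:

```python
# Hand-rolled mutation check: mutate a tiny function's source, reload it,
# and see whether each test "kills" the mutant (i.e., fails on it).
import types

SOURCE = "def is_positive(x):\n    return x > 0\n"

MUTATIONS = [
    ("x > 0", "x <= 0"),  # negate condition
    ("x > 0", "True"),    # return a default instead of the computed value
]

def load(source):
    mod = types.ModuleType("target")
    exec(source, mod.__dict__)
    return mod

def weak_test(mod):
    # Vague test: only checks the return type, so mutants survive.
    return isinstance(mod.is_positive(5), bool)

def strong_test(mod):
    # Specific test: checks behavior on both sides of the boundary.
    return mod.is_positive(5) is True and mod.is_positive(-5) is False

def killed(test, mutated_source):
    try:
        return not test(load(mutated_source))  # mutant killed if test fails
    except Exception:
        return True

for old, new in MUTATIONS:
    mutant = SOURCE.replace(old, new)
    print(f"{old!r} -> {new!r} | weak kills: {killed(weak_test, mutant)}"
          f" | strong kills: {killed(strong_test, mutant)}")
```

The weak, type-only test lets every mutant survive; the specific test kills both, which is the difference mutation testing makes visible.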
For system prompts, agent prompts, or prompt templates, apply the same protocol, targeting the blind spots above: contradictory rules and missing negative instructions (what NOT to do).
Apply the same protocol to architecture documents and design decisions:
For each component in the architecture, ask: what are its failure modes, which edge cases does it ignore, and which assumptions are optimistic?
vault/evals/
YYYY-MM-DD-<subject>-round-1.md
YYYY-MM-DD-<subject>-round-2.md
YYYY-MM-DD-<subject>-round-3.md
_convergence-log.md
Track which evaluations converged and which were escalated:
# Convergence Log
| Date | Subject | Rounds | Result | Notes |
|------|---------|--------|--------|-------|
| 2026-03-16 | test-quality | 2 | Converged | All issues resolved in round 2 |
| 2026-03-18 | prompt-audit | 3 | Escalated | 1 CRITICAL remaining, needs user input |
| 2026-03-20 | arch-review | 2 | Converged | Added failure mode documentation |
Issue descriptions must be specific enough to act on: "the tests are weak" is not actionable; "`test_create_user` asserts `is not None` instead of checking the actual email value" is.