Performs on-demand adversarial quality reviews of deliverables using criticality-based (C1-C4) strategy templates, executes templates, and scores quality via LLM-as-Judge rubric. Integrates with quality-enforcement SSOT.
Install via `npx claudepluginhub geekatron/jerry --plugin jerry`.
> **Version:** 1.0.0
Skill contents:

- PLAYBOOK.md
- agents/adv-executor.governance.yaml
- agents/adv-executor.md
- agents/adv-scorer.governance.yaml
- agents/adv-scorer.md
- agents/adv-selector.governance.yaml
- agents/adv-selector.md
- composition/adv-executor.agent.yaml
- composition/adv-executor.prompt.md
- composition/adv-scorer.agent.yaml
- composition/adv-scorer.prompt.md
- composition/adv-selector.agent.yaml
- composition/adv-selector.prompt.md
**Version:** 1.0.0
**Framework:** Jerry Adversarial Quality (ADV)
**Constitutional Compliance:** Jerry Constitution v1.0
**SSOT Reference:** `.context/rules/quality-enforcement.md`
This SKILL.md serves multiple audiences:
| Level | Audience | Sections to Focus On |
|---|---|---|
| L0 (ELI5) | New users, stakeholders | Purpose, When to Use, Routing Disambiguation, When to Use /adversary vs ps-critic, Quick Reference |
| L1 (Engineer) | Developers invoking agents | Invoking an Agent, Available Agents, Dependencies, Adversarial Quality Mode |
| L2 (Architect) | Workflow designers | P-003 Compliance, H-14 Integration, Constitutional Compliance, Strategy Templates |
The Adversary skill provides on-demand adversarial quality reviews using strategy templates from the Jerry quality framework. Unlike the problem-solving skill's integrated adversarial mode (which operates within creator-critic loops), the adversary skill is invoked explicitly when you need a standalone adversarial assessment of any deliverable.
Strategy templates are loaded from `.context/templates/adversarial/`.

Activate when:
NEVER invoke this skill when:
- Iterative creator-critic-revision loops are needed -- use /problem-solving with ps-critic instead
- Root-cause analysis is needed -- use /problem-solving instead

See Routing Disambiguation for full exclusion conditions with consequences.
Note: Use `/adversary` for adversarial code review (e.g., red team security review, tournament quality assessment of code artifacts). Use `ps-reviewer` for routine defect detection.
The adv-scorer and ps-critic agents share the same S-014 LLM-as-Judge rubric and 6-dimension weighted composite scoring methodology. They serve different workflow positions:
| Aspect | adv-scorer | ps-critic |
|---|---|---|
| Workflow | Standalone/on-demand scoring | Embedded in creator-critic-revision loops |
| Output | Focused score report with L0 summary | L0/L1/L2 multi-level critique report |
| Iteration | May be invoked once or re-invoked for re-scoring | Iterates within the H-14 cycle |
| Invocation | Via /adversary skill | Via /problem-solving skill |
Both agents produce comparable scores from the same rubric; the choice depends on whether you need standalone assessment (adv-scorer) or iterative critique with revision guidance (ps-critic).
| Agent | Role | Model | Output Location |
|---|---|---|---|
| `adv-selector` | Strategy Selector - Maps criticality to strategy sets | haiku | Strategy selection plan |
| `adv-executor` | Strategy Executor - Runs strategy templates against deliverables | sonnet | Strategy execution reports |
| `adv-scorer` | Quality Scorer - LLM-as-Judge rubric scoring | sonnet | Quality score reports |
All adversary agents are workers, NOT orchestrators. The MAIN CONTEXT (Claude session) orchestrates the workflow.
P-003 AGENT HIERARCHY:
======================
+-------------------+
| MAIN CONTEXT | <-- Orchestrator (Claude session)
| (orchestrator) |
+-------------------+
| | |
v v v
+------+ +------+ +------+
| adv- | | adv- | | adv- | <-- Workers (max 1 level)
|select| |exec | |scorer|
+------+ +------+ +------+
Agents CANNOT invoke other agents.
Agents CANNOT spawn subagents.
Only MAIN CONTEXT orchestrates the sequence.
Simply describe what you need:
"Run an adversarial review of this ADR at C3 criticality"
"Score this deliverable with LLM-as-Judge"
"What strategies should I apply for a C2 review?"
"Run Devil's Advocate and Steelman on this design document"
"Execute a full C4 tournament review on the architecture proposal"
The orchestrator will select the appropriate agent(s) based on keywords and context.
Request a specific agent:
"Use adv-selector to pick strategies for C3 criticality"
"Have adv-executor run S-002 Devil's Advocate on the ADR"
"I need adv-scorer to produce a quality score for this synthesis"
For programmatic invocation within workflows:
```python
Task(
    description="adv-selector: Strategy selection for C3",
    subagent_type="general-purpose",
    prompt="""
You are the adv-selector agent (v1.0.0).

## ADV CONTEXT (REQUIRED)
- **Criticality Level:** C3
- **Deliverable Type:** Architecture Decision Record
- **Deliverable Path:** docs/decisions/adr-042-persistence.md

## MANDATORY PERSISTENCE (P-002)
Create file at: {output_path}

## TASK
Select the strategy set for C3 criticality per SSOT.
"""
)
```
The adversary skill depends on external artifacts created by other enablers. These MUST be in place before the skill is fully operational.
All 10 strategy templates in .context/templates/adversarial/ are created by separate enablers:
| Template | Source Enabler | Status |
|---|---|---|
| `s-001-red-team.md` | EN-809 | Created by EN-809 |
| `s-002-devils-advocate.md` | EN-806 | Created by EN-806 |
| `s-003-steelman.md` | EN-807 | Created by EN-807 |
| `s-004-pre-mortem.md` | EN-808 | Created by EN-808 |
| `s-007-constitutional-ai.md` | EN-805 | Created by EN-805 |
| `s-010-self-refine.md` | EN-804 | Created by EN-804 |
| `s-011-cove.md` | EN-809 | Created by EN-809 |
| `s-012-fmea.md` | EN-808 | Created by EN-808 |
| `s-013-inversion.md` | EN-808 | Created by EN-808 |
| `s-014-llm-as-judge.md` | EN-803 | Created by EN-803 |
Naming Convention: Templates follow the pattern s-{NNN}-{slug}.md where {NNN} is the strategy ID from the quality-enforcement SSOT and {slug} is a hyphenated descriptor (e.g., s-002-devils-advocate.md). These are static reference documents versioned alongside the codebase — they are not dynamically generated.
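As a sketch, the naming convention can be checked mechanically. The regex and helper name below are illustrative, not part of the skill:

```python
import re

# Validates template filenames against the documented pattern
# s-{NNN}-{slug}.md, where NNN is the strategy ID and slug is a
# hyphenated lowercase descriptor.
TEMPLATE_RE = re.compile(r"^s-(\d{3})-([a-z0-9]+(?:-[a-z0-9]+)*)\.md$")

def parse_template_name(filename: str):
    """Return (strategy_id, slug) or None if the name does not conform."""
    m = TEMPLATE_RE.match(filename)
    if not m:
        return None
    return f"S-{m.group(1)}", m.group(2)

print(parse_template_name("s-002-devils-advocate.md"))  # ('S-002', 'devils-advocate')
print(parse_template_name("devils-advocate.md"))        # None
```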
Fallback behavior: If a template file is not found, adv-executor MUST:
The skill skeleton (EN-802) defines the structure; the template enablers populate the content.
`.context/rules/quality-enforcement.md` -- MUST exist. All thresholds, strategy IDs, criticality levels, and quality dimensions are sourced from here. NEVER hardcode values; always reference the SSOT.
The quality framework provides 10 selected adversarial strategies across 4 mechanistic families. See .context/rules/quality-enforcement.md (Strategy Catalog section) for the authoritative list.
| Family | Strategies | Adversary Application |
|---|---|---|
| Iterative Self-Correction | S-014 (LLM-as-Judge), S-007 (Constitutional AI Critique), S-010 (Self-Refine) | Quality scoring, constitutional compliance, self-review |
| Dialectical Synthesis | S-003 (Steelman Technique) | Strengthen arguments before critique (H-16 REQUIRED) |
| Role-Based Adversarialism | S-002 (Devil's Advocate), S-004 (Pre-Mortem Analysis), S-001 (Red Team Analysis) | Challenge assumptions, anticipate failures, adversarial exploration |
| Structured Decomposition | S-013 (Inversion Technique), S-012 (FMEA), S-011 (Chain-of-Verification) | Systematic failure mode analysis, verification chains |
All strategies use standardized templates from .context/templates/adversarial/:
| Template | Strategy | Purpose |
|---|---|---|
| `s-001-red-team.md` | S-001 Red Team Analysis | Adversarial exploration of attack surfaces |
| `s-002-devils-advocate.md` | S-002 Devil's Advocate | Challenge assumptions and key claims |
| `s-003-steelman.md` | S-003 Steelman Technique | Strengthen the best version of the argument |
| `s-004-pre-mortem.md` | S-004 Pre-Mortem Analysis | Anticipate failure modes |
| `s-007-constitutional-ai.md` | S-007 Constitutional AI Critique | Constitutional compliance verification |
| `s-010-self-refine.md` | S-010 Self-Refine | Iterative self-improvement |
| `s-011-cove.md` | S-011 Chain-of-Verification | Systematic claim verification |
| `s-012-fmea.md` | S-012 FMEA | Failure Mode and Effects Analysis |
| `s-013-inversion.md` | S-013 Inversion Technique | Invert key claims to find blind spots |
| `s-014-llm-as-judge.md` | S-014 LLM-as-Judge | Rubric-based quality scoring |
Per SSOT, strategy activation follows criticality levels:
| Level | Required Strategies | Optional Strategies |
|---|---|---|
| C1 (Routine) | S-010 | S-003, S-014 |
| C2 (Standard) | S-007, S-002, S-014 | S-003, S-010 |
| C3 (Significant) | C2 + S-004, S-012, S-013 | S-001, S-003, S-010, S-011 |
| C4 (Critical) | All 10 selected strategies | None (all required) |
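The activation table can be transcribed as a lookup. The values below are copied from the table; per the SSOT rule, agents should read them from `.context/rules/quality-enforcement.md` rather than hardcode them, so this sketch is illustration only:

```python
# Strategy activation map transcribed from the table above.
# Illustration only -- the SSOT is the authoritative source.
REQUIRED = {
    "C1": ["S-010"],
    "C2": ["S-007", "S-002", "S-014"],
    "C3": ["S-007", "S-002", "S-014", "S-004", "S-012", "S-013"],  # C2 + additions
    "C4": ["S-001", "S-002", "S-003", "S-004", "S-007",
           "S-010", "S-011", "S-012", "S-013", "S-014"],  # all 10
}
OPTIONAL = {
    "C1": ["S-003", "S-014"],
    "C2": ["S-003", "S-010"],
    "C3": ["S-001", "S-003", "S-010", "S-011"],
    "C4": [],  # none optional -- all required
}

def strategies_for(level: str, include_optional: bool = False) -> list[str]:
    """Resolve the strategy set for a criticality level."""
    chosen = list(REQUIRED[level])
    if include_optional:
        chosen += [s for s in OPTIONAL[level] if s not in chosen]
    return chosen
```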
HARD rule: S-003 (Steelman) MUST be applied before S-002 (Devil's Advocate). Always strengthen the argument before challenging it.
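A minimal guard for this ordering rule might look like the following (the helper name is hypothetical):

```python
# Checks the ordering rule above: if both S-003 (Steelman) and S-002
# (Devil's Advocate) appear in an execution sequence, S-003 must come first.

def h16_ordering_ok(sequence: list[str]) -> bool:
    """True if S-003 precedes S-002 whenever both are present."""
    if "S-003" in sequence and "S-002" in sequence:
        return sequence.index("S-003") < sequence.index("S-002")
    return True  # rule only applies when both strategies are selected

assert h16_ordering_ok(["S-003", "S-002", "S-014"])
assert not h16_ordering_ok(["S-002", "S-003"])
```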
The SSOT defines 6 quality dimensions with weights:
| Dimension | Weight |
|---|---|
| Completeness | 0.20 |
| Internal Consistency | 0.20 |
| Methodological Rigor | 0.20 |
| Evidence Quality | 0.15 |
| Actionability | 0.15 |
| Traceability | 0.10 |
Threshold: >= 0.92 weighted composite for C2+ deliverables (H-13)
Leniency bias counteraction: Score strictly against rubric criteria. When uncertain between adjacent scores, choose the lower one.
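As a worked example, here is a composite computed with these weights; the per-dimension scores are invented for illustration:

```python
# Dimension weights from the table above (SSOT: quality-enforcement.md).
WEIGHTS = {
    "completeness": 0.20,
    "internal_consistency": 0.20,
    "methodological_rigor": 0.20,
    "evidence_quality": 0.15,
    "actionability": 0.15,
    "traceability": 0.10,
}

def composite(scores: dict[str, float]) -> float:
    """Weighted composite score across the 6 SSOT dimensions."""
    return sum(WEIGHTS[d] * scores[d] for d in WEIGHTS)

# Invented per-dimension scores for illustration.
scores = {
    "completeness": 0.95,
    "internal_consistency": 0.90,
    "methodological_rigor": 0.95,
    "evidence_quality": 0.90,
    "actionability": 0.95,
    "traceability": 0.90,
}
result = composite(scores)
print(round(result, 4))  # about 0.9275 -- clears the >= 0.92 H-13 threshold
```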
Both /adversary and ps-critic apply adversarial strategies, but serve different workflow positions:
| Aspect | /adversary Skill | ps-critic Agent |
|---|---|---|
| Use Case | Standalone adversarial reviews, tournament scoring, strategy template execution | Embedded quality critique within creator-critic-revision loops |
| Invocation | Explicit on-demand (/adversary or natural language request) | Invoked by orchestrator within H-14 cycle |
| Output Focus | Strategy-specific findings (adv-executor) + quality score (adv-scorer) | L0/L1/L2 multi-level critique with dimension-level improvement guidance |
| Iteration | May be used once or re-invoked for re-scoring after revision | Iterates within the H-14 minimum 3-iteration cycle |
| Strategy Coverage | Full strategy set per criticality (C1-C4), including tournament mode (all 10) | Applies strategies appropriate to criticality, embedded in workflow |
| Agents | adv-selector, adv-executor, adv-scorer | ps-critic (single agent) |
| Output Artifacts | Strategy execution reports + quality score report | Critique report with improvement recommendations |
When to Use Each:
Use /adversary when:
Use ps-critic when:
- Working within the /problem-solving or /orchestration skills

Complementary Use:
Both can work together:
- ps-critic applies strategies within workflows for embedded quality cycles
- /adversary orchestrates cross-strategy tournament reviews for C4 critical deliverables
- ps-critic uses the same S-014 rubric and dimension scoring as adv-scorer for consistency

Tournament mode executes all 10 adversarial strategies against a C4 (Critical) deliverable in a deterministic sequence. This is the most comprehensive review level, required for irreversible decisions.
All 10 strategies run in the recommended order from skills/adversary/agents/adv-selector.md:
Findings from all strategy execution reports are collected across all 9 executor runs. The adv-scorer agent (S-014) receives these aggregated findings as input evidence when producing the final composite score. Critical findings from any strategy block PASS regardless of score.
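The PASS gate described here can be sketched as follows; the data shapes and function name are illustrative:

```python
# PASS gate for C2+ deliverables: the composite must meet the 0.92
# threshold AND no strategy may have reported a critical finding.

def verdict(composite_score: float, findings: list[dict]) -> str:
    """Aggregate tournament results into a PASS/FAIL verdict."""
    has_critical = any(f["severity"] == "critical" for f in findings)
    if has_critical:
        return "FAIL"  # critical findings block PASS regardless of score
    return "PASS" if composite_score >= 0.92 else "FAIL"

findings = [
    {"strategy": "S-002", "severity": "major"},
    {"strategy": "S-012", "severity": "critical"},
]
print(verdict(0.95, findings))       # FAIL: critical finding overrides the score
print(verdict(0.95, findings[:1]))   # PASS
```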
A C4 tournament with all 10 strategies requires approximately 11 agent invocations:
Typical duration depends on deliverable size and complexity. Expect longer processing times for large architecture documents or governance changes.
H-14 mandates a minimum 3-iteration creator-critic-revision cycle for C2+ deliverables. The adversary skill is not a revision loop manager -- it provides standalone adversarial assessment. The integration boundary is:
- Re-invoke adv-scorer after each revision (passing the `Prior Score` context field) to track improvement across iterations.
- The orchestrator (typically the `/orchestration` skill) tracks the iteration count per H-14.

Workflow position: The adversary skill sits at the "critic" position within the H-14 cycle when used for quality scoring. It can replace or complement ps-critic depending on whether the orchestrator needs standalone scoring (adv-scorer) or iterative critique (ps-critic).
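The boundary above can be illustrated with an orchestrator-side loop. All names here are hypothetical; `score_fn` stands in for an adv-scorer dispatch:

```python
# Illustrative orchestrator-side H-14 loop: the orchestrator (not the
# adversary skill) owns the iteration count, re-invoking scoring with the
# prior score after each revision.

def h14_loop(score_fn, min_iterations: int = 3, threshold: float = 0.92,
             max_iterations: int = 6) -> list[float]:
    """Run creator-critic-revision cycles; return the score history."""
    history: list[float] = []
    for i in range(max_iterations):
        prior = history[-1] if history else None
        history.append(score_fn(iteration=i + 1, prior_score=prior))
        # Stop only after the H-14 minimum AND once the threshold is met.
        if len(history) >= min_iterations and history[-1] >= threshold:
            break
    return history

# Simulated scorer that improves each round (illustration only).
scores = iter([0.80, 0.88, 0.93])
history = h14_loop(lambda iteration, prior_score: next(scores))
print(history)  # [0.8, 0.88, 0.93] -- stops after the minimum 3 iterations
```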
All agents adhere to the Jerry Constitution v1.0:
| Principle | Requirement | Consequence of Violation |
|---|---|---|
| P-003 | NEVER spawn recursive subagents -- max 1 level | Agent hierarchy violation; uncontrolled token consumption |
| P-020 | NEVER override user intent -- ask before destructive ops | Unauthorized action; trust erosion |
| P-022 | NEVER deceive about actions, capabilities, or confidence | Governance undermined; quality assessment invalidated |
| P-001 | NEVER present findings without evidence or rubric-based scoring | Unreliable outputs; unfounded claims propagate downstream |
| P-002 | NEVER leave outputs in transient context only -- persist to files | Context rot vulnerability; artifacts lost on session compaction |
| P-004 | NEVER omit strategy IDs, template paths, or evidence citations | Untraceable decisions; audit trail broken |
| P-011 | NEVER make findings without tying them to specific deliverable evidence | Unsupported recommendations; confidence inflated without basis |
| Need | Agent | Command Example |
|---|---|---|
| Pick strategies for criticality | adv-selector | "What strategies for C3 review?" |
| Run a specific strategy | adv-executor | "Run S-002 Devil's Advocate on this ADR" |
| Score deliverable quality | adv-scorer | "Score this deliverable with LLM-as-Judge" |
| Steelman + Devil's Advocate pair | adv-executor | "Run Steelman then Devil's Advocate on this design" |
| Full C4 tournament | All three | "Run full C4 tournament review" |
| Keywords | Likely Agent |
|---|---|
| select, pick, which strategies, criticality, C1/C2/C3/C4 | adv-selector |
| run, execute, apply, template, strategy, findings | adv-executor |
| score, judge, rubric, dimensions, threshold, 0.92 | adv-scorer |
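As an illustration only (actual routing is performed by the orchestrating Claude session, not code), the keyword table maps to a naive matcher:

```python
# Naive keyword router transcribed from the table above; real routing is
# done by the orchestrator, so this is illustration only.
KEYWORDS = {
    "adv-selector": ["select", "pick", "which strategies", "criticality",
                     "c1", "c2", "c3", "c4"],
    "adv-executor": ["run", "execute", "apply", "template", "strategy",
                     "findings"],
    "adv-scorer": ["score", "judge", "rubric", "dimensions", "threshold",
                   "0.92"],
}

def route(request: str):
    """Return the first agent whose keywords match the request, else None."""
    text = request.lower()
    for agent, words in KEYWORDS.items():
        if any(w in text for w in words):
            return agent
    return None

print(route("What strategies for C3 review?"))             # adv-selector
print(route("Score this deliverable with LLM-as-Judge"))   # adv-scorer
```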
When this skill is the wrong choice and what happens if misrouted.
| Condition | Use Instead | Consequence of Misrouting |
|---|---|---|
| Iterative creator-critic-revision loop needed | /problem-solving (ps-critic) | Adversarial one-shot assessment applied to iterative work produces premature rejection without revision pathway; ps-critic operates within H-14 revision cycles while /adversary produces standalone assessments |
| Routine code review for quick defect checks | /problem-solving (ps-reviewer) | Full adversarial strategy template execution (S-001 through S-014) applied to routine defect detection wastes significant context budget on strategy selection and template loading |
| Constraint validation (pass/fail compliance) | /problem-solving (ps-validator) | Adversarial strategies assess quality dimensions, not binary constraint compliance; ps-validator produces traceability matrices while /adversary produces quality scores |
| Research, analysis, or root cause investigation | /problem-solving (ps-researcher or ps-investigator) | Adversarial agents evaluate existing deliverables, not produce new analysis; no research methodology or causal investigation capability |
| Security-hardened software design or threat modeling | /eng-team | /adversary applies quality assessment strategies (S-001 Red Team Analysis is quality-focused); /eng-team provides STRIDE/DREAD threat modeling and OWASP compliance |
| Offensive security testing or penetration testing | /red-team | /adversary "red team" keyword refers to S-001 quality strategy; /red-team provides MITRE ATT&CK kill chain methodology for authorized penetration testing |
| C1 routine work with obvious solutions | Self-review (S-010) only | Full adversarial overhead (adv-selector, adv-executor, adv-scorer) applied to C1 routine tasks consumes disproportionate context budget for low-risk work |
| Source | Content |
|---|---|
| `.context/rules/quality-enforcement.md` | SSOT for thresholds, strategies, criticality levels |
| `.context/templates/adversarial/` | Strategy execution templates |
| `skills/problem-solving/SKILL.md` | Integrated adversarial quality mode (ps-critic) |
| `docs/governance/JERRY_CONSTITUTION.md` | Constitutional principles |
| ADR-EPIC002-001 | Strategy selection and composite scores |
| ADR-EPIC002-002 | 5-layer enforcement architecture |
Skill Version: 1.0.0
Constitutional Compliance: Jerry Constitution v1.0
SSOT: .context/rules/quality-enforcement.md
Created: 2026-02-15