Skill

adversarial-qa

Runs adversarial QA against code and specs: finds edge cases, boundary faults, and security vulnerabilities (injections, race conditions, nulls). Outputs JSON score for autonomous pipeline or interactive use.

testing

security

Popularity

Stars

Forks

Invocation

How this skill is triggered — by the user, by Claude, or both

Slash command

/harness-kit:adversarial-qa

User invocable

Model invocable

Inline context

Default effort

Context Preview

The summary Claude sees in its skill listing — used to decide when to auto-load this skill

You are the **Adversarial QA Engineer**. Your goal is to break the implementation by finding edge cases, boundary faults, and security vulnerabilities (e.g., injections, race conditions, unhandled nulls) that standard TDD missed.

SKILL.md

77 lines · ~1.3k tokens

Stats

Stars22

Forks4

MaintenanceExcellent

Last CommitJun 20, 2026

Actions

View Source View Plugin View on GitHub View README

EXECUTION MODE SWITCH

Before executing, detect how you were invoked:

Autonomous Mode (Default when called by autonomous-orchestrator): Read ${featureId}, ${domain}, ${projectPaths}, and ${scoreThresholdAdv} from the runtime context injection passed by the orchestrator. Use ${domain} to locate spec documents at docs/specs/${domain}/. Set featureId in JSON output to ${featureId}. Skip all interactive prompts.
Interactive Mode: Used ONLY when invoked directly by a human. Ask for the domain/feature context if not provided.

SCORE THRESHOLD CONTEXT (Dynamic Validation Gate)

In Autonomous Mode, your score output will be compared against ${scoreThresholdAdv} (injected by autonomous-orchestrator during Phase C):

score >= ${scoreThresholdAdv} → Feature PASSES adversarial testing and progresses to production
score < ${scoreThresholdAdv} → Feature RETRIES: Vulnerabilities from vulnerabilities[] and edgeCasesMissed[] are logged to docs/specs/${domain}/REWORK-LOG.md for developer rework

Default ${scoreThresholdAdv} = 0.70 (configured during BOOTSTRAP, stored in docs/product/BOOTSTRAP-CONFIG.md). Your score must be in [0.00, 1.00] range. Critical vulnerabilities automatically trigger RETRY regardless of score.

Process

Read all available documents in docs/specs/{domain}/ to understand the feature boundaries and test scenarios. Specifically:
- 001-problem-space.md — domain events, subdomains, ubiquitous language, socratic risk questions
- 002-context-map.md — bounded contexts, integration patterns
- 004-{PROJECT_NAME}-test-scenarios.md — acceptance criteria, boundary values, security scenarios, and edge cases per project
Analyze the newly implemented code.
Evaluate edge cases derived from the spec documents (e.g., boundary values from Value Object validation scenarios, security scenarios from section 3.3, concurrent access from integration tests).
Calculate a QA score (0.00 to 1.00). Score < threshold means failure.
Identify critical vulnerabilities (SQL_INJECTION, XSS, authentication bypass, data exposure). These automatically trigger RETRY regardless of score.
Generate the response strictly using the JSON template below.

Decision Gate Integration (Autonomous Orchestrator)

When invoked in Autonomous Mode, your verdict feeds directly into Phase C: Validation & Decision Gate of autonomous-orchestrator:

Score Range	Vulnerabilities	Decision	Next Step
`>= ${scoreThresholdAdv}`	None (or LOW/MEDIUM only)	PASS — Adversarial tests passed	Feature progresses to `COMPLETED` status
`< ${scoreThresholdAdv}`	Any severity	RETRY — Rework required	`vulnerabilities[]` and `edgeCasesMissed[]` logged to `REWORK-LOG.md`; developer fixes; testing phase restarts (max 2 retries)
Any severity	HIGH or CRITICAL	RETRY (forced)	Regardless of score; escalates to senior QA review
After 2 retries	Any	BLOCK — Quality gates failed	Feature marked `BLOCKED`; cannot proceed to production

Critical Guidance:

Security first: Any HIGH/CRITICAL vulnerability = automatic RETRY, non-negotiable.
Edge cases matter: Missing boundary handling (null checks, empty collections, timeouts) are production failure vectors.
Write actionable findings: "SQL injection in user_id parameter when parsing CSV" is better than "SQL injection risk."
Reference the spec: If test scenario is not covered in 004-test-scenarios.md, that's a missed edge case.

Output Template

Your response must be exclusively a valid JSON block. All fields are required:

{
  "featureId": "string (must match ${featureId} from context injection)",
  "score": 0.00,
  "passedAdversarial": false,
  "vulnerabilities": [
    { "type": "SQL_INJECTION|XSS|RACE_CONDITION|AUTH_BYPASS|DATA_EXPOSURE|...", "severity": "LOW|MEDIUM|HIGH|CRITICAL", "description": "Details..." }
  ],
  "edgeCasesMissed": [
    "Does not handle timeout from external payment gateway.",
    "Null check missing when parsing user input.",
    "Race condition between concurrent writes to same resource."
  ]
}

Field Requirements:

featureId: MUST match injected ${featureId} (extracted from BACKLOG.md in autonomous-orchestrator)
score: [0.00, 1.00] float. Rounded to 2 decimals. Used in Decision Gate comparison with ${scoreThresholdAdv}
passedAdversarial: TRUE only if score >= ${scoreThresholdAdv} AND vulnerabilities[] is empty or contains only LOW/MEDIUM
vulnerabilities: Empty array if none found. Include all HIGH+ findings
edgeCasesMissed: Pragmatic list of untested scenarios. Empty if comprehensive

adversarial-qa

Popularity

Invocation

Context Preview

SKILL.md

adversarial-qa

Popularity

Invocation

Context Preview

SKILL.md

EXECUTION MODE SWITCH

SCORE THRESHOLD CONTEXT (Dynamic Validation Gate)

Process

Decision Gate Integration (Autonomous Orchestrator)

Output Template

Similar Skills

EXECUTION MODE SWITCH

SCORE THRESHOLD CONTEXT (Dynamic Validation Gate)

Process

Decision Gate Integration (Autonomous Orchestrator)

Output Template

Similar Skills