nw-agent-testing
5-layer testing approach for agent validation including adversarial testing, security validation, and prompt injection resistance
From nwnpx claudepluginhub nwave-ai/nwave --plugin nwThis skill uses the workspace's default tool permissions.
Agent Testing Framework
5-Layer Testing Approach
Layer 1: Output Quality (Unit-Level)
Validate agent produces correct, well-structured outputs for typical inputs.
Test: Agent follows workflow phases | Outputs match expected format/structure | Domain-specific rules correctly applied | Token efficiency within bounds
How: Manual invocation with representative inputs. Check against acceptance criteria in agent description.
Layer 2: Integration / Handoff Validation
Validate correct input/output between agents in workflows.
Test: Input parsing handles upstream format | Output format matches downstream expectations | Error signals propagate correctly | Subagent mode activation works (skip greet, execute autonomously)
How: End-to-end workflow execution through full agent chain (e.g., DISCUSS -> DESIGN -> DELIVER).
Layer 3: Adversarial Output Validation
Challenge validity of agent outputs rather than accepting at face value.
Test: Source verification (cited sources real and accurate?) | Bias detection (favors one approach without evidence?) | Edge case coverage | Completeness (required sections present?)
How: Peer review by -reviewer agent using structured critique dimensions.
Layer 4: Adversarial Verification (Peer Review)
Independent review to catch biases and blind spots in agent design.
Test: Definition follows validation checklist? | Redundant Claude default instructions? | Over/under-specified? | Could simpler agent achieve same results?
How: @nw-agent-builder validates via 11-point checklist or @agent-builder-reviewer runs structured review.
Layer 5: Security Validation
Test resilience against misuse and prompt injection.
Test: Tool restriction enforcement | maxTurns respected | Permission mode correctly scoped | Agent stays within declared scope
How: Frontmatter fields enforce at platform level. Verify configuration.
Prompt Injection Resistance
Claude Code platform provides injection resistance through: subagent isolation (own context, no sub-subagents) | Tool restriction via frontmatter tools | Permission modes via permissionMode | Hook-based validation (PreToolUse, PostToolUse)
Do NOT add prose-based injection defense. Configure platform features:
---
tools: Read, Glob, Grep # Only tools this agent needs
maxTurns: 30 # Prevents runaway execution
permissionMode: default # User approves dangerous actions
---
Security Validation Checklist
-
toolsrestricted to minimum necessary (least privilege) -
maxTurnsset to prevent runaway execution -
permissionModeappropriate for risk level - No
Bashunless agent requires command execution - No
Writeunless agent creates/modifies files - Description accurately describes scope
- Subagent mode handles autonomous execution correctly
- No sensitive data hardcoded in definition
Testing Workflow for New Agents
- Create with minimal definition
- Layer 1: Invoke with 2-3 representative inputs, check outputs
- Layer 2: Run in workflow chain if applicable
- Fix failures observed
- Validate: Run 11-point checklist
- Iterate: Add instructions only for observed failure modes