Mentally executes code, skills, commands, configs line-by-line with concrete values to find bugs, logic errors, edge cases, contract violations, AI hallucinations. Verifies external calls using tools.
From `camoa/claude-skills`. Install: `npx claudepluginhub camoa/claude-skills --plugin code-paper-test`
Systematically test code by mentally executing it line-by-line with concrete values.
| Target Size | Approach | Why |
|---|---|---|
| < 50 lines | Quick trace (Steps 1–7 below) | Fast, inline, sufficient for small code |
| 50–300 lines | Structured 3-phase (below) | One agent, all 3 perspectives, sequential — thorough without coordination overhead |
| 300+ lines or security-critical | /code-paper:test-team (3 agents) | Context pressure justifies splitting. Cross-challenge debate catches what one agent misses. |
| Skill/command/agent files | /code-paper:test-team | Different lenses genuinely find different things for instruction-based testing |
If the user asks for "paper test" without specifying, read the target files, count lines, and recommend the appropriate approach. For 50–300 lines, use the Structured 3-Phase mode below. Only recommend /test-team for 300+ lines, explicit "test team" requests, or security-critical code.
Read references/structured-3-phase.md for the full methodology. It runs Phase A (happy path), Phase B (edge cases, 6 categories), Phase C (adversarial, 5 categories), and Phase D (self-review) sequentially in one agent.
For small code, skip the structured phases. Just trace with concrete values.
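To make the quick trace concrete, here is a minimal sketch (Python for illustration; the `discount` function is hypothetical, invented for this example) showing a small function alongside the trace a paper test would produce:

```python
def discount(price, percent):
    # Apply a percentage discount, clamping the result at zero.
    reduced = price - price * percent / 100
    return max(reduced, 0)

# Quick trace with concrete values (price=80, percent=25):
#   Line 3: reduced = 80 - 80 * 25 / 100 -> reduced = 60.0
#   Line 4: return max(60.0, 0) -> 60.0
print(discount(80, 25))   # 60.0
# Edge case: percent=150 -> reduced = -40.0, clamped to 0
print(discount(80, 150))  # 0
```

Tracing the edge case with a concrete out-of-range value is what reveals whether the clamp is actually reached.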
Follow code logic with concrete test cases to find bugs, logic errors, edge cases, contract violations, and AI hallucinations.
NOT just reading - actually run the code in your head with real values.
NEVER assume or guess - Use your tools to verify every claim:
When code calls external methods/services:
Use Read tool to check actual source files:
```
Read: src/Service/UserService.php
→ Find loadByEmail() method
→ Note: Returns User|null, throws InvalidArgumentException if empty
```
Use Grep tool to find method definitions:
```
Grep: "public function loadByEmail" in src/
→ Verify method exists and signature
```
Check interfaces for injected services:
```
Read: vendor/.../LoggerInterface.php
→ Verify info() method signature
```
DO NOT write "Assume method exists" - Actually verify or mark as UNVERIFIED RISK.
When code has relationships (extends, implements, uses):
Read parent/base classes:
```
Read: src/Plugin/ActionBase.php
→ Check for abstract methods
→ Verify parent constructor signature
```
Read interfaces:
```
Read: src/Handler/HandlerInterface.php
→ List all required methods
→ Note exact signatures
```
Check service definitions:
```
Read: config/services.yml
Read: modulename.services.yml
→ Verify service ID exists
→ Check tags if using service collectors
```
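As a sketch of what an interface contract check catches at runtime, here is a Python analogue of the PHP interface checks above (the `HandlerInterface` and `EmailHandler` names are hypothetical, chosen to mirror the examples in this section):

```python
from abc import ABC, abstractmethod

class HandlerInterface(ABC):
    @abstractmethod
    def handle(self, payload: dict) -> bool: ...

class EmailHandler(HandlerInterface):
    # Contract: every abstract method must be implemented with a
    # matching signature, or instantiation fails.
    def handle(self, payload: dict) -> bool:
        return "to" in payload

handler = EmailHandler()  # would raise TypeError if handle() were missing
print(handler.handle({"to": "a@example.com"}))  # True
```

A paper test performs this same check statically: list the interface's required methods, then confirm each exists in the implementation with the exact signature.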
If source is unavailable (external package, closed-source):
```
DEPENDENCY CHECK: $externalApi->fetchData()
VERIFICATION: Unable to read source
RISK: Cannot verify method exists or return type
RECOMMENDATION: Add runtime checks for null/exceptions
```
Mark as risk - don't assume it works.
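The recommended runtime checks can look like this minimal sketch (Python for illustration; `fetch_data_safely` and `StubApi` are hypothetical names standing in for the unverifiable dependency):

```python
def fetch_data_safely(external_api):
    # Source for external_api is unreadable, so guard at runtime
    # instead of assuming fetch_data() exists and returns a list.
    if not hasattr(external_api, "fetch_data"):
        raise RuntimeError("external_api has no fetch_data() method")
    try:
        data = external_api.fetch_data()
    except Exception as exc:
        raise RuntimeError(f"fetch_data() failed: {exc}") from exc
    return [] if data is None else data

class StubApi:
    # Stand-in for the external dependency we could not verify.
    def fetch_data(self):
        return None

print(fetch_data_safely(StubApi()))  # [] -- None normalized to empty list
```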
When code reads config values (YAML, JSON, .env, services):
```
CONFIG CHECK: $this->config('my_module.settings')->get('api_timeout')
Source: config/install/my_module.settings.yml
Key exists? YES / NO
Value: 30
Type match: integer (code expects int) — MATCH / MISMATCH
Default: NULL if missing — RISK: no fallback
```
Also check: services.yml (service IDs, arguments), routing.yml (route names), schema.yml (structure matches config).
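The two risks the CONFIG CHECK flags, a missing key with no fallback and a type mismatch, can be closed with explicit guards. A hedged sketch (Python for illustration; `get_api_timeout` is a hypothetical helper):

```python
import json

def get_api_timeout(settings):
    # Missing key: supply an explicit fallback instead of NULL.
    value = settings.get("api_timeout", 30)
    # Type mismatch: fail loudly rather than pass "30" where int is expected.
    # (bool is excluded because bool is a subclass of int in Python.)
    if not isinstance(value, int) or isinstance(value, bool):
        raise TypeError(f"api_timeout must be int, got {type(value).__name__}")
    return value

settings = json.loads('{"api_timeout": 30}')
print(get_api_timeout(settings))  # 30
print(get_api_timeout({}))        # 30 (fallback covers the missing-key risk)
```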
| Test Type | When | Output |
|---|---|---|
| Happy Path | First test | Verify correct flow |
| Edge Cases | After happy path | Find boundary issues |
| Error Cases | Last | Verify error handling |
| Contract Verification | Always | Check dependencies |
When the user provides code to test:
Pick concrete input values. Start with happy path, then edge cases.
```
SCENARIO: [Description of what we're testing]
INPUT:
$variable1 = [concrete value]
$variable2 = [concrete value]
[initial state]
```
Follow each line. Write the variable state after execution.
```
Line [N]: [code statement]
→ [variable] = [new value]
→ [state change description]
```
At every function call, method return, or data handoff, track type transformations:
```
DATA FLOW: [source] → [destination]
Input type: [what was passed]
Output type: [what was returned]
Coercion: [any implicit type change]
Risk: [what breaks if type is wrong]
```
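For example, a minimal sketch of the kind of coercion risk this check flags (Python for illustration; the `over_limit` helper is hypothetical):

```python
def over_limit(raw_value, limit):
    # raw_value typically arrives as a string (form input, env var).
    # Converting explicitly surfaces the coercion instead of hiding it;
    # int("abc") raising ValueError is an error path to trace too.
    return int(raw_value) > limit

# DATA FLOW: form input -> over_limit
#   Input type: str "42" / Output type: bool
#   Risk: non-numeric input raises ValueError
print(over_limit("42", 10))  # True
```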
Watch for implicit coercions at each handoff: string/int mixing, null vs empty string, and truthiness surprises.
At each conditional, note which branch is taken and why.
```
Line [N]: if ([condition])
→ [variable1]=[value], [variable2]=[value]
→ [evaluation] = [true/false]
→ TAKES: [if branch / else branch] (lines X-Y)
```
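A tiny worked branch trace (Python for illustration; `access_level` is a hypothetical function invented for this example):

```python
def access_level(user):
    if user.get("admin"):
        return "full"
    return "read-only"

# Branch trace for user = {"name": "amy", "admin": False}:
#   Line 2: if user.get("admin") -> False -> TAKES: fall-through (line 4)
#   Line 4: return "read-only"
print(access_level({"name": "amy", "admin": False}))  # read-only
```

Recording which branch was taken, and with which concrete values, is what later reveals branches that no scenario ever exercised.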
For loops, trace EACH iteration with index and values.
```
Line [N]: foreach ([collection] as [item])
Iteration 1: $key=[value], $item=[value]
  Line [N+1]: [statement]
  → [state change]
Iteration 2: $key=[value], $item=[value]
  Line [N+1]: [statement]
  → [state change]
Loop ends. Final state: [describe]
```
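Applied to real code, an iteration-by-iteration trace looks like this sketch (Python for illustration; `total_with_tax` is hypothetical):

```python
def total_with_tax(prices, rate):
    total = 0.0
    for i, price in enumerate(prices):
        total += price * (1 + rate)
    return round(total, 2)

# Trace for prices=[10.0, 20.0], rate=0.1:
#   Iteration 1: i=0, price=10.0 -> total ≈ 11.0
#   Iteration 2: i=1, price=20.0 -> total ≈ 33.0
# Loop ends. Final state: total rounded to 33.0
# (The float drift hidden by "≈" is itself a paper-test finding:
#  without round(), the raw total is 33.000000000000007.)
print(total_with_tax([10.0, 20.0], 0.1))  # 33.0
```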
For EVERY external call (methods, services, APIs), verify:
```
DEPENDENCY CHECK: [service/method name]
Location: [file path or interface]
Method signature: [actual signature]
Returns: [actual return type and values]
Throws: [exceptions, when]
Side effects: [what it modifies]
VERIFICATION:
- [ ] Method exists
- [ ] Parameters correct (type, order)
- [ ] Return type handled correctly
- [ ] Edge cases considered
```
DO NOT ASSUME - Read the actual source code or documentation.
For classes with relationships (extends, implements, uses, injects):
```
CONTRACT VERIFICATION: [Class name]
Extends: [Parent class]
- [ ] All abstract methods implemented
- [ ] Parent constructor called (if required)
- [ ] Parent methods called when needed
Implements: [Interface]
- [ ] All interface methods present
- [ ] Signatures match exactly
- [ ] Return types correct
Injected Services:
- [ ] Service exists in container
- [ ] Interface methods verified
- [ ] Return types handled
Tagged Service (if applicable):
- [ ] Tag name matches collector
- [ ] Implements required interface
- [ ] Priority appropriate
```
See references/contract-patterns.md for complete contract patterns.
```
OUTPUT:
Return value: [what's returned]
Side effects: [database changes, API calls, etc.]
State changes: [session, cache, variables]
FLAWS FOUND:
- Line [N]: [description of issue]
  FIX: [how to resolve]
- Line [N]: [description of issue]
  FIX: [how to resolve]
```
After all scenarios, review for paths NOT exercised:
```
UNTESTED PATHS:
- Line [N]-[M]: [else branch / catch block / default case] — never triggered
  Risk: [what this path handles]
  Scenario needed: [input that would trigger it]
Coverage: [X of Y branches exercised]
Recommendation: [add scenario for critical untested paths / accept risk]
```
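A short sketch of how untested paths surface in practice (Python for illustration; `parse_port` is a hypothetical function with two fallback paths):

```python
def parse_port(value):
    try:
        port = int(value)
    except ValueError:
        return 8080            # fallback path A: non-numeric input
    if 0 < port < 65536:
        return port
    return 8080                # fallback path B: out-of-range port

# A happy-path-only scenario, parse_port("443"), exercises neither fallback.
# UNTESTED PATHS:
#   - except ValueError: needs input like "http"
#   - out-of-range return: needs input like "70000"
print(parse_port("443"))    # 443
print(parse_port("http"))   # 8080 (triggers the except branch)
print(parse_port("70000"))  # 8080 (triggers the range check)
```

Each untested path gets a concrete triggering input, which becomes the next scenario to trace.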
Common flaw patterns to watch for include:

- Assignment vs comparison operators (`=` vs `==` vs `===`)
- Inverted conditions (`if ($x)` should be `if (!$x)`)

For modules with multiple components (ECA plugins, form systems, etc.), use a coverage-driven hybrid approach.
- Flow-based: real user workflows end-to-end
- Component-based: each component with edge cases
Step 1: Map all components
- List every event, condition, action, service
Step 2: Design flows covering all components
- Each component in at least one flow
- 3-5 flows typically cover a module
Step 3: Add component edge cases
- For each component: scenarios NOT in flows
- Error cases, empty inputs, boundaries
- 2-4 edge cases per component
Use this template for all paper tests:
```
PAPER TEST: [File/Function name]
SCENARIO: [Description]
INPUT:
[variable] = [value]
[variable] = [value]
TRACE:
Line [N]: [code]
→ [variable] = [new value]
Line [N]: [conditional]
→ [evaluation] = [result]
→ TAKES: [branch]
Line [N]: [loop start]
Iteration [N]: [values]
  Line [N]: [code]
  → [state]
OUTPUT:
Return: [value]
Side effects: [list]
State changes: [list]
DEPENDENCY CHECKS:
[method/service]: VERIFIED / ISSUE FOUND
  Issue: [description]
CONTRACT CHECKS:
[pattern]: VERIFIED / VIOLATION
  Issue: [description]
FLAWS FOUND:
- [Line N]: [issue]
  FIX: [solution]
- [Line N]: [issue]
  FIX: [solution]
EDGE CASES TO TEST:
1. [scenario]
2. [scenario]
```
All detailed guides are in the references/ directory:

- references/core-method.md - Complete paper testing method
- references/dependency-verification.md - How to verify external calls
- references/contract-patterns.md - All code contract types
- references/ai-code-auditing.md - Testing AI-generated code
- references/hybrid-testing.md - Module-level testing strategy
- references/common-flaws.md - Catalog of frequent bugs
- references/advanced-techniques.md - Progressive injects, red team testing, attack surface analysis, AAR format
- references/severity-scoring.md - Consistent severity rubric for flaw prioritization
- references/blind-ab-comparison.md - Comparing two implementations side by side
- references/rubric-scoring.md - Structured grading for code quality assessment
- references/skill-and-config-testing.md - Testing skills, commands, agents, and configs

The SKILL.md provides the core workflow. For detailed guidance, start with:

- references/core-method.md
- references/dependency-verification.md
- references/contract-patterns.md
- references/ai-code-auditing.md
- references/hybrid-testing.md