From code-paper-test
Mentally executes code, skills, commands, configs line-by-line with concrete values to find bugs, logic errors, edge cases, contract violations, AI hallucinations. Verifies external calls using tools.
npx claudepluginhub camoa/claude-skills --plugin code-paper-test
Slash command: /code-paper-test:paper-test
Systematically test code by mentally executing it line-by-line with concrete values.
| Target Size | Approach | Why |
|---|---|---|
| < 50 lines | Quick trace (Steps 1–7 below) | Fast, inline, sufficient for small code |
| 50–300 lines | Structured 3-phase (below) | One agent, all 3 perspectives, sequential — thorough without coordination overhead |
| 300+ lines or security-critical | /code-paper:test-team (3 agents) | Context pressure justifies splitting. Cross-challenge debate catches what one agent misses. |
| Skill/command/agent files | /code-paper:test-team | Different lenses genuinely find different things for instruction-based testing |
If the user asks for "paper test" without specifying, read the target files, count lines, and recommend the appropriate approach. For 50–300 lines, use the Structured 3-Phase mode below. Only recommend /test-team for 300+ lines, explicit "test team" requests, or security-critical code.
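The size-based selection above can be sketched as a small function. This is a minimal sketch in Python, not part of the skill itself: the function name, the flag names, and the returned labels are hypothetical, and treating exactly 300 lines as the top of the 50–300 band is an assumption, since the table lists 300 in both bands.

```python
def recommend_approach(line_count: int,
                       security_critical: bool = False,
                       instruction_file: bool = False,
                       team_requested: bool = False) -> str:
    """Map a paper-test target to one of the approaches in the table."""
    # Skill/command/agent files, security-critical code, and explicit
    # "test team" requests go to the team approach regardless of size.
    if instruction_file or security_critical or team_requested:
        return "test-team"
    if line_count < 50:
        return "quick-trace"
    # Assumption: exactly 300 lines still counts as the 50-300 band.
    if line_count <= 300:
        return "structured-3-phase"
    return "test-team"
```

For example, `recommend_approach(120)` selects the structured mode, while `recommend_approach(40, security_critical=True)` selects the team approach despite the small size.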
Read references/structured-3-phase.md for the full methodology. It runs Phase A (happy path), Phase B (edge cases, 6 categories), Phase C (adversarial, 5 categories), and Phase D (self-review) sequentially in one agent.
For small code, skip the structured phases. Just trace with concrete values.
Follow the code's logic with concrete test cases to find bugs, logic errors, edge cases, contract violations, and AI hallucinations.
This is NOT just reading: actually run the code in your head with real values.
NEVER assume or guess. Use your tools to verify every claim:
When code calls external methods/services:
Use Read tool to check actual source files:
Read: src/Service/UserService.php
→ Find loadByEmail() method
→ Note: Returns User|null, throws InvalidArgumentException if empty
Use Grep tool to find method definitions:
Grep: "public function loadByEmail" in src/
→ Verify method exists and signature
Check interfaces for injected services:
Read: vendor/.../LoggerInterface.php
→ Verify info() method signature
DO NOT write "Assume method exists" - Actually verify or mark as UNVERIFIED RISK.
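Where the Read and Grep tools are not at hand, the same check can be approximated with a short script. The sketch below is an illustration in Python under stated assumptions: `find_method` and `verification_status` are hypothetical helper names, and the regular expression only recognises PHP-style `function name(` definitions.

```python
import re
from pathlib import Path


def find_method(root: str, method: str, pattern: str = "*.php") -> list[tuple[str, int, str]]:
    """Return (file, line number, line text) for each definition of `method` under `root`."""
    # Matches PHP-style definitions such as "public function loadByEmail(".
    definition = re.compile(r"\bfunction\s+" + re.escape(method) + r"\s*\(")
    hits = []
    for path in sorted(Path(root).rglob(pattern)):
        lines = path.read_text(errors="replace").splitlines()
        for number, line in enumerate(lines, start=1):
            if definition.search(line):
                hits.append((str(path), number, line.strip()))
    return hits


def verification_status(root: str, method: str) -> str:
    """Report VERIFIED when a definition is found, otherwise UNVERIFIED RISK."""
    return "VERIFIED" if find_method(root, method) else "UNVERIFIED RISK"
```

A hit only proves the method exists; the signature, return type, and thrown exceptions still have to be read from the matching line and its surroundings.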
When code has relationships (extends, implements, uses):
Read parent/base classes:
Read: src/Plugin/ActionBase.php
→ Check for abstract methods
→ Verify parent constructor signature
Read interfaces:
Read: src/Handler/HandlerInterface.php
→ List all required methods
→ Note exact signatures
Check service definitions:
Read: config/services.yml
Read: modulename.services.yml
→ Verify service ID exists
→ Check tags if using service collectors
If source is unavailable (external package, closed-source):
DEPENDENCY CHECK: $externalApi->fetchData()
VERIFICATION: Unable to read source
RISK: Cannot verify method exists or return type
RECOMMENDATION: Add runtime checks for null/exceptions
Mark as risk - don't assume it works.
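The recommended runtime checks might look like the following. This is a hedged sketch in Python rather than the document's PHP: `call_unverified` is a hypothetical wrapper, and catching every `Exception` is a deliberate assumption made because the real exception types cannot be read from the source.

```python
from typing import Any, Callable, Optional


def call_unverified(fetch: Callable[[], Any], default: Optional[Any] = None) -> Any:
    """Call a dependency whose contract could not be read, guarding null and exceptions."""
    try:
        result = fetch()
    except Exception:
        # The source was unreadable, so any exception type is possible.
        return default
    # A missing return value is handled the same way as a failure.
    return default if result is None else result
```

Once the dependency's source or documentation becomes available, the broad `except` should be narrowed to the exceptions it actually declares.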
When code reads config values (YAML, JSON, .env, services):
CONFIG CHECK: $this->config('my_module.settings')->get('api_timeout')
Source: config/install/my_module.settings.yml
Key exists? YES / NO
Value: 30
Type match: integer (code expects int) — MATCH / MISMATCH
Default: NULL if missing — RISK: no fallback
Also check: services.yml (service IDs, arguments), routing.yml (route names), schema.yml (structure matches config).
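The CONFIG CHECK fields can be filled in mechanically once the config file has been parsed. The sketch below assumes the YAML or JSON has already been loaded into a plain dictionary; `check_config` and its result keys are hypothetical names chosen for illustration.

```python
def check_config(config: dict, key: str, expected_type: type) -> dict:
    """Fill in the CONFIG CHECK fields for one key of an already-parsed config mapping."""
    exists = key in config
    value = config.get(key)
    # bool is a subclass of int in Python, so reject it explicitly for int checks.
    type_ok = exists and isinstance(value, expected_type) and not (
        expected_type is int and isinstance(value, bool)
    )
    return {
        "key_exists": "YES" if exists else "NO",
        "value": value,
        "type_match": "MATCH" if type_ok else "MISMATCH",
        "risk": None if exists else "no fallback: code receives None",
    }
```

With `{"api_timeout": "30"}` the key exists but the check reports MISMATCH, which is the quoted-number case that is easy to miss when reading config by eye.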
| Test Type | When | Output |
|---|---|---|
| Happy Path | First test | Verify correct flow |
| Edge Cases | After happy path | Find boundary issues |
| Error Cases | Last | Verify error handling |
| Contract Verification | Always | Check dependencies |
When user provides code to test:
Pick concrete input values. Start with happy path, then edge cases.
SCENARIO: [Description of what we're testing]
INPUT:
$variable1 = [concrete value]
$variable2 = [concrete value]
[initial state]
Follow each line. Write the variable state after execution.
Line [N]: [code statement]
→ [variable] = [new value]
→ [state change description]
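As an illustration of the scenario and trace templates, here is a worked example on a hypothetical three-line function (written in Python; `apply_discount` is invented for this sketch). The comments record the variable state after each statement for two scenarios.

```python
def apply_discount(price_cents: int, percent: int) -> int:
    discount = price_cents * percent // 100
    total = price_cents - discount
    return total


# SCENARIO: happy path, 20% off a 50.00 item
# INPUT: price_cents = 5000, percent = 20
#   discount = 5000 * 20 // 100   → discount = 1000
#   total = 5000 - 1000           → total = 4000
#   return total                  → returns 4000
#
# SCENARIO: edge case, percent above 100
# INPUT: price_cents = 5000, percent = 150
#   discount = 5000 * 150 // 100  → discount = 7500
#   total = 5000 - 7500           → total = -2500
# FLAW FOUND: percent has no upper bound, so the total goes negative.
```

The second scenario shows why concrete values matter: the negative total is obvious once the numbers are written down and easy to overlook when only reading the code.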
At every function call, method return, or data handoff, track type transformations:
DATA FLOW: [source] → [destination]
Input type: [what was passed]
Output type: [what was returned]
Coercion: [any implicit type change]
Risk: [what breaks if type is wrong]
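A filled-in DATA FLOW entry might read as follows. The example is a hypothetical Python case (`retry_budget` is invented here), and Python's coercion rules differ from PHP's, so it illustrates the tracking habit rather than any specific PHP behaviour.

```python
def retry_budget(timeout, retries: int):
    """Total seconds spent waiting across all retries."""
    return timeout * retries


# DATA FLOW: environment variable → retry_budget()
#   Input type:  str ("30"), because environment values arrive as strings
#   Output type: str, where the caller expects an int
#   Coercion:    none; "*" repeats the string instead of multiplying
#   Risk:        the budget becomes "3030" rather than 60
as_string = retry_budget("30", 2)        # "3030"
as_number = retry_budget(int("30"), 2)   # 60
```

Converting at the boundary, as in the second call, keeps the wrong type from travelling further into the code.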
Watch for:
At each conditional, note which branch is taken and why.
Line [N]: if ([condition])
→ [variable1]=[value], [variable2]=[value]
→ [evaluation] = [true/false]
→ TAKES: [if branch / else branch] (lines X-Y)
For loops, trace EACH iteration with index and values.
Line [N]: foreach ([collection] as [item])
Iteration 1: $key=[value], $item=[value]
Line [N+1]: [statement]
→ [state change]
Iteration 2: $key=[value], $item=[value]
Line [N+1]: [statement]
→ [state change]
Loop ends. Final state: [describe]
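The branch and loop templates combine naturally. Below is a worked trace on a hypothetical Python function (`count_active` and the sample records are invented for illustration), with each iteration and the branch it takes written out as comments.

```python
def count_active(users: list[dict]) -> int:
    count = 0
    for user in users:
        if user.get("active"):
            count += 1
    return count


# INPUT: users = [{"name": "ana", "active": True},
#                 {"name": "bo", "active": 0},
#                 {"name": "cy"}]
# Iteration 1: user = {"name": "ana", "active": True}
#   if user.get("active") → True  → TAKES: if branch → count = 1
# Iteration 2: user = {"name": "bo", "active": 0}
#   if user.get("active") → 0 is falsy → SKIPS if branch → count = 1
# Iteration 3: user = {"name": "cy"}
#   if user.get("active") → None is falsy → SKIPS if branch → count = 1
# Loop ends. Final state: count = 1
#
# FLAW FOUND: a record with "active": "false" (a non-empty string) is
# truthy, so it would be counted as active.
```

Tracing the third iteration is what surfaces the missing-key case; tracing a string value is what surfaces the truthiness flaw.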
For EVERY external call (methods, services, APIs), verify:
DEPENDENCY CHECK: [service/method name]
Location: [file path or interface]
Method signature: [actual signature]
Returns: [actual return type and values]
Throws: [exceptions, when]
Side effects: [what it modifies]
VERIFICATION:
- [ ] Method exists
- [ ] Parameters correct (type, order)
- [ ] Return type handled correctly
- [ ] Edge cases considered
DO NOT ASSUME - Read the actual source code or documentation.
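When the dependency is importable, the "method exists" and "parameters correct" boxes can be checked without calling it. The following sketch uses Python's standard `inspect` module; `check_call` is a hypothetical helper, and it checks only argument count and names, not argument types or return values.

```python
import inspect


def check_call(obj: object, method: str, *args, **kwargs) -> str:
    """Check that `method` exists on `obj` and accepts the given arguments, without calling it."""
    target = getattr(obj, method, None)
    if not callable(target):
        return "ISSUE FOUND: method does not exist"
    try:
        # bind() raises TypeError when the arguments do not fit the signature.
        inspect.signature(target).bind(*args, **kwargs)
    except TypeError as error:
        return f"ISSUE FOUND: {error}"
    return "VERIFIED"
```

Return types, thrown exceptions, and side effects still need to be read from the source, so a VERIFIED result here covers only the first two checklist items.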
For classes with relationships (extends, implements, uses, injects):
CONTRACT VERIFICATION: [Class name]
Extends: [Parent class]
- [ ] All abstract methods implemented
- [ ] Parent constructor called (if required)
- [ ] Parent methods called when needed
Implements: [Interface]
- [ ] All interface methods present
- [ ] Signatures match exactly
- [ ] Return types correct
Injected Services:
- [ ] Service exists in container
- [ ] Interface methods verified
- [ ] Return types handled
Tagged Service (if applicable):
- [ ] Tag name matches collector
- [ ] Implements required interface
- [ ] Priority appropriate
See references/contract-patterns.md for the complete set of contract patterns.
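The "all abstract methods implemented" item has a direct analogue in Python's `abc` module, which makes it easy to demonstrate. In this sketch the class names `ActionBase` and `SendEmail` and the helper `missing_abstract_methods` are hypothetical; in PHP the equivalent check means reading the parent class, as described above.

```python
from abc import ABC, abstractmethod


class ActionBase(ABC):
    @abstractmethod
    def execute(self) -> None: ...

    @abstractmethod
    def access(self) -> bool: ...


class SendEmail(ActionBase):
    def execute(self) -> None:
        pass
    # access() is not implemented, which violates the parent contract.


def missing_abstract_methods(cls: type) -> list[str]:
    """List the abstract methods that `cls` has inherited but not implemented."""
    return sorted(getattr(cls, "__abstractmethods__", frozenset()))
```

Here `missing_abstract_methods(SendEmail)` reports `access`, the same violation a paper test would record under the Extends checklist.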
OUTPUT:
Return value: [what's returned]
Side effects: [database changes, API calls, etc.]
State changes: [session, cache, variables]
FLAWS FOUND:
- Line [N]: [description of issue]
FIX: [how to resolve]
- Line [N]: [description of issue]
FIX: [how to resolve]
After all scenarios, review for paths NOT exercised:
UNTESTED PATHS:
- Line [N]-[M]: [else branch / catch block / default case] — never triggered
Risk: [what this path handles]
Scenario needed: [input that would trigger it]
Coverage: [X of Y branches exercised]
Recommendation: [add scenario for critical untested paths / accept risk]
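The coverage line in this template is simple set arithmetic. A minimal sketch, assuming each branch has been given a hand-written label during tracing (the function name and label format are hypothetical):

```python
def untested_paths(all_branches: set[str], exercised: set[str]) -> tuple[str, list[str]]:
    """Return the coverage summary line and the branch labels no scenario reached."""
    missing = sorted(all_branches - exercised)
    covered = len(all_branches & exercised)
    return f"{covered} of {len(all_branches)} branches exercised", missing
```

Each label in the returned list then needs either a new scenario or an explicit decision to accept the risk.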
Common flaws include operator confusion (= vs == vs ===) and inverted conditions (if ($x) should be if (!$x)).
For modules with multiple components (ECA plugins, form systems, etc.), use a coverage-driven hybrid approach.
Flow-based: Real user workflows end-to-end
Component: Each component with edge cases
Step 1: Map all components
- List every event, condition, action, service
Step 2: Design flows covering all components
- Each component in at least one flow
- 3-5 flows typically cover a module
Step 3: Add component edge cases
- For each component: scenarios NOT in flows
- Error cases, empty inputs, boundaries
- 2-4 edge cases per component
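The requirement in Step 2 that every component appears in at least one flow can be checked with a small helper. This is a sketch under the assumption that components and flows are tracked as plain names; `uncovered_components` and the sample names in the usage note are hypothetical.

```python
def uncovered_components(components: set[str], flows: dict[str, set[str]]) -> set[str]:
    """Return the components that no flow exercises."""
    # Union of every component touched by any flow; empty when there are no flows.
    covered = set().union(*flows.values())
    return components - covered
```

For instance, with components `{"event", "condition", "action"}` and a single flow touching `{"event", "action"}`, the result is `{"condition"}`, which signals that another flow is needed before moving on to Step 3.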
Use this template for all paper tests:
PAPER TEST: [File/Function name]
SCENARIO: [Description]
INPUT:
[variable] = [value]
[variable] = [value]
TRACE:
Line [N]: [code]
→ [variable] = [new value]
Line [N]: [conditional]
→ [evaluation] = [result]
→ TAKES: [branch]
Line [N]: [loop start]
Iteration [N]: [values]
Line [N]: [code]
→ [state]
OUTPUT:
Return: [value]
Side effects: [list]
State changes: [list]
DEPENDENCY CHECKS:
[method/service]: VERIFIED / ISSUE FOUND
Issue: [description]
CONTRACT CHECKS:
[pattern]: VERIFIED / VIOLATION
Issue: [description]
FLAWS FOUND:
- [Line N]: [issue]
FIX: [solution]
- [Line N]: [issue]
FIX: [solution]
EDGE CASES TO TEST:
1. [scenario]
2. [scenario]
All detailed guides are in the references/ directory:
- references/core-method.md - Complete paper testing method
- references/dependency-verification.md - How to verify external calls
- references/contract-patterns.md - All code contract types
- references/ai-code-auditing.md - Testing AI-generated code
- references/hybrid-testing.md - Module-level testing strategy
- references/common-flaws.md - Catalog of frequent bugs
- references/advanced-techniques.md - Progressive injects, red team testing, attack surface analysis, AAR format
- references/severity-scoring.md - Consistent severity rubric for flaw prioritization
- references/blind-ab-comparison.md - Comparing two implementations side by side
- references/rubric-scoring.md - Structured grading for code quality assessment
- references/skill-and-config-testing.md - Testing skills, commands, agents, and configs
The SKILL.md provides the core workflow. For detailed guidance:
- references/core-method.md
- references/dependency-verification.md
- references/contract-patterns.md
- references/ai-code-auditing.md
- references/hybrid-testing.md