Traces codepaths in git diffs, maps to existing tests, scores coverage, visualizes gaps, and generates tests for uncovered paths in changed files.
npx claudepluginhub nyldn/claude-octopus --plugin octo

This skill uses the workspace's default tool permissions.
Trace every codepath in a diff, map each path against existing tests, visualize coverage gaps, and auto-generate tests for uncovered paths.
Core principle: Trace codepaths in changed files -> Map against existing tests -> Score coverage quality -> Generate tests for gaps -> Report before/after counts.
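These hard limits prevent runaway analysis: trace at most 30 codepaths per audit, generate at most 20 tests, and spend at most 2 minutes on any single path (mark it needs-manual-review and move on).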
Determine the diff scope. Use the most relevant source:
```bash
# PR diff
git diff --name-only main...HEAD

# Staged changes
git diff --name-only --cached

# Last commit
git diff --name-only HEAD~1..HEAD
```
Filter to source code files only (exclude configs, docs, generated files).
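The scope and filter step can be sketched in Python as follows. This is a minimal illustration, not the skill's actual implementation: `SOURCE_EXTENSIONS`, `EXCLUDED_DIRS`, and the function names are hypothetical, and a real project would adjust both sets to its own layout.

```python
import subprocess
from pathlib import PurePosixPath

# Hypothetical allow-list of source extensions; adjust per project.
SOURCE_EXTENSIONS = {".py", ".ts", ".js", ".go", ".rs", ".sh"}
# Hypothetical markers for generated or vendored directories.
EXCLUDED_DIRS = {"node_modules", "dist", "build", "vendor"}


def changed_files(diff_range: str = "main...HEAD") -> list:
    """Return the paths printed by `git diff --name-only <diff_range>`."""
    result = subprocess.run(
        ["git", "diff", "--name-only", diff_range],
        capture_output=True, text=True, check=True,
    )
    return [line for line in result.stdout.splitlines() if line]


def is_source_file(path: str) -> bool:
    """True for files with a source extension outside excluded directories."""
    p = PurePosixPath(path)
    if any(part in EXCLUDED_DIRS for part in p.parts):
        return False
    return p.suffix in SOURCE_EXTENSIONS


def filter_source_files(paths: list) -> list:
    """Drop configs, docs, and generated files, preserving input order."""
    return [p for p in paths if is_source_file(p)]
```

Calling `filter_source_files(changed_files())` inside a Git repository would yield the PR-diff scope; passing `"--cached"` or `"HEAD~1..HEAD"` as `diff_range` mirrors the other two commands.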
For each changed file, you MUST trace:
- Conditional branches: if/else, switch/case, ternary, and pattern match. Each branch is a separate codepath.
- Error paths: catch, throw, error return, validation failure, and early return with error. WHY: Error paths are the most common source of untested bugs.

Produce a structured inventory:
## Codepath Inventory: [filename]
| # | Path Description | Type | Risk |
|---|-----------------|------|------|
| 1 | validateUser() happy path | conditional | low |
| 2 | validateUser() missing email | error | medium |
| 3 | validateUser() invalid format | error | medium |
| 4 | processOrder() empty cart guard | guard | high |
| 5 | processOrder() payment timeout | error | high |
| 6 | processOrder() success | conditional | low |
Type categories: conditional, error, guard, loop-boundary, integration, async
Risk assessment: high = user-facing failure or data loss, medium = degraded behavior, low = cosmetic or logging
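One way to keep the inventory consistent is to hold each path in a small record that rejects unknown types and risks, then render the table from those records. The sketch below assumes hypothetical names (`Codepath`, `render_inventory`); only the type and risk vocabularies come from the lists above.

```python
from dataclasses import dataclass

# Vocabularies taken from the type categories and risk levels above.
TYPES = {"conditional", "error", "guard", "loop-boundary", "integration", "async"}
RISKS = {"high", "medium", "low"}


@dataclass(frozen=True)
class Codepath:
    description: str
    type: str
    risk: str

    def __post_init__(self):
        # Reject values outside the documented vocabularies.
        if self.type not in TYPES:
            raise ValueError(f"unknown type: {self.type}")
        if self.risk not in RISKS:
            raise ValueError(f"unknown risk: {self.risk}")


def render_inventory(filename: str, paths: list) -> str:
    """Render the codepath inventory as a markdown table, numbering from 1."""
    lines = [
        f"## Codepath Inventory: {filename}",
        "| # | Path Description | Type | Risk |",
        "|---|-----------------|------|------|",
    ]
    for i, p in enumerate(paths, start=1):
        lines.append(f"| {i} | {p.description} | {p.type} | {p.risk} |")
    return "\n".join(lines)
```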
For each file in the diff, search the test directory for related tests:
```bash
# Find test files that reference the changed file or its exports

# Search by filename pattern
find tests/ -name "*[changed_file_stem]*" -type f

# Search by import/require of the changed module
grep -rl "import.*from.*[module_name]" tests/
grep -rl "require.*[module_name]" tests/

# Search by function name references
grep -rl "[function_name]" tests/
```
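The same lookup can be approximated in Python when a single pass over the test directory is preferable to several `find`/`grep` runs. This is a rough sketch using plain substring matching, so it will over-match short names; `find_related_tests` and its parameters are hypothetical.

```python
from pathlib import Path


def find_related_tests(test_dir: str, needles: list) -> dict:
    """Map each needle (file stem, module name, or function name) to the
    test files whose file name or contents mention it."""
    hits = {needle: [] for needle in needles}
    for path in sorted(Path(test_dir).rglob("*")):
        if not path.is_file():
            continue
        try:
            text = path.read_text(encoding="utf-8", errors="ignore")
        except OSError:
            continue  # unreadable file: skip rather than abort the scan
        for needle in needles:
            if needle in path.name or needle in text:
                hits[needle].append(str(path))
    return hits
```

A needle with an empty result list is a candidate for the no-test rating, pending a manual check that the path is not exercised indirectly.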
For each codepath, assess existing test coverage with this rubric:
| Rating | Meaning | Criteria |
|---|---|---|
| ★★★ | Behavior + edge cases tested | Tests assert behavior AND cover boundary conditions, error cases, and edge inputs |
| ★★ | Happy path tested | Tests cover the success path but miss error branches or edge cases |
| ★ | Smoke test only | Test exists but only checks the function runs without error (no meaningful assertions) |
| ☆ | No test found | No test references this codepath at all |
Map each codepath to its test coverage:
## Coverage Map: [filename]
| # | Codepath | Test File | Rating | Notes |
|---|----------|-----------|--------|-------|
| 1 | validateUser() happy path | test-user.sh:42 | ★★★ | Asserts valid + invalid inputs |
| 2 | validateUser() missing email | test-user.sh:58 | ★★ | Tests missing, not malformed |
| 3 | validateUser() invalid format | -- | ☆ | No test for format validation |
| 4 | processOrder() empty cart guard | -- | ☆ | Guard clause untested |
| 5 | processOrder() payment timeout | test-orders.sh:30 | ★ | Checks no crash, no assertions |
| 6 | processOrder() success | test-orders.sh:15 | ★★★ | Full integration test |
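The rubric reads as a short decision ladder, sketched below. The three boolean inputs are hypothetical judgments the auditor makes after reading the test; nothing here inspects test code automatically.

```python
def rate_coverage(has_test: bool, asserts_behavior: bool, covers_edge_cases: bool) -> str:
    """Apply the rubric top-down: each missing property caps the rating."""
    if not has_test:
        return "☆"      # no test references this codepath
    if not asserts_behavior:
        return "★"      # smoke test only: runs, but no meaningful assertions
    if not covers_edge_cases:
        return "★★"     # happy path tested, error branches or edge cases missed
    return "★★★"        # behavior and edge cases tested
```

For the payment-timeout row above (a test exists but makes no assertions), `rate_coverage(True, False, False)` gives the smoke-test rating.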
After completing the map, produce an ASCII coverage summary. This is the primary output artifact.
```
COVERAGE: 5/12 paths tested (42%)
Code paths: 3/5 (60%)
User flows: 2/7 (29%)
GAPS: 7 paths need tests
```
Break down by category:
```
BY TYPE:
conditional: 3/4 tested (75%) ████████░░
error: 1/5 tested (20%) ██░░░░░░░░
guard: 0/2 tested (0%) ░░░░░░░░░░
integration: 1/1 tested (100%) ██████████

BY RISK:
high: 1/3 tested (33%) ███░░░░░░░
medium: 2/5 tested (40%) ████░░░░░░
low: 2/4 tested (50%) █████░░░░░
```
Use a full block (█) for covered and a light shade (░) for uncovered, in a 10-character bar. Always show exact fractions and percentages.
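A bar like the ones above can be produced with integer arithmetic, which avoids floating-point surprises when rounding. The helper names are hypothetical, and rounding half up is an assumption chosen because it reproduces the sample output (3/4 renders as 8 filled cells).

```python
def round_ratio(part: int, whole: int, scale: int) -> int:
    """Return part/whole * scale rounded half up, using integers only."""
    if whole <= 0:
        return 0
    return (part * scale * 2 + whole) // (2 * whole)


def coverage_bar(tested: int, total: int, width: int = 10) -> str:
    """Full blocks for covered paths, light shade for the rest."""
    filled = round_ratio(tested, total, width)
    return "█" * filled + "░" * (width - filled)


def coverage_line(label: str, tested: int, total: int) -> str:
    """One summary row: exact fraction, percentage, then the bar."""
    pct = round_ratio(tested, total, 100)
    return f"{label}: {tested}/{total} tested ({pct}%) {coverage_bar(tested, total)}"


print(coverage_line("conditional", 3, 4))
print(coverage_line("guard", 0, 2))
```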
Before generating any tests, you MUST detect the project's testing patterns:
**Detected Test Conventions:**
- Framework: [jest/vitest/pytest/bash/go test/etc.]
- Location: [tests/ | __tests__/ | src/**/*.test.* | etc.]
- Naming: [test-*.sh | *.test.ts | *_test.go | etc.]
- Style: [BDD describe/it | xUnit | TAP | custom]
- Helpers: [test-utils.ts | conftest.py | helpers/ | etc.]
- Assertion library: [built-in | chai | assert | etc.]
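As one small piece of convention detection, the naming pattern can be inferred by counting which known pattern matches the most existing test files. This sketch covers only the three naming examples listed above; `NAMING_PATTERNS` and `detect_naming` are hypothetical names, and a real detector would need more patterns plus framework and helper checks.

```python
from collections import Counter
from fnmatch import fnmatchcase
from typing import Optional

# The naming examples from the template above; extend for other ecosystems.
NAMING_PATTERNS = ["test-*.sh", "*.test.ts", "*_test.go"]


def detect_naming(filenames: list) -> Optional[str]:
    """Return the pattern matching the most file names, or None if none match."""
    counts = Counter()
    for name in filenames:
        for pattern in NAMING_PATTERNS:
            if fnmatchcase(name, pattern):
                counts[pattern] += 1
    if not counts:
        return None
    return counts.most_common(1)[0][0]
```

Returning `None` rather than a default keeps the "no convention detected" case visible, so the caller can ask the user instead of guessing.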
For each no-test (☆) and smoke-only (★) codepath, generate a test that follows the detected conventions and asserts behavior rather than implementation details.
For each generated test, show:
### Generated: test for [codepath description]
**Covers:** Codepath #N from [filename]
**Raises coverage:** from no-test to full coverage
[test code block]
After generating all tests, show the coverage change:
```
BEFORE: 5/12 paths tested (42%)
AFTER: 11/12 paths tested (92%)
New tests generated: 6
Remaining gaps: 1 (manual review needed)
```
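The before/after arithmetic can be sketched as below, under the simplifying assumption that each generated test covers exactly one previously untested path. The function names are hypothetical.

```python
def pct(part: int, whole: int) -> int:
    """Percentage rounded half up, computed with integers only."""
    return (part * 200 + whole) // (2 * whole) if whole else 0


def before_after_report(total: int, tested_before: int, tests_generated: int) -> str:
    """Build the four-line coverage change summary."""
    # Assumption: one generated test closes one gap; never exceed the total.
    tested_after = min(total, tested_before + tests_generated)
    remaining = total - tested_after
    suffix = " (manual review needed)" if remaining else ""
    lines = [
        f"BEFORE: {tested_before}/{total} paths tested ({pct(tested_before, total)}%)",
        f"AFTER: {tested_after}/{total} paths tested ({pct(tested_after, total)}%)",
        f"New tests generated: {tests_generated}",
        f"Remaining gaps: {remaining}{suffix}",
    ]
    return "\n".join(lines)


print(before_after_report(total=12, tested_before=5, tests_generated=6))
```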
Coverage audit runs as a complement to code review. When invoked during deliver phase:
If coverage audit finds gaps in new code, recommend the user adopt TDD for the next iteration. Coverage audit fixes existing gaps; TDD prevents future ones.
After generating tests, use skill-verify to run the test suite and confirm the new tests pass.
| Action | Why It Is Wrong |
|---|---|
| Count lines instead of paths | Line coverage misses branch coverage entirely |
| Generate tests without checking conventions | Tests that do not match project style will be rejected |
| Test implementation details | Brittle tests that break on refactoring |
| Skip error paths | Error paths are where most bugs live |
| Exceed the 30-path cap | Analysis becomes unfocused and slow |
| Generate more than 20 tests | Diminishing returns; focus on highest impact |
| Spend more than 2 min on one path | Mark as needs-manual-review and move on |
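The test cap in the table above can be applied by ranking gaps before truncating, so the highest-impact paths are kept. Ordering by risk level is an assumption about what "highest impact" means here; `select_gaps` and the tuple layout are hypothetical.

```python
MAX_PATHS = 30  # cap on traced codepaths, from the table above
MAX_TESTS = 20  # cap on generated tests, from the table above

RISK_ORDER = {"high": 0, "medium": 1, "low": 2}


def select_gaps(gaps: list, limit: int = MAX_TESTS) -> tuple:
    """Split (description, risk) gaps into those to generate tests for
    and those deferred to manual review, highest risk first."""
    # sorted() is stable, so gaps with equal risk keep their original order.
    ranked = sorted(gaps, key=lambda gap: RISK_ORDER[gap[1]])
    return ranked[:limit], ranked[limit:]
```

The deferred list is what would be reported as remaining gaps needing manual review.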
1. TRACE -> Identify all codepaths in the diff (max 30)
2. MAP -> Find existing tests for each path
3. SCORE -> Rate coverage quality (no-test / smoke / happy-path / full)
4. DIAGRAM -> ASCII coverage visualization
5. GENERATE -> Auto-create tests for gaps (max 20)
6. REPORT -> Before/after test counts