Maps unfamiliar codebases in phases: structure, entry points, data flow, patterns, landmines. Use before coding in new, inherited, or revisited projects.
npx claudepluginhub gadaalabs/claude-code-on-steroids

This skill uses the workspace's default tool permissions.
**PATHFINDER** — *A pathfinder scouts unfamiliar terrain before the team moves in.*
When invoked: maps an unknown codebase in 5 phases — project structure, entry points, data flow, architectural patterns, and landmine files — before writing a single line of code.
Core principle: Never write code in a codebase you haven't mapped. 20 minutes of structured exploration prevents days of working against the grain.
Unfamiliar codebases have hidden conventions, undocumented constraints, landmine files, and established patterns. Violating them creates bugs that look mysterious but are obvious to anyone who knows the codebase.
Announce at start: "Running PATHFINDER to map this codebase before writing code."
Always explore at the shallowest level first. Only go deeper when shallow confirms relevance. Reading full files speculatively wastes context budget — a 500-line file read prematurely costs 10x more than a 50-line header read.
Depth 1 — Directory listing (cheapest):
Understand the shape of the codebase. Know what exists.
Depth 2 — File headers (30-50 lines):
Imports, exports, top-level declarations reveal purpose.
Decide if deeper read is warranted.
Depth 3 — Function signatures (grep for exports/defs):
Confirm the file owns what you need before reading the body.
Depth 4 — Full source (expensive, use sparingly):
Only for files confirmed relevant at Depth 2-3.
For files >100 lines: require Depth 2 confirmation first.
For files >300 lines: require Depth 3 confirmation first.
Rule: Never full-read a file >100 lines in the first pass. Read headers, confirm relevance, then go deep.
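On a concrete file, the ladder looks like this (the path is a hypothetical example):

```shell
f=src/auth/session.ts                         # hypothetical target file
ls -la "$(dirname "$f")" 2>/dev/null || true  # Depth 1: what exists alongside it
head -40 "$f" 2>/dev/null || true             # Depth 2: imports + top-level declarations
grep -nE '^(export|def|func|fn|class) ' "$f" 2>/dev/null || true  # Depth 3: signatures only
wc -l < "$f" 2>/dev/null || true              # size check before committing to Depth 4
```

Each step is cheap enough to abandon: if Depth 2 shows the file is irrelevant, you've spent 40 lines of context, not 500.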
When to use: before coding in a new, inherited, or revisited project.
Read these in order. Do not skip:
# What exists at the root?
ls -la
# Package info / dependencies
cat package.json 2>/dev/null || cat pyproject.toml 2>/dev/null || \
cat Cargo.toml 2>/dev/null || cat go.mod 2>/dev/null
# Primary README
cat README.md 2>/dev/null | head -100
Extract: project name, language and runtime, key dependencies, available scripts and commands.
# Claude-specific instructions
cat CLAUDE.md 2>/dev/null
cat .claude/CLAUDE.md 2>/dev/null
# General agent instructions
cat AGENTS.md 2>/dev/null
cat GEMINI.md 2>/dev/null
These override everything else. Read completely before proceeding.
# What was recently worked on?
git log --oneline -20
# What's the current state?
git status
git diff --stat HEAD~5..HEAD
Extract: recently active areas, current branch state, uncommitted work in progress.
# Top-level structure (depth 2)
find . -maxdepth 2 -type d | grep -v node_modules | grep -v .git | \
grep -v __pycache__ | grep -v .venv | sort
Map to mental model:
TYPICAL LAYOUTS:
Next.js/React: app/ components/ lib/ public/ styles/
Python backend: src/ tests/ scripts/ docs/ config/
ML project: data/ notebooks/ src/ models/ experiments/
Embedded/C: src/ include/ drivers/ tests/ hal/
Go service: cmd/ internal/ pkg/ api/ handler/
# What's the main executable / server entry?
grep -r "main\|server\|app\|index" --include="*.ts" --include="*.py" \
--include="*.go" --include="*.c" -l | head -10
Trace: entry point → router/dispatcher → handlers → services → data layer.
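One way to run that trace with grep alone (the route and handler names are hypothetical):

```shell
# Step 1 — where are routes registered? (patterns cover Express / Flask / net/http)
grep -rn "app\.\(get\|post\|use\)\|@app.route\|HandleFunc" \
  --include="*.ts" --include="*.py" --include="*.go" . 2>/dev/null | head -10
# Step 2 — jump from one route to its handler definition
grep -rn "handleLogin\|handle_login" . 2>/dev/null | head -5
# Step 3 — inside the handler, repeat for the service / data-layer calls it makes
```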
# Where are tests?
find . -name "*.test.*" -o -name "test_*.py" -o -name "*_test.go" | \
grep -v node_modules | head -20
# Run them to establish baseline
npm test 2>/dev/null || pytest --tb=no -q 2>/dev/null || \
go test ./... 2>/dev/null || cargo test 2>/dev/null
Baseline: All tests passing? If not — document which are broken BEFORE you touch anything.
Read 2-3 existing files to extract conventions. Never guess — read the code.
# Pick 2 representative source files
# Read them completely
Detect:
- File names: component.tsx, ComponentName.tsx, or component-name.tsx?
- Functions: getUser, fetch_user, or FetchUser?
- Constants: MAX_RETRIES, maxRetries, or MAX-RETRIES?
grep -r "^import\|^from\|^require\|^use " --include="*.ts" \
--include="*.py" --include="*.rs" -l | head -5 | xargs head -20
Detect:
- Barrel files (index.ts re-exports)?
- Path aliases (@/components vs ../components)?
grep -r "try\|catch\|except\|Result\|Either\|unwrap" \
--include="*.ts" --include="*.py" --include="*.rs" -l | head -3 | xargs head -40
Detect: Exceptions? Result types? Error callbacks? Error objects vs strings?
# State management (frontend)
grep -r "useState\|useStore\|zustand\|redux\|jotai\|Context" \
--include="*.tsx" --include="*.ts" -l | head -5
# Data layer / ORM
grep -r "prisma\|knex\|mongoose\|sqlalchemy\|diesel\|gorm" \
--include="*.ts" --include="*.py" --include="*.rs" --include="*.go" -l | head -5
These are the landmines. Find them before stepping on them.
# Files over 300 lines are often load-bearing complexity
find . -name "*.ts" -o -name "*.py" -o -name "*.go" | \
grep -v node_modules | grep -v .git | \
xargs wc -l 2>/dev/null | sort -rn | head -20
Rule: if a file exceeds 500 lines, read its top 50 lines to understand what it owns. Don't add to it without understanding it.
# Global state, singletons, module-level variables
grep -rn "global\|singleton\|module_level\|^let \|^var " \
--include="*.ts" --include="*.py" | grep -v test | head -20
# What env vars does this need?
cat .env.example 2>/dev/null || cat .env.local 2>/dev/null || \
grep -r "process.env\|os.environ\|os.getenv" --include="*.ts" \
--include="*.py" -h | sort -u | head -20
Document: Every env var required. Which ones have no defaults and will fail silently?
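To separate the silent failures from the safe reads, grep for the access forms that carry no fallback (a sketch; adjust patterns to the codebase):

```shell
# Python: os.environ["X"] raises KeyError if unset; os.getenv("X") returns None
# silently; only os.getenv("X", default) is safe — flag the first two forms
grep -rn 'os.environ\[' --include="*.py" . 2>/dev/null | head -10
# Node/TS: process.env.X with no ?? or || fallback is undefined at runtime
grep -rn 'process\.env\.[A-Z_]' --include="*.ts" . 2>/dev/null | grep -vE '\?\?|\|\|' | head -10
```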
# TODOs, FIXMEs, HACKs
grep -rn "TODO\|FIXME\|HACK\|XXX\|BUG\|BROKEN" \
--include="*.ts" --include="*.py" --include="*.go" --include="*.c" \
| grep -v node_modules | grep -v .git | head -30
Don't accidentally fix these without understanding why they're deferred.
Identify files that, if broken, take down the whole system:
# Find likely critical files
git log --oneline --all -- "**/*auth*" "**/*middleware*" \
"**/*migration*" "**/*schema*" 2>/dev/null | head -10
After completing all 5 phases, produce this summary before writing any code:
CODEBASE ONBOARDING COMPLETE
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
Project: [name — 1 sentence description]
Stack: [language + runtime + key frameworks]
Entry: [main file/command]
Tests: [framework + baseline status (X passing, Y failing)]
Conventions:
Naming: [camelCase / snake_case / PascalCase]
Imports: [absolute / relative / barrel]
Errors: [exceptions / Result types / callbacks]
State: [local / Zustand / Redux / context]
Traps found:
⚠ [trap 1 — e.g., "auth.ts:340+ lines, owns session logic, fragile"]
⚠ [trap 2 — e.g., "3 pre-existing failing tests in payments/"]
⚠ [trap 3 — e.g., "STRIPE_KEY env var required, no default"]
Critical files (read before touching):
- [path] — [why critical]
Safe to start work. ✓
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
Store the summary in auto-memory:
~/.claude/projects/<hash>/memory/onboarding_<project>.md
Register in MEMORY.md so it loads next session.
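A minimal sketch of that storage step — $PROJECT_HASH and $PROJECT are placeholders to substitute, and the MEMORY.md location shown is an assumption to verify against your memory setup:

```shell
# placeholders: substitute the real project hash and project name
mem="$HOME/.claude/projects/$PROJECT_HASH/memory"
mkdir -p "$mem"
cat > "$mem/onboarding_$PROJECT.md" <<'EOF'
(paste the CODEBASE ONBOARDING COMPLETE summary here)
EOF
# register it so it loads next session (MEMORY.md path assumed — verify yours)
echo "- memory/onboarding_$PROJECT.md — codebase onboarding map" >> "$mem/MEMORY.md"
```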
Additional checks (ML projects):
# Find notebooks (often undocumented experiments)
find . -name "*.ipynb" | head -10
# Data directories (don't commit these)
find . -name "*.csv" -o -name "*.parquet" -o -name "*.pkl" | \
grep -v node_modules | head -10
# Model artifacts (large files)
find . -name "*.pt" -o -name "*.pkl" -o -name "*.h5" | head -10
Traps: Notebooks often have hardcoded paths. Models often assume GPU. Data pipelines often have environment-specific config.
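A quick probe for the first of those traps — absolute paths baked into notebook cells (the path prefixes are illustrative):

```shell
# notebooks are JSON, so hardcoded paths appear as plain string literals
grep -rl '/home/\|/Users/\|[A-Z]:\\\\' --include="*.ipynb" . 2>/dev/null | head -10
```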
Additional checks (embedded/C projects):
# Build system
cat Makefile 2>/dev/null | head -50
cat CMakeLists.txt 2>/dev/null | head -50
# Target platform
grep -r "CPU\|MCU\|CORTEX\|ARM\|AVR\|STM32\|ESP32" \
--include="*.h" --include="*.cmake" | head -10
# Flash/RAM constraints
grep -r "FLASH\|RAM\|HEAP\|STACK" --include="*.ld" \
--include="*.cmake" | head -10
Traps: Linker scripts are critical. Stack sizes are often hardcoded. Clock initialization order matters.
Additional checks (LLM/agent projects):
# Prompt files (often scattered)
find . -name "*.txt" -o -name "*.prompt" -o -name "prompts.py" | \
grep -v node_modules | head -10
# API key management
grep -r "OPENAI\|ANTHROPIC\|GROQ\|GEMINI" --include="*.py" \
--include="*.ts" -l | head -5
Traps: System prompts often have hidden constraints. Rate limits often not surfaced in code.
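Since rate limits rarely surface in code, check whether any retry or backoff handling exists at all before you depend on it (patterns are illustrative):

```shell
# any 429 / backoff / retry handling around the API calls?
grep -rn "429\|RateLimit\|rate_limit\|backoff\|retry" \
  --include="*.py" --include="*.ts" . 2>/dev/null | grep -v node_modules | head -10
```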
Run before:
- architect — context for design decisions
- blueprint — architecture decisions need codebase knowledge
- oracle — includes codebase health in complexity assessment
After completion:
- chronicle — to record any unusual patterns found
Map before you build
Read before you write
Conventions before creativity
Traps before touching