AI-powered systematic codebase analysis. Combines mechanical structure extraction with Claude's semantic understanding to produce documentation that captures not just WHAT code does, but WHY it exists and HOW it fits into the system. Includes pattern recognition, red flag detection, flow tracing, and quality assessment. Use for codebase analysis, documentation generation, architecture understanding, or code review.
/plugin marketplace add acaprino/alfio-claude-plugins
/plugin install code-review@alfio-claude-plugins

This skill inherits all available tools. When active, it can use any tool Claude has access to.
- references/AI_ANALYSIS_METHODOLOGY.md
- references/ANTIREZ_COMMENTING_STANDARDS.md
- references/DEEP_DIVE_PLAN.md
- references/SEMANTIC_PATTERNS.md
- scripts/analyze_file.py
- scripts/ast_parser.py
- scripts/check_progress.py
- scripts/classifier.py
- scripts/comment_rewriter.py
- scripts/doc_review.py
- scripts/progress_tracker.py
- scripts/rewrite_comments.py
- scripts/usage_finder.py
- templates/analysis_report.md
- templates/semantic_analysis.md

This skill combines mechanical structure extraction with Claude's semantic understanding to produce comprehensive codebase documentation. Unlike simple AST parsing, this skill captures:
- Mechanical Analysis (Scripts): AST structure extraction, dependency mapping, symbol usage finding, progress tracking
- Semantic Analysis (Claude AI): pattern recognition, red flag detection, flow tracing, WHY/HOW documentation
- Documentation Maintenance: link validation, verification against source, navigation index updates, health reporting

Use this skill when:

- Analyzing or documenting a codebase
- Generating architecture or module documentation
- Understanding how a system fits together
- Reviewing code and comment quality
THE DOCUMENTATION GENERATED BY THIS SKILL IS THE ABSOLUTE AND UNQUESTIONABLE SOURCE OF TRUTH FOR YOUR PROJECT.
ANY INFORMATION NOT VERIFIED WITH IRREFUTABLE EVIDENCE FROM SOURCE CODE IS FALSE, UNRELIABLE, AND UNACCEPTABLE.
╔══════════════════════════════════════════════════════════════════════════════╗
║ VERIFICATION TRUST MODEL ║
╠══════════════════════════════════════════════════════════════════════════════╣
║ Layer 1: TOOL-VALIDATED ║
║ └── Automated checks: file exists, line in range, AST symbol match ║
║ └── Marker: [VALIDATED: file.py:123 @ 2025-12-20] ║
║ ║
║ Layer 2: HUMAN-VERIFIED ║
║ └── Manual review: semantic correctness, behavior match ║
║ └── Marker: [VERIFIED: file.py:123 by @reviewer @ 2025-12-20] ║
║ ║
║ Layer 3: RUNTIME-CONFIRMED ║
║ └── Log/trace evidence of actual behavior ║
║ └── Marker: [CONFIRMED: trace_id=abc123 @ 2025-12-20] ║
╚══════════════════════════════════════════════════════════════════════════════╝
Tool validation catches STRUCTURAL issues (file moved, line shifted, symbol renamed).
Human verification ensures SEMANTIC correctness (code does what doc says).
Runtime confirmation proves BEHAVIORAL truth (system actually works this way).
ALL THREE LAYERS are required for critical documentation.
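As an illustration of Layer 1, a minimal tool-validation check might look like the sketch below. The marker format follows the box above; the repository layout and function name are assumptions, and the skill's own scripts perform the real checks (including AST symbol matching, which this sketch omits).

```python
import re
from pathlib import Path

# Matches markers such as: [VALIDATED: src/utils/circuit_breaker.py:123 @ 2025-12-20]
MARKER_RE = re.compile(r"\[VALIDATED:\s*(?P<path>[^:\]]+):(?P<line>\d+)\s*@\s*(?P<date>[\d-]+)\]")

def validate_marker(marker: str, repo_root: Path) -> list[str]:
    """Return structural problems found for a single VALIDATED marker."""
    problems = []
    match = MARKER_RE.search(marker)
    if not match:
        return ["marker does not match the [VALIDATED: file:line @ date] format"]
    target = repo_root / match.group("path")
    if not target.is_file():
        return [f"file not found: {match.group('path')}"]
    line_no = int(match.group("line"))
    line_count = sum(1 for _ in target.open(encoding="utf-8"))
    if line_no < 1 or line_no > line_count:
        problems.append(f"line {line_no} out of range (file has {line_count} lines)")
    return problems
```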
╔══════════════════════════════════════════════════════════════════════════════╗
║ DOCUMENTATION = f(SOURCE_CODE) + VERIFICATION ║
║ ║
║ If NOT verified_against_code(statement) → statement is FALSE ║
║ If NOT exists_in_codebase(reference) → reference is FABRICATED ║
║ If NOT traceable_to_source(claim) → claim is SPECULATION ║
╚══════════════════════════════════════════════════════════════════════════════╝
[UNVERIFIED - REQUIRES CODE CHECK]

| Documentation Type | Required Evidence |
|---|---|
| Enum/State values | Exact match with source code enum definition |
| Function behavior | Code path tracing, actual implementation reading |
| Constants/Timeouts | Variable definition in source with file:line |
| Message formats | Message class definition, field validation |
| Architecture claims | Import graph analysis, actual class relationships |
| Flow diagrams | Verified against runtime logs OR code path analysis |
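For example, the "Enum/State values" row can be checked mechanically. The sketch below uses Python's `ast` module to read the members of a class and compare them with what a document claims; the file path, class name, and member values are hypothetical placeholders, not references to real project code.

```python
import ast
from pathlib import Path

def enum_members(source_file: Path, class_name: str) -> set[str]:
    """Collect the member names assigned on a class (e.g., an Enum) in source_file."""
    tree = ast.parse(source_file.read_text(encoding="utf-8"))
    for node in ast.walk(tree):
        if isinstance(node, ast.ClassDef) and node.name == class_name:
            return {
                target.id
                for stmt in node.body
                if isinstance(stmt, ast.Assign)
                for target in stmt.targets
                if isinstance(target, ast.Name)
            }
    raise LookupError(f"{class_name} not found in {source_file}")

# Usage (file path, class name, and values are hypothetical):
#   documented = {"CLOSED", "OPEN", "HALF_OPEN"}
#   actual = enum_members(Path("src/utils/circuit_breaker.py"), "CircuitState")
#   assert documented == actual, f"doc drift: {documented ^ actual}"
```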
Every section of documentation MUST have one of these status markers:
- [VERIFIED: source_file.py:123] - Confirmed against source code
- [VERIFIED: trace_id=xyz] - Confirmed against runtime logs
- [UNVERIFIED] - Requires verification before trusting
- [DEPRECATED] - Code has changed, documentation outdated

UNVERIFIED documentation is UNTRUSTED documentation.
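A crude sweep for untrusted documentation can be automated. The sketch below only classifies whole files by the markers they contain; it assumes markers appear as literal bracketed tags in markdown and is not the `doc_review.py` implementation.

```python
import re
from pathlib import Path

STATUS_RE = re.compile(r"\[(VERIFIED|VALIDATED|CONFIRMED|UNVERIFIED|DEPRECATED)\b[^\]]*\]")

def marker_report(docs_root: Path) -> dict[str, str]:
    """Classify each markdown file by the weakest trust marker it contains."""
    report = {}
    for doc in sorted(docs_root.rglob("*.md")):
        markers = {m.group(1) for m in STATUS_RE.finditer(doc.read_text(encoding="utf-8"))}
        if not markers:
            report[str(doc)] = "NO MARKERS - untrusted by default"
        elif markers & {"UNVERIFIED", "DEPRECATED"}:
            report[str(doc)] = "contains UNVERIFIED/DEPRECATED sections"
        else:
            report[str(doc)] = "all sections carry trust markers"
    return report
```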
DOCUMENTATION DESCRIBES ONLY THE CURRENT STATE OF THE ART.
NO HISTORY. NO ARCHAEOLOGY. NO "WAS". ONLY "IS".
╔══════════════════════════════════════════════════════════════════════════════╗
║ THE TEMPORAL PURITY PRINCIPLE ║
╠══════════════════════════════════════════════════════════════════════════════╣
║ Documentation = PRESENT_TENSE(current_implementation) ║
║ ║
║ FORBIDDEN: ║
║ ✗ "was/were/previously/formerly/used to" ║
║ ✗ "deprecated since version X" → just REMOVE it ║
║ ✗ "changed from X to Y" → only describe Y ║
║ ✗ "in the old system..." → irrelevant, delete ║
║ ✗ inline changelogs → use CHANGELOG.md or git ║
║ ║
║ REQUIRED: ║
║ ✓ Present tense: "The system uses..." not "The system used..." ║
║ ✓ Current state only: Document what IS, not what WAS ║
║ ✓ Git for archaeology: History lives in version control, not docs ║
╚══════════════════════════════════════════════════════════════════════════════╝
The Rule:
When you find documentation containing historical language, DELETE IT. Git blame exists for archaeology. Documentation exists for the present.
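A rough screen for historical language can support this rule. The word list and layout below are assumptions, and the regex will produce false positives (e.g., "was" inside a quoted log line), so hits still need human review before deletion.

```python
import re
from pathlib import Path

# Phrases the Temporal Purity Principle forbids in documentation prose.
HISTORICAL_RE = re.compile(
    r"\b(was|were|previously|formerly|used to|deprecated since|changed from|in the old system)\b",
    re.IGNORECASE,
)

def archaeology_hits(doc_path: Path) -> list[tuple[int, str]]:
    """Return (line_number, line) pairs that contain historical phrasing."""
    hits = []
    for lineno, line in enumerate(doc_path.read_text(encoding="utf-8").splitlines(), start=1):
        if HISTORICAL_RE.search(line):
            hits.append((lineno, line.strip()))
    return hits
```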
Extract structure, dependencies, and usages for one file:
python .claude/skills/deep-dive-analysis/scripts/analyze_file.py \
--file src/utils/circuit_breaker.py \
--output-format markdown
Parameters:
- `--file` / `-f`: Relative path to file to analyze - REQUIRED
- `--output-format` / `-o`: Output format (json, markdown, summary) - default: summary
- `--find-usages` / `-u`: Also find all usages of exported symbols - default: false
- `--update-progress` / `-p`: Update analysis_progress.json - default: false

Output includes:
View analysis progress by phase:
python .claude/skills/deep-dive-analysis/scripts/check_progress.py \
--phase 1 \
--status pending
Parameters:
- `--phase` / `-p`: Filter by phase number (1-7)
- `--status` / `-s`: Filter by status (pending, analyzing, done, blocked)
- `--classification` / `-c`: Filter by classification (critical, high-complexity, standard, utility)
- `--verification-needed`: Show only files needing runtime verification

Find all usages of a symbol across the codebase:
python .claude/skills/deep-dive-analysis/scripts/analyze_file.py \
--symbol CircuitBreaker \
--file src/utils/circuit_breaker.py
Generate documentation for an entire phase:
python .claude/skills/deep-dive-analysis/scripts/analyze_file.py \
--phase 1 \
--output-format markdown \
--output-file docs/01_domains/COMMON_LIBRARY.md
Discover all documentation files and generate health report:
python .claude/skills/deep-dive-analysis/scripts/doc_review.py scan \
--path docs/ \
--output doc_health_report.json
Output includes:
Find all broken links in documentation:
python .claude/skills/deep-dive-analysis/scripts/doc_review.py validate-links \
--path docs/ \
--fix # Optional: auto-remove broken links
Actions:
- Checks relative markdown link targets (e.g., `](../path/to/file.md)`)
- `--fix`: removes or updates broken references

Verify documentation accuracy against actual source code:
python .claude/skills/deep-dive-analysis/scripts/doc_review.py verify \
--doc docs/agents/lifecycle.md \
--source src/agents/lifecycle.py
Verification includes:
Refresh SEARCH_INDEX.md and BY_DOMAIN.md with current file counts:
python .claude/skills/deep-dive-analysis/scripts/doc_review.py update-indexes \
--search-index docs/00_navigation/SEARCH_INDEX.md \
--by-domain docs/00_navigation/BY_DOMAIN.md
Updates:
Run complete Phase 8 workflow:
python .claude/skills/deep-dive-analysis/scripts/doc_review.py full-maintenance \
--path docs/ \
--auto-fix \
--output doc_health_report.json
Executes in order:
These commands analyze and rewrite code comments following the antirez commenting standards.
Analyze comments in a single file:
python .claude/skills/deep-dive-analysis/scripts/rewrite_comments.py analyze \
src/main.py \
--report
Options:
- `--report` / `-r`: Generate detailed markdown report
- `--json`: Output as JSON for programmatic use
- `--issues-only` / `-i`: Show only problematic comments

Output includes:
Analyze all Python files in a directory:
python .claude/skills/deep-dive-analysis/scripts/rewrite_comments.py scan \
src/ \
--recursive \
--issues-only
Options:
- `--recursive` / `-r`: Include subdirectories
- `--issues-only` / `-i`: Show only files with issues
- `--json`: Output as JSON

Create comprehensive markdown report for entire codebase:
python .claude/skills/deep-dive-analysis/scripts/rewrite_comments.py report \
src/ \
--output comment_health.md
Report includes:
Apply comment improvements to a file:
# Dry run (preview changes)
python .claude/skills/deep-dive-analysis/scripts/rewrite_comments.py rewrite \
src/main.py
# Apply changes with backup
python .claude/skills/deep-dive-analysis/scripts/rewrite_comments.py rewrite \
src/main.py \
--apply \
--backup
Options:
- `--apply` / `-a`: Actually modify the file (default: dry run)
- `--backup` / `-b`: Create .bak backup before modifying
- `--output` / `-o`: Write to different file instead of in-place

Actions taken:
Display the antirez commenting standards:
python .claude/skills/deep-dive-analysis/scripts/rewrite_comments.py standards
Shows the complete taxonomy of good vs bad comments with examples.
| Type | Category | Description | Action |
|---|---|---|---|
| function | GOOD | API docs at function/class top | Keep/Enhance |
| design | GOOD | File-level algorithm explanations | Keep |
| why | GOOD | Explains reasoning behind code | Keep |
| teacher | GOOD | Educates about domain concepts | Keep |
| checklist | GOOD | Reminds of coordinated changes | Keep |
| guide | GOOD | Section dividers, structure | Keep sparingly |
| trivial | BAD | Restates what code says | Delete |
| debt | BAD | TODO/FIXME without plan | Rewrite/Resolve |
| backup | BAD | Commented-out code | Delete |
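To make the taxonomy concrete, here is a small hand-written illustration. The function and its comments are invented for the example, not taken from the codebase.

```python
counter = 0

# BAD - "trivial": restates what the code already says. Delete.
# increment the counter
counter += 1

# BAD - "backup": commented-out code. Delete; git already remembers it.
# counter = counter * 2

# GOOD - "function": API docs at the top of the unit. Keep and enhance.
def fetch_with_retry(client, url, retries=1):
    """Fetch url, retrying up to `retries` times on transient errors.

    Returns the response, or re-raises the last error once all attempts
    are exhausted. Timeouts are the caller's responsibility via `client`.
    """
    last_error = None
    for _ in range(retries + 1):
        try:
            return client.get(url)
        except ConnectionError as exc:  # GOOD - "why": a bare retry hides
            last_error = exc            # transient connection-reuse failures
    raise last_error                    # without masking persistent outages.
```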
1. SCAN
├── Run: rewrite_comments.py scan <dir> --recursive
├── Review files with most issues
└── Generate: rewrite_comments.py report <dir> --output report.md
2. TRIAGE
├── Identify high-priority files (critical modules)
├── Focus on DEBT comments (convert to issues or design docs)
└── Plan bulk TRIVIAL/BACKUP deletions
3. REWRITE
├── Run: rewrite_comments.py rewrite <file> --apply --backup
├── Review changes in diff
└── Verify no functional changes
4. VERIFY
├── Run tests to confirm no breakage
├── Re-scan to confirm improvements
└── Update comment_health.md report
| Classification | Criteria | Verification |
|---|---|---|
| Critical | Handles authentication, security, encryption, sensitive data | Mandatory |
| High-Complexity | >300 LOC, >5 dependencies, state machines, async patterns | Mandatory |
| Standard | Normal business logic, data models, utilities | Recommended |
| Utility | Pure functions, helpers, constants | Optional |
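One way these rules could be expressed is sketched below. The thresholds for Critical and High-Complexity mirror the table; the keyword list and the Utility cutoff are assumptions, and the actual logic lives in `classifier.py` and may differ.

```python
# Path keywords treated as indicators of security-sensitive code (assumed list).
CRITICAL_MARKERS = ("auth", "security", "encrypt", "secret", "token", "password")

def classify(path: str, lines_of_code: int, num_dependencies: int,
             has_state_machine: bool = False, has_async: bool = False) -> str:
    """Assign a coarse classification using the criteria from the table above."""
    lowered = path.lower()
    if any(marker in lowered for marker in CRITICAL_MARKERS):
        return "critical"
    if lines_of_code > 300 or num_dependencies > 5 or has_state_machine or has_async:
        return "high-complexity"
    if lines_of_code < 50 and num_dependencies == 0:
        return "utility"
    return "standard"
```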
This skill leverages Claude's code comprehension capabilities for deep semantic analysis beyond mechanical structure extraction.
╔══════════════════════════════════════════════════════════════════════════════╗
║ STRUCTURE vs MEANING ║
╠══════════════════════════════════════════════════════════════════════════════╣
║ ║
║ Scripts extract STRUCTURE: "class Foo with method bar()" ║
║ Claude extracts MEANING: "Foo implements Repository pattern for ║
║ caching user sessions with TTL expiration" ║
║ ║
║ NEVER stop at structure. ALWAYS pursue understanding. ║
║ ║
╚══════════════════════════════════════════════════════════════════════════════╝
| Layer | What | Who Does It |
|---|---|---|
| 1. WHAT | Classes, functions, imports | Scripts (AST) |
| 2. HOW | Algorithm details, data flow | Claude's first pass |
| 3. WHY | Business purpose, design decisions | Claude's deep analysis |
| 4. WHEN | Triggers, lifecycle, concurrency | Claude's behavioral analysis |
| 5. CONSEQUENCES | Side effects, failure modes | Claude's systems thinking |
For every code unit, Claude must answer:
Identity:
Behavior:
Integration:
Quality:
Claude should actively recognize and document common patterns:
| Pattern Type | Examples | Documentation Focus |
|---|---|---|
| Architectural | Repository, Service, CQRS, Event-Driven | Responsibilities, boundaries |
| Behavioral | State Machine, Strategy, Observer, Chain | Transitions, variations |
| Resilience | Circuit Breaker, Retry, Bulkhead, Timeout | Thresholds, fallbacks |
| Data | DTO, Value Object, Aggregate | Invariants, relationships |
| Concurrency | Producer-Consumer, Worker Pool | Thread safety, backpressure |
See references/SEMANTIC_PATTERNS.md for detailed recognition guides.
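As a concrete instance of the Resilience row, a minimal Circuit Breaker might look like the sketch below. The thresholds, state handling, and clock source are illustrative; this is not the project's `circuit_breaker.py`. Per the table, the values worth documenting are exactly the threshold, the reset timeout, and the fallback behavior (here, raising).

```python
import time

class CircuitBreaker:
    """Open after `failure_threshold` consecutive failures; probe again after `reset_timeout` seconds."""

    def __init__(self, failure_threshold: int = 5, reset_timeout: float = 30.0):
        self.failure_threshold = failure_threshold
        self.reset_timeout = reset_timeout
        self.failures = 0
        self.opened_at = None  # None means the circuit is closed

    def call(self, func, *args, **kwargs):
        if self.opened_at is not None:
            if time.monotonic() - self.opened_at < self.reset_timeout:
                raise RuntimeError("circuit open: call rejected")
            self.opened_at = None  # half-open: allow one probe call through
        try:
            result = func(*args, **kwargs)
        except Exception:
            self.failures += 1
            if self.failures >= self.failure_threshold:
                self.opened_at = time.monotonic()
            raise
        self.failures = 0  # success closes the circuit and clears the count
        return result
```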
Claude should actively flag these issues:
ARCHITECTURE:
⚠ GOD CLASS: >10 public methods or >500 LOC
⚠ CIRCULAR DEPENDENCY: A → B → C → A
⚠ LEAKY ABSTRACTION: Implementation details in interface
RELIABILITY:
⚠ SWALLOWED EXCEPTION: Empty catch blocks
⚠ MISSING TIMEOUT: Network calls without timeout
⚠ RACE CONDITION: Shared mutable state without sync
SECURITY:
⚠ HARDCODED SECRET: Passwords, API keys in code
⚠ SQL INJECTION: String concatenation in queries
⚠ MISSING VALIDATION: Unsanitized user input
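Some of these flags can be pre-screened mechanically before Claude reviews them semantically. The sketch below finds swallowed exceptions (empty `except` bodies) with Python's `ast` module; it is a heuristic for illustration, not the skill's detector.

```python
import ast

def swallowed_exceptions(source: str) -> list[int]:
    """Return line numbers of except blocks whose body is only `pass` or `...`."""
    flagged = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.ExceptHandler):
            body_is_empty = all(
                isinstance(stmt, ast.Pass)
                or (isinstance(stmt, ast.Expr)
                    and isinstance(stmt.value, ast.Constant)
                    and stmt.value.value is Ellipsis)
                for stmt in node.body
            )
            if body_is_empty:
                flagged.append(node.lineno)
    return flagged

print(swallowed_exceptions("try:\n    risky()\nexcept Exception:\n    pass\n"))  # -> [3]
```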
Use templates/semantic_analysis.md for comprehensive per-file analysis that includes:
1. SCRIPTS RUN FIRST
├── classifier.py → File classification
├── ast_parser.py → Structure extraction
└── usage_finder.py → Cross-references
2. CLAUDE ANALYZES
├── Read actual source code
├── Apply semantic questions
├── Recognize patterns
├── Identify red flags
└── Trace flows
3. CLAUDE DOCUMENTS
├── Use semantic_analysis.md template
├── Explain WHY, not just WHAT
├── Document contracts and invariants
└── Flag concerns with severity
4. VERIFY
├── Check against runtime behavior
├── Validate with code traces
└── Mark verification status
- references/AI_ANALYSIS_METHODOLOGY.md - Complete analysis methodology
- references/SEMANTIC_PATTERNS.md - Pattern recognition guide
- templates/semantic_analysis.md - Per-file analysis template

When analyzing a file, follow this sequence:
1. CLASSIFY
├── Count lines of code
├── Count dependencies
├── Check for critical patterns (auth, security, encryption)
└── Assign classification
2. READ & MAP
├── Parse AST to extract structure
├── Identify classes and their methods
├── Identify standalone functions
├── Find global variables and constants
└── Detect state mutations
3. DEPENDENCY CHECK
├── Internal imports (from project modules)
├── External imports (third-party)
└── External calls (database, network, filesystem, messaging, ipc)
4. CONTEXT ANALYSIS
├── Where are exported symbols used?
├── What modules import this file?
└── What message types flow through here?
5. RUNTIME VERIFICATION (if Critical/High-Complexity)
├── Use log analysis to trace actual behavior
├── Verify documented flow matches actual flow
└── Note any discrepancies
6. DOCUMENTATION
├── Update analysis_progress.json
├── Generate module report section
└── Cross-reference with CONTEXT.md
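Steps 2 and 3 are largely mechanical. A simplified version of the structure and import extraction could look like the sketch below; the real extraction is performed by `ast_parser.py` / `analyze_file.py`, and this sketch only covers top-level symbols.

```python
import ast

def read_and_map(source: str) -> dict:
    """Extract top-level classes, functions, constants, and imports from Python source."""
    tree = ast.parse(source)
    structure = {"classes": [], "functions": [], "constants": [], "imports": []}
    for node in tree.body:
        if isinstance(node, ast.ClassDef):
            methods = [n.name for n in node.body
                       if isinstance(n, (ast.FunctionDef, ast.AsyncFunctionDef))]
            structure["classes"].append({"name": node.name, "methods": methods})
        elif isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            structure["functions"].append(node.name)
        elif isinstance(node, ast.Assign):
            # Treat UPPER_CASE top-level assignments as constants.
            structure["constants"].extend(
                t.id for t in node.targets if isinstance(t, ast.Name) and t.id.isupper()
            )
        elif isinstance(node, ast.Import):
            structure["imports"].extend(alias.name for alias in node.names)
        elif isinstance(node, ast.ImportFrom):
            structure["imports"].append(node.module or ".")
    return structure
```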
For runtime verification of critical/high-complexity files, use your project's log aggregation system.
The goal is to confirm that code paths match documented behavior through runtime evidence.
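What counts as runtime evidence depends on the project's logging. The sketch below assumes plain-text log lines carrying a trace id and a state-transition message; the log format, states, and file layout are entirely hypothetical.

```python
import re
from pathlib import Path

# Hypothetical log line: "2025-12-20T10:15:02 trace=abc123 circuit_breaker state CLOSED->OPEN"
TRANSITION_RE = re.compile(r"trace=(?P<trace>\S+).*state\s+(?P<src>\w+)->(?P<dst>\w+)")

def observed_transitions(log_file: Path, trace_id: str) -> list[tuple[str, str]]:
    """Collect (from_state, to_state) pairs seen for one trace id."""
    pairs = []
    for line in log_file.read_text(encoding="utf-8").splitlines():
        m = TRANSITION_RE.search(line)
        if m and m.group("trace") == trace_id:
            pairs.append((m.group("src"), m.group("dst")))
    return pairs

# If the documented flow claims CLOSED -> OPEN -> HALF_OPEN, the observed pairs should
# contain exactly those edges before the doc earns a [CONFIRMED: trace_id=...] marker.
```

The JSON output from analyze_file.py has this shape: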
{
"file": "src/utils/circuit_breaker.py",
"classification": "critical",
"metrics": {
"lines_of_code": 245,
"num_classes": 2,
"num_functions": 8,
"num_dependencies": 12
},
"structure": {
"classes": [...],
"functions": [...],
"constants": [...]
},
"dependencies": {
"internal": [...],
"external": [...],
"external_calls": [...]
},
"usages": [...],
"verification_required": true
}
The markdown output follows the template in templates/analysis_report.md and produces sections suitable for inclusion in phase deliverable documents.
- Use `--update-progress` when completing analysis
- Run `doc_review.py scan` to understand current state
- Generate `doc_health_report.json` as evidence of completion

When invoking Phase 8 documentation maintenance, follow this sequence:
1. PLANNING
├── Run: doc_review.py scan --path docs/
├── Review health report
├── Identify priority fixes (broken links, obsolete files)
└── Create todo list with specific actions
2. EXECUTION (in batches)
├── Batch 1: Fix broken links
│ └── Run: doc_review.py validate-links --fix
├── Batch 2: Verify critical docs against source
│ └── Run: doc_review.py verify --doc <file> --source <code>
├── Batch 3: Delete obsolete files
│ └── Manual review + deletion
├── Batch 4: Update navigation indexes
│ └── Run: doc_review.py update-indexes
└── Batch 5: Update timestamps
└── Set last_updated on verified files
3. VERIFICATION
├── Run: doc_review.py scan (confirm improvements)
├── Run: doc_review.py validate-links (confirm zero broken)
└── Generate final doc_health_report.json
scripts/ - Python analysis tools
- analyze_file.py - Source code analysis (Phases 1-7)
- check_progress.py - Progress tracking
- doc_review.py - Documentation maintenance (Phase 8)
- comment_rewriter.py - Comment analysis engine (antirez standards)
- rewrite_comments.py - Comment quality CLI tool

templates/ - Output templates
- analysis_report.md - Module-level report template
- semantic_analysis.md - AI-powered per-file analysis template

references/ - Analysis methodology docs
- DEEP_DIVE_PLAN.md - Master analysis plan with all phase definitions
- ANTIREZ_COMMENTING_STANDARDS.md - Complete antirez comment taxonomy
- AI_ANALYSIS_METHODOLOGY.md - AI semantic analysis methodology
- SEMANTIC_PATTERNS.md - Pattern recognition guide for Claude