From agent-almanac
Finds and fixes broken internal links, dead external URLs, stale imports, missing cross-references, and orphaned files. Use when documentation or code references are invalid or outdated.
npx claudepluginhub pjt222/agent-almanacThis skill uses the workspace's default tool permissions.
---
Validates and fixes broken links in .memory_bank/ files using Python script. Checks index links and cross-references; useful after regenerations or when integrity issues arise.
Validates Markdown documentation links, detecting broken relative links, missing anchor targets, malformed URLs, and orphaned documentation files. Use before releases or doc updates.
Detects documentation drift, stale references, and cross-document inconsistencies in projects. Scans code-doc mismatches, broken links, outdated versions, and git staleness.
Share bugs, ideas, or general feedback.
Use this skill when project references have become stale:
Do NOT use for refactoring module dependencies or redesigning information architecture. This skill repairs existing references, not restructures them.
| Parameter | Type | Required | Description |
|---|---|---|---|
project_path | string | Yes | Absolute path to project root |
check_external | boolean | No | Verify external URLs (default: true, slow) |
fix_mode | enum | No | auto (fix obvious), report (document only), interactive (prompt) |
orphan_threshold | integer | No | Days since last modified to flag as orphan (default: 180) |
Find all markdown links pointing to non-existent files.
# Find all markdown files
find . -name "*.md" -type f > markdown_files.txt
# Extract all markdown links: [text](path)
grep -oP '\[.*?\]\(\K[^)]+' *.md | sort | uniq > all_links.txt
# For each link:
while read link; do
# Skip external URLs (http/https)
if [[ "$link" =~ ^https?:// ]]; then
continue
fi
# Resolve relative path
target=$(realpath -m "$link")
# Check if target exists
if [ ! -e "$target" ]; then
echo "BROKEN: $link (referenced in $file)" >> broken_internal.txt
fi
done < all_links.txt
Expected: broken_internal.txt lists all broken internal references
On failure: If realpath unavailable, manually check each link
Verify that external links are still accessible (HTTP 200 response).
# Extract external URLs
grep -ohP 'https?://[^\s\)]+' *.md | sort | uniq > external_urls.txt
# Check each URL (rate-limit to avoid bans)
while read url; do
status=$(curl -o /dev/null -s -w "%{http_code}" "$url")
if [ "$status" -ge 400 ]; then
echo "DEAD ($status): $url" >> dead_urls.txt
fi
sleep 0.5 # Rate limit
done < external_urls.txt
Expected: dead_urls.txt lists URLs returning 4xx/5xx errors
On failure: If curl unavailable or blocked, use online link checker or skip
Note: Some URLs may return 403 due to bot detection but work in browsers. Manual review required.
Check that all import/require statements reference existing modules.
JavaScript/TypeScript:
# Find all import statements
grep -rh "^import.*from ['\"]" . | sed -E "s/.*from ['\"]([^'\"]+)['\"].*/\1/" > imports.txt
# For each import:
while read import; do
# Skip node_modules and external packages
if [[ "$import" =~ ^[./] ]]; then
# Resolve to file path
target="${import}.js" # Try .js, .ts, .jsx, .tsx
if [ ! -e "$target" ]; then
echo "BROKEN IMPORT: $import" >> broken_imports.txt
fi
fi
done < imports.txt
Python:
# Find all import statements
grep -rh "^from .* import\|^import " . --include="*.py" | \
sed -E "s/from ([^ ]+) import.*/\1/" | \
sed -E "s/import ([^ ]+)/\1/" > imports.txt
# For each local import (starts with .)
# Check if module file exists
R:
# Find library() and source() calls
grep -rh "library(\\|source(" . --include="*.R" | \
sed -E 's/.*library\("([^"]+)"\).*/\1/' > packages.txt
# For source() calls, check if file exists
# For library() calls, check if package installed
Rscript -e "installed.packages()[,'Package']" > installed_packages.txt
Expected: broken_imports.txt lists all references to deleted/moved modules
On failure: If language-specific tool unavailable, manually review recent refactoring commits
Identify files that exist but are never referenced anywhere.
# Find all code files
find . -type f \( -name "*.js" -o -name "*.py" -o -name "*.R" \) > all_files.txt
# For each file:
while read file; do
basename=$(basename "$file")
# Search for references (import, require, source, href, link)
refs=$(grep -r "$basename" . --exclude-dir=node_modules --exclude-dir=.git | wc -l)
# If only 1 reference (itself):
if [ "$refs" -le 1 ]; then
# Check last modified date
last_mod=$(git log -1 --format="%ci" "$file")
# If modified more than orphan_threshold days ago
# Flag as potential orphan
echo "ORPHAN: $file (last modified: $last_mod)" >> orphans.txt
fi
done < all_files.txt
Expected: orphans.txt lists files not referenced elsewhere
On failure: If git log fails, use filesystem mtime instead
Note: Some files (e.g., CLI entry points, top-level scripts) are legitimately unreferenced but not orphans. Requires manual review.
Repair broken internal references using one of three strategies:
Strategy 1: Find Moved Files
# For each broken link, search for file by name
while read broken_link; do
filename=$(basename "$broken_link")
# Search for file in project
found=$(find . -name "$filename" | head -1)
if [ -n "$found" ]; then
# Update link to new path
old_path="$broken_link"
new_path="$found"
# Use Edit tool to replace in all markdown files
echo "FIX: $old_path -> $new_path"
fi
done < broken_internal.txt
Strategy 2: Create Redirect Stub
# If file was deleted intentionally, create redirect stub
echo "# Moved" > "$broken_link"
echo "This content moved to [new location](new_path.md)" >> "$broken_link"
Strategy 3: Remove Dead Link
# If content no longer exists, remove link (keep text)
# Replace [text](broken_link) with text (plain)
Expected: All broken internal links either fixed, redirected, or removed
On failure: If automated fix breaks context, escalate for manual review
Update import statements to reference correct paths after moves.
JavaScript Example:
// Before (broken)
import { helper } from './utils/helper';
// After (fixed — file moved to lib/)
import { helper } from './lib/helper';
For each broken import:
Expected: All imports resolve correctly; no module-not-found errors
On failure: If module was truly deleted, escalate to determine if functionality still needed
For files flagged as orphans, determine disposition:
# Orphaned Files Review
| File | Last Modified | Recommendation | Reason |
|------|---------------|----------------|--------|
| scripts/old_deploy.sh | 2024-01-05 | Archive | Replaced by CI/CD |
| src/legacy_api.js | 2023-06-12 | Delete | API v1 fully deprecated |
| bin/cli.py | 2025-12-01 | Keep | CLI entry point (unreferenced by design) |
Expected: Orphan review document created; automated decisions flagged for human approval
On failure: (N/A — document even if no clear disposition)
Summarize all broken references and fixes applied.
# Reference Repair Report
**Date**: YYYY-MM-DD
**Project**: <project_name>
**Fix Mode**: auto | report | interactive
## Broken Internal Links
- Total: X
- Fixed: Y
- Redirected: Z
- Escalated: W
Details:
- [file.md](file.md) line 45: Fixed broken link to moved doc
- [another.md](another.md) line 12: Created redirect stub
## Dead External URLs
- Total: X
- Fixed (wayback machine): Y
- Removed: Z
Details:
- https://example.com/old-page (404) → Removed
- https://api.old.com/docs (gone) → Replaced with new docs
## Broken Imports
- Total: X
- Fixed: Y
- Escalated: Z
Details:
- src/main.js line 3: Updated import path after refactor
## Orphaned Files
- Total: X
- Kept: Y
- Archived: Z
- Escalated for review: W
See ORPHAN_REVIEW.md for full analysis.
## Validation
- [x] All tests pass after fixes
- [x] Linter reports no module-not-found errors
- [x] Dead links documented in report
Expected: Report saved to REFERENCE_REPAIR_REPORT.md
On failure: (N/A — generate report regardless)
After repairs:
git mv for any moves)Automatic URL Fixes Break Context: Replacing dead links with web.archive.org URLs may not be what the author intended. Some links are better removed.
Over-Aggressive Orphan Deletion: Entry points, CLI scripts, and templates are often unreferenced by design. Don't delete without review.
Import Path Assumptions: Assuming all relative imports use the same base path. Different module systems (CommonJS, ES6, TypeScript) handle paths differently.
External URL False Positives: Some sites block curl/bots but work fine in browsers. Always manually verify dead URLs.
Circular Reference Traps: File A imports B, B imports A. Updating one breaks the other. Requires simultaneous fix.
Ignoring Fragment Identifiers: Fixing [link](#section) requires checking if #section anchor exists, not just if file exists.
Wrong R binary on hybrid systems: On WSL or Docker, Rscript may resolve to a cross-platform wrapper instead of native R. Check with which Rscript && Rscript --version. Prefer the native R binary (e.g., /usr/local/bin/Rscript on Linux/WSL) for reliability. See Setting Up Your Environment for R path configuration.