code-graph-mcp
A high-performance code knowledge graph server implementing the Model Context Protocol (MCP). Indexes codebases into a structured AST knowledge graph with semantic search, call graph traversal, and HTTP route tracing — designed to give AI coding assistants deep, structured understanding of your code.
Features
- Multi-language parsing — Tree-sitter AST extraction for 16 languages: TypeScript, JavaScript, Go, Python, Rust, Java, C, C++, C#, Kotlin, Ruby, PHP, Swift, Dart, HTML, CSS
- Semantic code search — Hybrid BM25 full-text + vector semantic search with Reciprocal Rank Fusion (RRF), powered by sqlite-vec
- Call graph traversal — Recursive CTE queries to trace callers/callees with cycle detection
- HTTP route tracing — Map route paths to backend handler functions (Express, Flask/FastAPI, Go, ASP.NET, Rails, Laravel, Vapor)
- Dead code detection — Find unreferenced symbols with smart Orphan/Exported-Unused classification
- Impact analysis — Determine the blast radius of code changes by tracing all dependents
- Incremental indexing — Merkle tree change detection with file system watcher for real-time updates. Smart event filtering skips metadata-only changes (chmod, xattr)
- Context compression — Token-aware snippet extraction for LLM context windows (L0→full code, L1→summaries, L2→file groups, L3→directory overview). Compact JSON output saves 15-20% tokens
- Embedding model — Optional local embedding via Candle (feature-gated `embed-model`). Context is reordered to prioritize structural relations over code for better embedding quality
- Self-healing — Automatic SQLite corruption recovery with rebuild. Startup repair for incomplete indexing (Phase 3 failures)
- MCP protocol — JSON-RPC 2.0 over stdio, plug-and-play with Claude Code, Cursor, Windsurf, and other MCP clients
- Claude Code Plugin — First-class plugin with slash commands (`/understand`, `/trace`, `/impact`), agents, skills, auto-indexing hooks, StatusLine integration, and self-updating
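To make the call graph traversal feature concrete, here is a minimal sketch of a recursive CTE that walks callees and stops at cycles. The `calls` table, its columns, and the `callees` helper are assumptions made for illustration; they are not the project's actual schema or API.

```python
import sqlite3

# Hypothetical schema: one row per call edge between named symbols.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE calls (caller TEXT NOT NULL, callee TEXT NOT NULL)")
conn.executemany(
    "INSERT INTO calls VALUES (?, ?)",
    [
        ("handle_login", "authenticate_session"),
        ("authenticate_session", "load_user"),
        ("load_user", "query_db"),
        ("query_db", "handle_login"),  # deliberate cycle back to the root
    ],
)


def callees(conn, root, max_depth=10):
    """Return (symbol, depth) for every symbol reachable from root."""
    sql = """
    WITH RECURSIVE walk(name, depth, path) AS (
        SELECT ?, 0, '/' || ? || '/'
        UNION ALL
        SELECT c.callee, w.depth + 1, w.path || c.callee || '/'
        FROM calls AS c
        JOIN walk AS w ON c.caller = w.name
        WHERE w.depth < ?
          -- cycle detection: skip symbols already on the current path
          AND instr(w.path, '/' || c.callee || '/') = 0
    )
    SELECT name, MIN(depth) AS d
    FROM walk
    WHERE depth > 0
    GROUP BY name
    ORDER BY d, name
    """
    return conn.execute(sql, (root, root, max_depth)).fetchall()


print(callees(conn, "handle_login"))
```

The path column records the symbols visited so far, so the edge from `query_db` back to `handle_login` is dropped instead of looping forever. Swapping `caller` and `callee` in the join would trace callers instead.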
Why code-graph-mcp?
Unlike naive full-text search or simple AST dumps, code-graph-mcp builds a structured knowledge graph that understands the relationships between symbols across your entire codebase.
Incremental by Design
BLAKE3 Merkle tree tracks every file's content hash. On re-index, only changed files are re-parsed — unchanged directory subtrees are skipped entirely via mtime cache. When a function signature changes, dirty propagation automatically regenerates context for all downstream callers across files.
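The idea of skipping unchanged subtrees can be sketched as follows. This is an illustration only: it uses `hashlib.blake2b` from the standard library as a stand-in for BLAKE3, models the file tree as nested dictionaries, and ignores the mtime cache, deletions, and dirty propagation. The function names are hypothetical.

```python
import hashlib


def file_hash(content: bytes) -> str:
    # Stand-in hash: blake2b from the standard library, not BLAKE3.
    return hashlib.blake2b(content, digest_size=16).hexdigest()


def build_tree(node):
    """Build a Merkle tree from bytes (a file) or a dict of name -> node (a directory).

    Returns (hash, children), where children is None for files.
    """
    if isinstance(node, bytes):
        return (file_hash(node), None)
    children = {name: build_tree(child) for name, child in sorted(node.items())}
    h = hashlib.blake2b(digest_size=16)
    for name, (child_hash, _) in children.items():
        # A directory hash covers the names and hashes of its children.
        h.update(name.encode() + b"\0" + child_hash.encode() + b"\n")
    return (h.hexdigest(), children)


def changed_files(old, new, prefix=""):
    """Yield paths of files that were added or modified in new relative to old."""
    if old is not None and old[0] == new[0]:
        return  # identical hash: the whole subtree is unchanged, skip it
    new_children = new[1]
    if new_children is None:
        yield prefix.rstrip("/")
        return
    old_children = old[1] if old is not None and old[1] is not None else {}
    for name, child in new_children.items():
        yield from changed_files(old_children.get(name), child, prefix + name + "/")


before = {"src": {"a.rs": b"fn a() {}", "b.rs": b"fn b() {}"}, "README.md": b"hi"}
after = {"src": {"a.rs": b"fn a() { b() }", "b.rs": b"fn b() {}"}, "README.md": b"hi"}
print(list(changed_files(build_tree(before), build_tree(after))))
```

Because a directory's hash depends on every hash beneath it, comparing two root hashes is enough to detect the no-change case without visiting any file.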
Hybrid Search, Not Just Grep
Combines BM25 full-text ranking (FTS5) with vector semantic similarity (sqlite-vec) via Reciprocal Rank Fusion (RRF) with raw score blending — so searching "handle user login" finds the right function even if it's named authenticate_session. Results are auto-compressed to fit LLM context windows.
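For readers unfamiliar with Reciprocal Rank Fusion, the sketch below shows the basic formula: each result contributes `1 / (k + rank)` per ranking it appears in. The constant `k = 60` is the value commonly used in the RRF literature and is an assumption here, as are the function name and sample symbol names. The raw score blending mentioned above is not modeled.

```python
def rrf_fuse(rankings, k=60):
    """Fuse several ranked lists of ids into one list, best first."""
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Sort by descending fused score; break ties by id for determinism.
    return sorted(scores, key=lambda d: (-scores[d], d))


bm25_hits = ["login_handler", "authenticate_session", "logout"]
vector_hits = ["authenticate_session", "check_token", "login_handler"]
print(rrf_fuse([bm25_hits, vector_hits]))
```

In this example `authenticate_session` ranks first because it places well in both lists, even though neither list agrees on the overall order.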
Scope-Aware Relation Extraction
The parser doesn't just find function calls — it tracks them within their proper scope context. Extracts calls, imports, inheritance, interface implementations, exports, and HTTP route bindings. Same-file targets are preferred over cross-file matches to minimize false-positive edges.
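The same-file preference can be illustrated with a small resolver. This is a simplified sketch, not the project's resolution logic: the `Symbol` type and `resolve_call` function are hypothetical, and real scope handling involves more than a file comparison.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Symbol:
    name: str
    file: str


def resolve_call(callee_name, caller_file, symbols):
    """Return candidate targets for a call, preferring definitions in the caller's file."""
    candidates = [s for s in symbols if s.name == callee_name]
    same_file = [s for s in candidates if s.file == caller_file]
    # Fall back to cross-file matches only when no same-file definition exists.
    return same_file or candidates


symbols = [
    Symbol("validate", "auth.py"),
    Symbol("validate", "forms.py"),
    Symbol("render", "views.py"),
]
print(resolve_call("validate", "auth.py", symbols))
```

Without the preference, a call to `validate` inside `auth.py` would produce two edges, one of which is a false positive.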
HTTP Request Flow Tracing
Unique to code-graph-mcp: trace from GET /api/users → route handler → service layer → database call in a single query. Supports the frameworks listed under Features: Express, Flask/FastAPI, Go, ASP.NET, Rails, Laravel, and Vapor.
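Conceptually, a route trace is a route lookup followed by a walk over call edges. The sketch below assumes an in-memory route list and call map with made-up handler names, and treats `:name` path segments as wildcards. It does not reflect how code-graph-mcp stores or queries this data.

```python
def match_route(routes, method, path):
    """Return the handler for the first matching route, or None."""
    parts = path.strip("/").split("/")
    for route_method, pattern, handler in routes:
        pattern_parts = pattern.strip("/").split("/")
        if route_method != method or len(pattern_parts) != len(parts):
            continue
        # A ':name' segment matches any value; other segments must match exactly.
        if all(p.startswith(":") or p == q for p, q in zip(pattern_parts, parts)):
            return handler
    return None


def trace(routes, calls, method, path):
    """Follow call edges breadth-first from the matched handler."""
    handler = match_route(routes, method, path)
    if handler is None:
        return []
    order, queue, seen = [], [handler], {handler}
    while queue:
        current = queue.pop(0)
        order.append(current)
        for callee in calls.get(current, []):
            if callee not in seen:  # guard against cycles
                seen.add(callee)
                queue.append(callee)
    return order


routes = [
    ("GET", "/api/users", "list_users"),
    ("GET", "/api/users/:id", "get_user"),
]
calls = {
    "list_users": ["UserService.list"],
    "UserService.list": ["db.query"],
    "get_user": ["UserService.get"],
}
print(trace(routes, calls, "GET", "/api/users"))
```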
Zero External Dependencies at Runtime
Single binary, embedded SQLite, bundled sqlite-vec extension, optional local embedding model via Candle — no database server, no cloud API, no Docker required. Runs entirely on your machine.
Built for AI Assistants
Every design decision — from token-aware compression to node_id-based snippet expansion — is optimized for LLM context windows. Works out of the box with Claude Code, Cursor, Windsurf, and any MCP-compatible client.
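One way to picture token-aware compression is as choosing the most detailed level (L0 full code through L3 directory overview) that fits a token budget. The sketch below is an assumption-heavy illustration: the four-characters-per-token estimate is a rough heuristic rather than the project's tokenizer, and `pick_level` is a hypothetical helper.

```python
def estimate_tokens(text):
    # Rough heuristic (about 4 characters per token); not a real tokenizer.
    return max(1, len(text) // 4)


def pick_level(renderings, budget):
    """Pick the most detailed rendering that fits the budget.

    renderings is ordered from L0 (full code) to L3 (directory overview).
    Returns (level_index, text); falls back to the coarsest level if nothing fits.
    """
    for level, text in enumerate(renderings):
        if estimate_tokens(text) <= budget:
            return level, text
    return len(renderings) - 1, renderings[-1]


renderings = [
    "x" * 4000,  # L0: full code
    "y" * 400,   # L1: summaries
    "z" * 80,    # L2: file groups
    "w" * 20,    # L3: directory overview
]
level, _ = pick_level(renderings, budget=150)
print(f"selected L{level}")
```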
Performance
| Metric | Value |
|---|---|
| Indexing speed | 300+ files/second (single-threaded, release build) |
| Incremental re-index | <250ms no-change detection via BLAKE3 Merkle tree |
| FTS search P50 / P99 | <300 µs / <1 ms |
| Database overhead | ~3.5MB per 800 nodes |
| Token savings | 5-20x fewer tokens per code understanding task vs grep+read |
Run `code-graph-mcp benchmark` on your own project to measure these numbers yourself.
Efficiency: code-graph vs Traditional Tools
Real-world benchmarks comparing code-graph-mcp tools against traditional approaches (Grep + Read + Glob) on a 33-file Rust project (~537 AST nodes).
Tool Call Reduction