---
name: graphify-knowledge-graph
description: Build queryable knowledge graphs from code, docs, papers, and images using AI coding assistant skills
triggers:
- "graphify my codebase"
- "build a knowledge graph"
- "turn my files into a graph"
- "understand this codebase with graphify"
- "run graphify on this folder"
- "query the knowledge graph"
- "install graphify skill"
- "extract relationships from my code"
---
# graphify-knowledge-graph
> Skill by [ara.so](https://ara.so) — Daily 2026 Skills collection.
graphify turns any folder of code, docs, papers, or images into a queryable knowledge graph. It runs as an AI coding assistant skill — type `/graphify` in Claude Code, Codex, OpenCode, or OpenClaw to extract structure, relationships, and design rationale from your files into an interactive graph you can navigate and query without re-reading raw files.
---
## Install
```bash
pip install graphifyy && graphify install
```

The PyPI package is `graphifyy`; the CLI and skill command remain `graphify`.

```bash
graphify install                      # Claude Code (default)
graphify install --platform codex     # Codex
graphify install --platform opencode  # OpenCode
graphify install --platform claw      # OpenClaw
```
Run once per project so your assistant consults the graph before searching files:

```bash
graphify claude install     # writes CLAUDE.md section + PreToolUse hook (Claude Code)
graphify codex install      # writes AGENTS.md (Codex)
graphify opencode install   # writes AGENTS.md (OpenCode)
graphify claw install       # writes AGENTS.md (OpenClaw)
```

Undo with the matching uninstall command:

```bash
graphify claude uninstall
```
Or install the skill file manually:

```bash
mkdir -p ~/.claude/skills/graphify
curl -fsSL https://raw.githubusercontent.com/safishamsi/graphify/v3/graphify/skill.md \
  > ~/.claude/skills/graphify/SKILL.md
```

Add to `~/.claude/CLAUDE.md`:

```markdown
- **graphify** (`~/.claude/skills/graphify/SKILL.md`) - any input to knowledge graph. Trigger: `/graphify`
```

When the user types `/graphify`, invoke the Skill tool with `skill: "graphify"` before doing anything else.
## Usage

```bash
# In your AI coding assistant
/graphify .                   # current directory
/graphify ./src               # specific folder
/graphify ./raw --mode deep   # aggressive INFERRED edge extraction
/graphify ./raw --no-viz      # skip HTML, produce report + JSON only
```

Output lands in `graphify-out/`:

```
graphify-out/
├── graph.html        # interactive — click nodes, search, filter by community
├── GRAPH_REPORT.md   # god nodes, surprising connections, suggested questions
├── graph.json        # persistent graph — query later without re-reading files
└── cache/            # SHA256 cache — re-runs only process changed files
```

Query the graph:

```bash
/graphify query "what connects attention to the optimizer?"
/graphify query "what connects attention to the optimizer?" --dfs          # trace a path
/graphify query "what connects attention to the optimizer?" --budget 1500  # cap tokens
/graphify path "DigestAuth" "Response"   # shortest path between two nodes
/graphify explain "SwinTransformer"      # expand a single node
```
| Command | What it does |
|---|---|
| `/graphify .` | Build graph from current directory |
| `/graphify ./folder` | Build from a specific folder |
| `/graphify ./folder --mode deep` | More aggressive INFERRED edge extraction |
| `/graphify ./folder --update` | Re-extract only changed files, merge into existing graph |
| `/graphify ./folder --cluster-only` | Rerun clustering without re-extraction |
| `/graphify ./folder --watch` | Auto-sync as files change (code: instant AST; docs: notifies you) |
```bash
/graphify add https://arxiv.org/abs/1706.03762         # fetch a paper, add to graph
/graphify add https://x.com/karpathy/status/...        # fetch a tweet
/graphify add https://... --author "Andrej Karpathy"   # tag original author
/graphify add https://... --contributor "Your Name"    # tag who added it
```

```bash
/graphify query "why does the auth layer depend on redis?"
/graphify query "what implements the retry protocol?" --dfs
/graphify path "Transformer" "AdamW"
/graphify explain "DigestAuth"
```

```bash
/graphify ./folder --svg        # export graph.svg
/graphify ./folder --graphml    # export graph.graphml (Gephi, yEd)
/graphify ./folder --neo4j      # generate cypher.txt for Neo4j import
/graphify ./folder --neo4j-push bolt://localhost:7687   # push to live Neo4j
/graphify ./folder --obsidian   # generate Obsidian vault (opt-in)
/graphify ./folder --wiki       # build agent-crawlable wiki (index.md + per-community articles)
/graphify ./folder --mcp        # start MCP stdio server
```

```bash
graphify hook install     # post-commit + post-checkout: auto-rebuild on commit/branch switch
graphify hook uninstall
graphify hook status
```
| Type | Extensions | Extraction method |
|---|---|---|
| Code | .py .ts .js .go .rs .java .c .cpp .rb .cs .kt .scala .php .swift .lua | AST via tree-sitter + call graph + docstring/comment rationale (no LLM) |
| Docs | .md .txt .rst | Concepts + relationships + design rationale via Claude |
| Papers | .pdf | Citation mining + concept extraction |
| Images | .png .jpg .webp .gif | Claude vision — screenshots, diagrams, any language |
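The code row above can be sketched with Python's stdlib `ast` module standing in for tree-sitter (graphify's actual extractor covers many languages; this illustrative snippet only parses Python):

```python
import ast

# Toy source standing in for a real file. The WHY: docstring is the kind of
# rationale graphify extracts; the call graph comes from the AST alone.
src = '''
def fetch(url):
    """WHY: retries mask flaky upstream DNS."""
    return retry(url)

def retry(url):
    return url
'''

tree = ast.parse(src)
edges = []
for fn in [n for n in ast.walk(tree) if isinstance(n, ast.FunctionDef)]:
    for call in [c for c in ast.walk(fn) if isinstance(c, ast.Call)]:
        if isinstance(call.func, ast.Name):
            edges.append((fn.name, call.func.id))   # caller -> callee

print(edges)                            # [('fetch', 'retry')]
print(ast.get_docstring(tree.body[0]))  # rationale text for fetch()
```

No LLM is involved for this file type: both the call edge and the rationale comment fall out of the parse tree directly.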
graphify runs in two passes: per-file extraction (using the per-type methods in the table above), then assembly. Results are merged into a NetworkX graph, clustered with Leiden community detection (topology-based — no embeddings or vector database), and exported as HTML, JSON, and a plain-language audit report.
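graphify uses Leiden; the closely related Louvain method, which ships with NetworkX, illustrates the same topology-only idea on a toy graph (node names here are invented for the example):

```python
import networkx as nx
from networkx.algorithms.community import louvain_communities

# Two tight triangles joined by one bridge edge. Communities come from
# edge structure alone: no embeddings, no vector database.
G = nx.Graph()
G.add_edges_from([
    ("Attention", "Transformer"), ("Transformer", "Encoder"),
    ("Encoder", "Attention"),                 # tight cluster 1
    ("AdamW", "Optimizer"), ("Optimizer", "LRSchedule"),
    ("LRSchedule", "AdamW"),                  # tight cluster 2
    ("Transformer", "Optimizer"),             # single bridge edge
])

communities = louvain_communities(G, seed=42)
for i, members in enumerate(communities):
    print(f"community {i}: {sorted(members)}")
```

On this graph the two triangles come back as two communities, with the bridge edge crossing between them.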
Every relationship is tagged so you always know what was found vs guessed:

| Tag | Meaning | Confidence |
|---|---|---|
| `EXTRACTED` | Found directly in source | Always 1.0 |
| `INFERRED` | Reasonable inference | 0.0–1.0 score |
| `AMBIGUOUS` | Flagged for human review | — |
graphify is primarily a CLI/skill tool, but the graph output (`graph.json`) is standard NetworkX JSON you can load and traverse:

```python
import json

import networkx as nx

# Load the persistent graph
with open("graphify-out/graph.json") as f:
    data = json.load(f)
G = nx.node_link_graph(data)

# Find god nodes (highest degree)
god_nodes = sorted(G.degree(), key=lambda x: x[1], reverse=True)[:10]
for node, degree in god_nodes:
    print(f"{node}: {degree} connections")

# Find all EXTRACTED edges (high confidence, found in source)
extracted_edges = [
    (u, v, d) for u, v, d in G.edges(data=True)
    if d.get("provenance") == "EXTRACTED"
]

# Find INFERRED edges above a confidence threshold
high_confidence_inferred = [
    (u, v, d) for u, v, d in G.edges(data=True)
    if d.get("provenance") == "INFERRED" and d.get("confidence_score", 0) > 0.85
]

# Shortest path between two concepts
try:
    path = nx.shortest_path(G, source="DigestAuth", target="Response")
    print(" -> ".join(path))
except nx.NetworkXNoPath:
    print("No path found")

# Get all nodes in a community
communities = {}
for node, attrs in G.nodes(data=True):
    community_id = attrs.get("community")
    if community_id is not None:
        communities.setdefault(community_id, []).append(node)
for cid, members in sorted(communities.items()):
    print(f"Community {cid}: {', '.join(members[:5])}{'...' if len(members) > 5 else ''}")
```
graphify extracts `# NOTE:`, `# IMPORTANT:`, `# HACK:`, and `# WHY:` comments and docstrings as `rationale_for` nodes:
```python
# Find all rationale nodes and what they explain
rationale_nodes = [
    (node, data) for node, data in G.nodes(data=True)
    if data.get("node_type") == "rationale_for"
]
for node, data in rationale_nodes:
    print(f"Rationale: {data.get('label')}")
    # Find what this rationale is connected to
    neighbors = list(G.neighbors(node))
    print(f"  Explains: {neighbors}")
```
```python
# Find cross-file semantic links (concepts connected without structural relationship)
semantic_edges = [
    (u, v, d) for u, v, d in G.edges(data=True)
    if d.get("relation") == "semantically_similar_to"
]
for u, v, data in semantic_edges:
    score = data.get("confidence_score", 0)
    print(f"{u} ~ {v} (confidence: {score:.2f})")
```
```bash
# Install graphify, build the graph, read the report
pip install graphifyy && graphify install

# In Claude Code
/graphify .

# Read the output — god nodes tell you what everything routes through
cat graphify-out/GRAPH_REPORT.md
```
Drop code, PDFs, screenshots, and notes in one folder (here, `./raw`):
```
raw/
├── attention_is_all_you_need.pdf
├── training_notes.md
├── whiteboard_photo.png
├── nanoGPT/
│   └── model.py
└── tweet_screenshot.jpg
```

```bash
/graphify ./raw
```
graphify uses Claude vision on images, citation mining on PDFs, AST on code, and semantic extraction on markdown — all merged into one graph.
```bash
# First full build
/graphify ./src

# After making changes — only re-processes changed files via SHA256 cache
/graphify ./src --update

# After a major refactor — rerun clustering without re-extracting
/graphify ./src --cluster-only
```

```bash
# Terminal 1: keep graph in sync as you code
/graphify ./src --watch

# Terminal 2: your normal development
# Code saves → instant AST rebuild
# Doc/image saves → graphify notifies you to run --update for LLM re-pass
```
```bash
# Neo4j (generate Cypher, then push)
/graphify ./src --neo4j
# cypher.txt is written to graphify-out/
/graphify ./src --neo4j-push bolt://localhost:7687

# Gephi / yEd
/graphify ./src --graphml

# Obsidian vault
/graphify ./src --obsidian

# Agent-crawlable wiki
/graphify ./src --wiki
# graphify-out/wiki/index.md is the entry point
```

```bash
# Auto-rebuild graph on every commit and branch switch
graphify hook install

# Verify hooks are active
graphify hook status
```

```bash
# Start an MCP stdio server so any MCP-compatible client can query the graph
/graphify ./src --mcp
```
The report has four sections:
## God Nodes
Highest-degree concepts — what everything connects through.
These are your architectural load-bearing walls.
## Surprising Connections
Cross-domain edges ranked by composite score.
Code-paper edges rank higher than code-code.
Each result includes a plain-English why.
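The exact composite score is internal to graphify; a hypothetical sketch of the ranking idea follows. The field names (`src_type`, `dst_type`) and the 2.0 cross-domain bonus are invented for illustration:

```python
# Hypothetical ranking sketch: cross-domain edges (e.g. code <-> paper)
# get a bonus so they outrank same-domain edges of similar confidence.
def composite_score(edge: dict) -> float:
    confidence = edge.get("confidence_score", 1.0)
    cross_domain = edge["src_type"] != edge["dst_type"]   # e.g. "code" vs "paper"
    return confidence * (2.0 if cross_domain else 1.0)

edges = [
    {"src_type": "code", "dst_type": "paper", "confidence_score": 0.8},
    {"src_type": "code", "dst_type": "code", "confidence_score": 0.9},
]
ranked = sorted(edges, key=composite_score, reverse=True)
# code->paper ranks first: 0.8 * 2.0 = 1.6 beats 0.9
```

Under this kind of weighting, a slightly less confident code-paper edge still surfaces above a confident code-code edge, matching the behavior described above.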
## Suggested Questions
4-5 questions the graph is uniquely positioned to answer.
Start here when exploring an unfamiliar corpus.
## Token Benchmark
Printed after every run.
First run: extracts and builds (costs tokens).
Subsequent queries: read compact graph.json instead of raw files.
SHA256 cache means re-runs only re-process changed files.
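The caching idea can be sketched in a few lines (hypothetical cache layout, not graphify's actual on-disk format): hash every file, compare against the previous run, and return only the files whose SHA256 changed.

```python
import hashlib
import json
from pathlib import Path

# Sketch of SHA256-based change detection. Restricted to *.py for brevity;
# the cache is a flat {path: digest} JSON file.
def changed_files(root: str, cache_path: str) -> list[Path]:
    cache_file = Path(cache_path)
    old = json.loads(cache_file.read_text()) if cache_file.exists() else {}
    new, changed = {}, []
    for p in sorted(Path(root).rglob("*.py")):
        digest = hashlib.sha256(p.read_bytes()).hexdigest()
        new[str(p)] = digest
        if old.get(str(p)) != digest:
            changed.append(p)          # new or modified since last run
    cache_file.write_text(json.dumps(new))
    return changed
```

A second call with no edits returns an empty list, which is why re-runs only pay for changed files.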
## Troubleshooting

**`graphify: command not found` after `pip install`**

```bash
# Check your PATH includes pip's script directory
python -m graphify install

# Or use the full path
python -m pip show graphifyy | grep Location
# then add {Location}/../Scripts to PATH (Windows) or {Location}/../bin (Unix)
```
Verify install wrote the skill file:

```bash
ls ~/.claude/skills/graphify/SKILL.md
```

Check `~/.claude/CLAUDE.md` contains the graphify entry. If missing, re-run:

```bash
graphify install
```
Parallel subagents require the AI assistant to support multi-agent mode: for Codex, set `multi_agent = true` under `[features]` in `~/.codex/config.toml`.

If token reduction is low, the corpus may be too small (< ~6 files). At small scale, graphify still works but token reduction is minimal — the value is structural clarity, not compression. Try `--mode deep` for more aggressive INFERRED edge extraction:

```bash
/graphify ./src --mode deep
```
The SHA256 cache lives in `graphify-out/cache/`. If it's missing or the output folder was deleted, graphify re-processes everything. This is expected. Subsequent runs use the cache.

```bash
ls graphify-out/cache/   # verify cache exists after first run
```
```bash
# Verify Neo4j is running and the bolt port is accessible
/graphify ./src --neo4j   # generate cypher.txt first
# then manually import in Neo4j Browser:
#   :source graphify-out/cypher.txt

# Or push directly (requires the neo4j Python driver)
pip install neo4j
/graphify ./src --neo4j-push bolt://localhost:7687
```

Set credentials via environment variables — do not hardcode:

```bash
export NEO4J_USERNAME=neo4j
export NEO4J_PASSWORD=your_password
```
**`--watch` misses doc/image changes**

Watch mode gives instant rebuilds for code files (AST only, no LLM). For docs and images, it prints a notification because LLM re-extraction is non-trivial to run on every save. When you see the notification:

```bash
/graphify ./src --update   # re-processes only changed docs/images
```
graphify prints a benchmark after every run. Typical results:
| Corpus | Files | Token reduction |
|---|---|---|
| Karpathy repos + 5 papers + 4 images | 52 | 71.5x |
| graphify source + Transformer paper | 4 | 5.4x |
| Small Python library | 6 | ~1x |
Token reduction scales with corpus size. At 52 mixed files, querying `graph.json` uses 71.5x fewer tokens than reading raw files. The first run costs tokens to build the graph. Every subsequent query reads the compact graph instead — savings compound across sessions.