From agent-knowledge
Ingest or update a codebase in the agent-knowledge base. First run bootstraps the knowledge base from scratch; subsequent runs are incremental (only changed/new/deleted files reprocessed). Uses tree-sitter for zero-token structural extraction. Trigger on "/knowledge-ingest", "ingest this codebase", "load this into knowledge", "scan this project", "index this repo", "update knowledge", "refresh knowledge", "re-ingest".
Install:
npx claudepluginhub keshrath/agent-knowledge

This skill uses the workspace's default tool permissions.
Populate or update agent-knowledge from a codebase. Tree-sitter extracts structure (zero LLM tokens), then the agent distills clusters into knowledge entries + graph edges via existing MCP tools.
First run: full ingest — scans all files, creates entries from scratch.
Subsequent runs: incremental — only reprocesses files whose SHA256 changed, adds entries for new files, removes entries for deleted files. The .knowledge-ingest-cache.json file in the target directory tracks state between runs.
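The change detection can be sketched as follows. This is a minimal illustration, not the skill's actual implementation; `diffAgainstCache` is a hypothetical name, and the argument shapes are assumed from the cache's `files` map (path → { sha256, entries }) and a freshly computed path → sha256 map.

```javascript
// Sketch: classify files as changed, added, or deleted by comparing
// the cached hash map against the current one.
function diffAgainstCache(cacheFiles, currentHashes) {
  const changed = [], added = [], deleted = [];
  for (const [path, sha256] of Object.entries(currentHashes)) {
    if (!(path in cacheFiles)) added.push(path);            // new file
    else if (cacheFiles[path].sha256 !== sha256) changed.push(path); // content changed
  }
  for (const path of Object.keys(cacheFiles)) {
    if (!(path in currentHashes)) deleted.push(path);       // file removed
  }
  return { changed, added, deleted };
}
```

Only the `changed` and `added` files are reprocessed; `deleted` files have their entries removed.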
Determine the project name from the first of these that exists:
- package.json → name field
- Cargo.toml → [package] name
- go.mod → module line
- pyproject.toml → [project] name

Check for .knowledge-ingest-cache.json in the target directory. If found, load it — this is an incremental run. Report how many files changed since last ingest.

Run the extractor:
node "<agent-knowledge-repo>/scripts/tree-sitter-extract.mjs" "<target-path>" --exclude "node_modules,dist,.git,vendor,__pycache__,build,target,.venv,coverage" --json
To find <agent-knowledge-repo>, check common locations:
- ~/.claude/mcp-servers/agent-knowledge/
- dirname $(which agent-knowledge)/../

Capture the JSON output. It contains per-file: symbols (classes, functions, methods), imports, exports, rationale comments (WHY/NOTE/DECISION/HACK/TODO/FIXME), call edges, SHA256 hashes, and a dependency graph.
If the script fails (missing dependency, path error), report the error to the user and offer to fall back to manual file reading for a subset of key files.
From the dependency graph and directory structure, group files into subsystems (clusters of files that share a directory or import each other). Identify structural highlights.
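One minimal way to approximate the grouping, as a sketch rather than the skill's actual algorithm: bucket files by their top-level source directory, which matches subsystem boundaries in most repo layouts (a real pass would also consult the import graph).

```javascript
// Illustrative sketch: cluster file paths by top-level directory.
function clusterByTopDir(paths) {
  const clusters = {};
  for (const p of paths) {
    const parts = p.split("/");
    // Files at the repo root form their own "root" cluster.
    const key = parts.length > 1 ? parts[0] : "root";
    (clusters[key] ??= []).push(p);
  }
  return clusters;
}
```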
knowledge({ action: "write", category: "projects", filename: "<project-name>", content: "---\ntitle: <Project Name>\ntags: [auto-ingested, <primary-language>]\nupdated: <today>\nconfidence: inferred\nconfidence_score: 0.8\n---\n\n# <Project Name>\n\n## Tech Stack\n...\n\n## Architecture\n...\n\n## Entry Points\n...\n\n## Subsystems\n- <cluster-1>: <one-line description>\n- <cluster-2>: <one-line description>\n..." })
Include: project name, tech stack (languages, frameworks, key dependencies), directory structure overview, entry points, subsystem list with one-line descriptions.
knowledge({ action: "write", category: "notes", filename: "<project>-<cluster-name>", content: "---\ntitle: <Project> — <Cluster Name>\ntags: [auto-ingested, subsystem, <language>]\nupdated: <today>\nconfidence: inferred\nconfidence_score: 0.75\n---\n\n## Purpose\n<inferred from symbol names and structure>\n\n## Key Symbols\n- `ClassName` (line N) — <from docstring or inferred>\n- `functionName(params)` (line N)\n\n## Dependencies\nImports from: <other clusters>\nImported by: <other clusters>\n\n## Rationale\n<any WHY/NOTE/DECISION comments found in this cluster>" })
Keep each entry under 300 words. Focus on structure and relationships, not implementation details.
knowledge({ action: "write", category: "decisions", filename: "<project>-<slug>", content: "---\ntitle: <Decision summary>\ntags: [auto-ingested, rationale]\nupdated: <today>\nconfidence: extracted\nconfidence_score: 1.0\n---\n\n## Decision\n<the rationale comment text>\n\n## Context\nFile: `<file-path>`, line <N>\nSymbol: `<enclosing function/class>`\n\n## Related\n<other rationale comments or subsystems this connects to>" })
Only create decision entries for substantive rationale (WHY, DECISION, SAFETY). Skip generic TODOs and FIXMEs unless they contain real architectural reasoning.
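The substantive-vs-generic filter can be sketched roughly like this. The tag names come from the extractor output above; the word-count threshold is purely a heuristic assumption, and `isDecisionWorthy` is an illustrative name, not part of the skill.

```javascript
// Sketch: decide whether a rationale comment deserves a decision entry.
// WHY/DECISION/SAFETY always qualify; TODO/FIXME only when the text is
// long enough to plausibly carry architectural reasoning (heuristic).
const SUBSTANTIVE = new Set(["WHY", "DECISION", "SAFETY"]);

function isDecisionWorthy(tag, text) {
  if (SUBSTANTIVE.has(tag)) return true;
  if (tag === "TODO" || tag === "FIXME") {
    return text.trim().split(/\s+/).length >= 12;
  }
  return false; // NOTE, HACK, etc. stay in subsystem entries only
}
```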
If CI/build configuration files exist (.github/workflows/, Makefile, Dockerfile, docker-compose.yml, Jenkinsfile, .gitlab-ci.yml), read them directly and create workflow entries:
knowledge({ action: "write", category: "workflows", filename: "<project>-<workflow>", content: "..." })
Summarize: what the workflow does, triggers, key steps, deployment targets.
Create part_of edges from each subsystem to the project:
knowledge_graph({ action: "link", source: "notes/<project>-<cluster>.md", target: "projects/<project>.md", rel_type: "part_of", strength: 0.9, origin: "ingest" })
Create depends_on edges between subsystems based on the import dependency graph:
knowledge_graph({ action: "link", source: "notes/<project>-<cluster-a>.md", target: "notes/<project>-<cluster-b>.md", rel_type: "depends_on", strength: 0.8, origin: "ingest" })
Create builds_on edges from decisions to the subsystem they relate to:
knowledge_graph({ action: "link", source: "decisions/<project>-<decision>.md", target: "notes/<project>-<cluster>.md", rel_type: "builds_on", strength: 0.7, origin: "ingest" })
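Lifting the file-level import graph to cluster-level depends_on edges can be sketched as follows. This is an assumed helper, not the skill's code; `fileToCluster` is presumed to come from the grouping phase.

```javascript
// Sketch: lift file→file import edges to cluster→cluster edges,
// dropping intra-cluster self-edges and de-duplicating.
function liftEdges(importEdges, fileToCluster) {
  const seen = new Set();
  const out = [];
  for (const [from, to] of importEdges) {
    const a = fileToCluster[from], b = fileToCluster[to];
    if (!a || !b || a === b) continue; // unknown file or same cluster
    const key = `${a}->${b}`;
    if (!seen.has(key)) { seen.add(key); out.push([a, b]); }
  }
  return out;
}
```

Each resulting pair becomes one knowledge_graph link call.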
related_to edges are not created manually — auto-linking fires on every knowledge write and handles cross-entry similarity.

If PDF files exist in the project (root, docs/, papers/): create notes/<project>-<pdf-name>.md
If architecture diagram images exist (.png, .svg, .jpg in docs/, architecture/, diagrams/): create notes/<project>-<diagram-name>.md
If URLs are provided by the user: create entries under notes/

Skip this phase entirely if no such files exist. Do not search exhaustively for media files.
Write .knowledge-ingest-cache.json to the target directory:
{
"version": 1,
"timestamp": "<ISO date>",
"project": "<project-name>",
"agent_knowledge_version": "<version>",
"files": {
"src/foo.ts": {
"sha256": "abc123...",
"entries": ["notes/<project>-foo-module.md"]
}
},
"entries_created": [
"projects/<project>.md",
"notes/<project>-core.md",
"decisions/<project>-auth-design.md"
]
}
For files deleted since the last run, remove their stale entries with knowledge({ action: "delete" }).

Then validate the result:
node "<agent-knowledge-repo>/skills/knowledge-ingest/scripts/validate.mjs" "<target-path>"
If the validator's status is FAIL, report its issues array to the user before summarizing.

Finally, report a summary:
Ingested <project-name>:
Files scanned: N
Clusters identified: N
Entries created:
- projects/: 1
- notes/: N (subsystems)
- decisions/: N (rationale)
- workflows/: N (CI/build)
Graph edges: N
Skipped (cached): N files unchanged
Do not create related_to edges manually — auto-linking handles this on every write.
Avoid excessive knowledge write calls in a single run — batch small clusters together if needed.

Examples:
/knowledge-ingest .
/knowledge-ingest ~/projects/my-api
/knowledge-ingest ./libs/auth --exclude "test,mock"