Install hallouminate and bootstrap your first LLM-authored per-repo wiki.
Install hallouminate and bootstrap the user's first LLM-authored per-repo wiki. Use when the user runs /install from the hallouminate skill pack, or asks to "install hallouminate", "set up hallouminate", or "start a hallouminate wiki". Installs the cargo binary, registers the MCP server, then uses Socratic questioning to decide where and how the wiki lives, scaffolds it under .hallouminate/, indexes it, and commits the result with git.
Fold new knowledge into an existing hallouminate wiki — route each new fact to the page it extends, merge it in, create a page only when genuinely novel, and never blend contradictions. Use when there's source material to absorb or a fact to record — "add this to the wiki", "ingest these docs", "update the wiki with what we learned", "remember this", "record this decision", "/wiki-ingest <path|topic>". An opus root plans dedup/route/merge/contradiction decisions; haiku sub-agents fan out to `ground` each candidate against the corpus and read the target pages. Do NOT use to bootstrap an empty wiki (use wiki-init) or to answer a question (use wiki-query).
Bootstrap a hallouminate wiki from scratch by interviewing the user with Socratic questioning, then writing the first entries. Use when a repo has no wiki yet or the corpus is near-empty — "start a wiki", "bootstrap the knowledge base", "interview me about this project", "set up the wiki", "/wiki-init". An opus root runs a semi-structured interview (one question per turn, behavior-first probes) and plans the page taxonomy; haiku sub-agents fan out to draft the captured topics into one-topic-per-file entries in parallel via `add_markdown`. Do NOT use to answer questions (use wiki-query) or to fold new source docs into an existing wiki (use wiki-ingest).
Answer a question from a hallouminate wiki with grounded, cited detail. Use when the user asks something the wiki should know — "what does the wiki say about X", "how does Y work here", "look it up in the wiki", "/wiki-query", or any factual question about a repo whose knowledge lives in a hallouminate corpus. An opus root plans the search and synthesizes the answer; haiku sub-agents fan out one `ground` search per sub-question and return cited evidence. Every claim in the answer carries a `path:line` citation back to the corpus. Do NOT use to write or update wiki entries (use wiki-ingest) or to bootstrap a new wiki (use wiki-init).
A markdown corpus indexer for LLMs to build and query their own per-repo
wikis. Hallouminate stores markdown verbatim on disk, embeds it with
fastembed, indexes the embeddings in LanceDB, and exposes a small MCP
surface (add_markdown / read_markdown / delete_markdown / ground)
so an LLM can author and search a per-repo knowledge base without
leaving its agent loop.
The filesystem is the source of truth; LanceDB rows are derived and
refreshed automatically when an LLM writes via add_markdown, or in
bulk via hallouminate index. Code files (.rs, .toml, …) can also
be indexed as text for semantic search, but hallouminate does no
structural analysis — it's a wiki indexer that happens to tolerate
code, not a code intelligence tool.
A long-lived local daemon owns the LanceDB ground directory, per-corpus mutation locks, and config resolution. The CLI and the stdio MCP server both talk to it over a Unix domain socket — one owner, no cross-process LanceDB races.
📖 Full documentation: https://cheeselord.dev/hallouminate/
hallouminate serve starts the stdio MCP server (auto-spawning the daemon if
none is running) — this is what an MCP client launches:
hallouminate serve
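As an illustration of how an MCP client might be pointed at that command, here is a minimal sketch that writes a project-level client config. It assumes the client reads a `.mcp.json` file with an `mcpServers` map (the layout Claude Code uses); the server name `hallouminate` and the `MCP_CONFIG` variable are illustrative choices, not something the README prescribes.

```shell
# Assumption: the MCP client reads a project-level .mcp.json with an
# "mcpServers" map. MCP_CONFIG is a hypothetical override for the path.
MCP_CONFIG=${MCP_CONFIG:-.mcp.json}

if [ -e "$MCP_CONFIG" ]; then
  # Never clobber an existing client config; merge by hand instead.
  echo "$MCP_CONFIG already exists; add the hallouminate entry by hand" >&2
else
  # The client launches `hallouminate serve` and talks to it over stdio.
  cat > "$MCP_CONFIG" <<'EOF'
{
  "mcpServers": {
    "hallouminate": {
      "command": "hallouminate",
      "args": ["serve"]
    }
  }
}
EOF
fi
```

Other clients use different config files, so treat the file name and layout as a starting point and check your client's documentation.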
From a source checkout, run subcommands through cargo:
cargo run -- serve # stdio MCP server
cargo run -- index # bulk (re)index every configured corpus
cargo run -- ground "how does the daemon work" # CLI semantic search
cargo run -- config show # print the effective merged config
cargo build --release
The binary lands in target/release/hallouminate.
hallouminate serve starts a stdio MCP server. Tools:
- ground: semantic search.
- index: bulk (re)build a corpus index.
- list_corpora: list every configured corpus.
- list_files: flat list of relative paths in a corpus.
- list_tree: the same files grouped into a directory tree, for progressive disclosure without reading every index.md.
- add_markdown: write a markdown file under the corpus' first root, atomic and no-symlink-follow, with auto-reindex of just that file. Returns advisory lint warnings (empty-destination links, empty mermaid blocks, heading-level jumps) without blocking or rewriting the content.
- read_markdown: verbatim UTF-8 file contents. Use before overwriting.
- delete_markdown: unlink the file and prune its rows from the index.
- globalize_markdown: copy an entry into the global corpus to share it across repos.

Markdown content is stored verbatim; hallouminate imposes no schema.
Convention for LLM wiki authors: one topic per file, first line # Title,
file stem matches the slug.
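The convention above can be checked mechanically. The following is a minimal sketch, not part of hallouminate: `check_entry` is a hypothetical helper, and the slug rule it applies (title lowercased, runs of non-alphanumeric characters collapsed to single hyphens) is an assumption, since hallouminate imposes no schema and the README does not define how slugs are derived.

```shell
# Hypothetical helper: check one wiki entry against the authoring convention.
# Assumption: slug = title lowercased, non-alphanumeric runs -> single hyphen.
check_entry() {
  file=$1
  first_line=$(head -n 1 "$file")

  # The first line must be a level-1 heading: "# Title".
  case $first_line in
    "# "*) ;;
    *) echo "$file: first line is not a '# Title' heading"; return 1 ;;
  esac

  # Derive the expected slug from the title (assumed rule, see above).
  title=${first_line#"# "}
  slug=$(printf '%s\n' "$title" \
    | tr '[:upper:]' '[:lower:]' \
    | sed -e 's/[^a-z0-9]\{1,\}/-/g' -e 's/^-//' -e 's/-$//')

  # The file stem (name without .md) must equal the slug.
  stem=$(basename "$file" .md)
  if [ "$stem" != "$slug" ]; then
    echo "$file: stem '$stem' does not match slug '$slug'"
    return 1
  fi
  echo "$file: ok"
}
```

For example, a file named daemon-lifecycle.md whose first line is `# Daemon Lifecycle` passes, while the same content saved as notes.md is reported as a mismatch.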
The config lives at $XDG_CONFIG_HOME/hallouminate/config.toml
(~/.config/hallouminate/config.toml by default).
- hallouminate config init: scaffold a baseline config.
- hallouminate config show: print the effective merged config for the current working directory (baseline + repo layer).
- hallouminate config validate: parse and flag unknown top-level keys.
- hallouminate config download: pre-fetch the configured embedding model so the first index doesn't pay the download cost.

Dense embeddings are on by default, using the
snowflake/snowflake-arctic-embed-s model. On first index hallouminate
downloads that model and fuses its vector signal with lexical search.
To run lexically only — full-text search + ripgrep + rerank, no embedding
model downloaded (just the tokenizer used for chunking) — set enabled = false
in ~/.config/hallouminate/config.toml:
[embeddings]
enabled = false
Changing the embedding mode (or model) for a ground directory that was already
indexed under a different mode trips the store's mismatch guard on the next
run. Delete the ground directory and re-run hallouminate index to rebuild:
rm -rf ~/.local/share/hallouminate/ground
hallouminate index
Set embeddings.model in your config to one of these (all embed to 384-dim
vectors). Omitting embeddings.model selects the default.
| Model | Notes |
|---|---|
| snowflake/snowflake-arctic-embed-s | Default. English, symmetric retrieval. |
| BAAI/bge-small-en-v1.5 | English, symmetric retrieval. |
| intfloat/multilingual-e5-small | Multilingual, asymmetric retrieval; no quantized variant. |
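To make the model choice concrete, here is a small sketch that validates a model name against the table above and prints the matching config stanza. `is_supported_model` and `embeddings_stanza` are hypothetical helpers, not hallouminate commands; the only things taken from the README are the three model names and the `embeddings.model` key.

```shell
# Hypothetical helper: succeed only for the models listed in the table.
is_supported_model() {
  case $1 in
    snowflake/snowflake-arctic-embed-s|BAAI/bge-small-en-v1.5|intfloat/multilingual-e5-small)
      return 0 ;;
    *)
      return 1 ;;
  esac
}

# Hypothetical helper: print the [embeddings] stanza for a supported model.
embeddings_stanza() {
  if is_supported_model "$1"; then
    printf '[embeddings]\nmodel = "%s"\n' "$1"
  else
    echo "unsupported model: $1" >&2
    return 1
  fi
}
```

Running `embeddings_stanza BAAI/bge-small-en-v1.5` prints a stanza you can paste into config.toml. If the ground directory was already indexed with a different model, the mismatch guard described earlier still applies, so delete the ground directory and re-index after switching.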
A Claude Code skill pack ships in this repo under
plugins/hallouminate. It installs hallouminate and
bootstraps your first wiki for you:
npx claudepluginhub paulnsorensen/hallouminate --plugin hallouminate