# QMD - Query Markup Documents
An on-device search engine for everything you need to remember. Index your markdown notes, meeting transcripts, documentation, and knowledge bases. Search with keywords or natural language. Ideal for your agentic workflows.

QMD combines BM25 full-text search, vector semantic search, and LLM re-ranking, all running locally via `node-llama-cpp` with GGUF models.

You can read more about QMD's progress in the CHANGELOG.
## Quick Start
```bash
# Install globally (Node or Bun)
npm install -g @tobilu/qmd
# or
bun install -g @tobilu/qmd

# Or run directly
npx @tobilu/qmd ...
bunx @tobilu/qmd ...
```
```bash
# Create collections for your notes, docs, and meeting transcripts
qmd collection add ~/notes --name notes
qmd collection add ~/Documents/meetings --name meetings
qmd collection add ~/work/docs --name docs

# Add context to collections. Context works as a tree: each piece of context
# is returned alongside any matching document beneath it. This is QMD's key
# feature, since it lets LLMs make much better contextual choices when
# selecting documents. Don't sleep on it!
qmd context add qmd://notes "Personal notes and ideas"
qmd context add qmd://meetings "Meeting transcripts and notes"
qmd context add qmd://docs "Work documentation"

# Generate embeddings for semantic search
qmd embed
```
```bash
# Search across everything
qmd search "project timeline"           # Fast keyword search
qmd vsearch "how to deploy"             # Semantic search
qmd query "quarterly planning process"  # Hybrid + reranking (best quality)

# Get a specific document
qmd get "meetings/2024-01-15.md"

# Get a document by docid (shown in search results)
qmd get "#abc123"

# Get multiple documents by glob pattern
qmd multi-get "journals/2025-05*.md"

# Search within a specific collection
qmd search "API" -c notes

# Export all matches for an agent
qmd search "API" --all --files --min-score 0.3
```
## Using with AI Agents
QMD's `--json` and `--files` output formats are designed for agentic workflows:
```bash
# Get structured results for an LLM
qmd search "authentication" --json -n 10

# List all relevant files above a threshold
qmd query "error handling" --all --files --min-score 0.4

# Retrieve full document content
qmd get "docs/api-reference.md" --full
```
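The `--json` output can be consumed directly from a script. A minimal sketch of the consuming side is below; note that the `path`/`score` field names and the `Hit` type are assumptions for illustration, so inspect the actual output of your qmd version before relying on them.

```typescript
// Hypothetical shape of one `qmd search --json` hit; the real field
// names may differ -- check your qmd version's actual output.
interface Hit {
  path: string;
  score: number;
}

// Keep only hits above a threshold, mirroring `--min-score` on the CLI.
function filterHits(jsonText: string, minScore: number): Hit[] {
  const hits: Hit[] = JSON.parse(jsonText);
  return hits.filter((h) => h.score >= minScore);
}

// In a real agent, `jsonText` would come from running the CLI, e.g. via
// child_process.execFileSync("qmd", ["search", "auth", "--json"]).
const sample =
  '[{"path":"docs/auth.md","score":0.82},{"path":"notes/todo.md","score":0.21}]';
console.log(filterHits(sample, 0.4).map((h) => h.path)); // [ "docs/auth.md" ]
```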
## MCP Server
The tool works perfectly well if you just tell your agent to use it on the command line, but QMD also exposes a Model Context Protocol (MCP) server for tighter integration.
Tools exposed:

- `query` — Search with typed sub-queries (lex/vec/hyde), combined via RRF + reranking
- `get` — Retrieve a document by path or docid (with fuzzy matching suggestions)
- `multi_get` — Batch retrieve by glob pattern, comma-separated list, or docids
- `status` — Index health and collection info
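The fusion step behind `query` can be sketched as plain Reciprocal Rank Fusion (RRF): each sub-query contributes `1 / (k + rank)` per document, and the sums are sorted. This is an illustrative sketch, not QMD's actual implementation; the function name, the constant `k = 60` (from the original RRF paper), and the toy rankings are assumptions.

```typescript
// Merge ranked result lists from different sub-queries (e.g. BM25 and
// vector search). `k` dampens the influence of the very top ranks.
function rrfFuse(rankings: string[][], k = 60): [string, number][] {
  const scores = new Map<string, number>();
  for (const ranking of rankings) {
    ranking.forEach((doc, rank) => {
      // rank is 0-based here, so rank + 1 is the 1-based position.
      scores.set(doc, (scores.get(doc) ?? 0) + 1 / (k + rank + 1));
    });
  }
  return [...scores.entries()].sort((a, b) => b[1] - a[1]);
}

// A document ranked highly by both sub-queries beats one that only a
// single sub-query liked.
const fused = rrfFuse([
  ["a.md", "b.md", "c.md"], // lexical (BM25) ranking
  ["a.md", "c.md", "d.md"], // vector ranking
]);
console.log(fused[0][0]); // "a.md" -- top of both lists
```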
Claude Desktop configuration (`~/Library/Application Support/Claude/claude_desktop_config.json`):

```json
{
  "mcpServers": {
    "qmd": {
      "command": "qmd",
      "args": ["mcp"]
    }
  }
}
```
Claude Code — Install the plugin (recommended):

```bash
claude plugin marketplace add tobi/qmd
claude plugin install qmd@qmd
```
Or configure MCP manually in `~/.claude/settings.json`:

```json
{
  "mcpServers": {
    "qmd": {
      "command": "qmd",
      "args": ["mcp"]
    }
  }
}
```
### HTTP Transport
By default, QMD's MCP server uses stdio (launched as a subprocess by each client). For a shared, long-lived server that avoids repeated model loading, use the HTTP transport:
```bash
# Foreground (Ctrl-C to stop)
qmd mcp --http              # localhost:8181
qmd mcp --http --port 8080  # custom port

# Background daemon
qmd mcp --http --daemon     # start; writes PID to ~/.cache/qmd/mcp.pid
qmd mcp stop                # stop via PID file
qmd status                  # shows "MCP: running (PID ...)" when active
```
The HTTP server exposes two endpoints:

- `POST /mcp` — MCP Streamable HTTP (JSON responses, stateless)
- `GET /health` — liveness check with uptime
LLM models stay loaded in VRAM across requests. Embedding and reranking contexts are disposed after 5 minutes of idle time and transparently recreated on the next request (roughly a 1 s penalty; the models themselves remain loaded).
Point any MCP client at `http://localhost:8181/mcp` to connect.
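Before connecting, a client can probe the documented `/health` endpoint to see whether the daemon is up. A minimal sketch, assuming the server from `qmd mcp --http` runs on the default port; the helper name and timeout value are mine:

```typescript
// Liveness probe against the qmd MCP HTTP transport.
const BASE = "http://localhost:8181";

async function isAlive(base: string): Promise<boolean> {
  try {
    // Any 2xx from GET /health counts as "up"; time out quickly so a
    // stopped daemon doesn't stall the caller.
    const res = await fetch(`${base}/health`, {
      signal: AbortSignal.timeout(2000),
    });
    return res.ok;
  } catch {
    // Connection refused or timeout: the server is not running.
    return false;
  }
}

// isAlive(BASE).then((up) => console.log(up ? "qmd MCP is up" : "not running"));
```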
## SDK / Library Usage
Use QMD as a library in your own Node.js or Bun applications.
### Installation

```bash
npm install @tobilu/qmd
```
### Quick Start
```js
import { createStore } from '@tobilu/qmd'

const store = await createStore({
  dbPath: './my-index.sqlite',
  config: {
    collections: {
      docs: { path: '/path/to/docs', pattern: '**/*.md' },
    },
  },
})

const results = await store.search({ query: "authentication flow" })
console.log(results.map(r => `${r.title} (${Math.round(r.score * 100)}%)`))

await store.close()
```
### Store Creation

`createStore()` accepts three modes: