🤔 Why MinerU Document Explorer?
MinerU Document Explorer equips your agent with three tool suites — Retrieve, Deep Read, and Ingest — closing the full knowledge loop:

- 🔍 Retrieve: Cross-collection search with BM25, vector, and hybrid modes, plus LLM reranking and query expansion
- 📖 Deep Read: Navigate inside a single document without loading the whole file, using table of contents, section reading, inline search, and element extraction
- 📝 Ingest: Build and maintain an LLM wiki from raw documents, following the Karpathy LLM Wiki pattern
Developed by the MinerU team, building on QMD and Karpathy's LLM Wiki.
🚀 Quick Start
Agent-Assisted Setup: If you're using an AI agent (Claude Code, Cursor, etc.), simply ask it to help you deploy MinerU Document Explorer and install skills — the agent can handle the entire setup process for you, including MCP configuration.
Follow the quickstart guide at https://github.com/opendatalab/MinerU-Document-Explorer/blob/main/docs/quickstart.md to install MinerU Document Explorer and walk the user through configuration.
📖 Document Deep Reading
Navigate and search within a single document without reading the whole file:
# View document structure
qmd doc-toc papers/attention-is-all-you-need.pdf
# Read specific sections by address
qmd doc-read papers/attention-is-all-you-need.pdf "line:45-120"
# Search within one document
qmd doc-grep papers/attention-is-all-you-need.pdf "self-attention"
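The `line:45-120` address above can also be built programmatically when an agent or script computes the range. The following is a minimal sketch: `line_addr` is a hypothetical helper, not part of the qmd CLI, and it assumes that a reversed range (start after end) is never intended.

```shell
# line_addr is a hypothetical helper, not part of the qmd CLI.
# It builds a "line:START-END" address like the one passed to doc-read above.
line_addr() {
  start="$1"
  end="$2"
  # Reject reversed ranges before they reach the CLI (assumption: they are never intended).
  if [ "$start" -gt "$end" ]; then
    echo "start line must not exceed end line" >&2
    return 1
  fi
  printf 'line:%s-%s\n' "$start" "$end"
}

line_addr 45 120   # prints line:45-120
```

It could then be used as `qmd doc-read papers/attention-is-all-you-need.pdf "$(line_addr 45 120)"`.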
🔌 MCP Server — 15 Tools for AI Agents
Integrate with AI agents via the Model Context Protocol.
MCP Server vs CLI: The MCP server runs as a persistent process, so the models (embeddings, reranker, query expansion) are loaded once and stay in memory across requests. CLI commands like qmd query must reload all models on every invocation, adding ~5–15 s of startup overhead each time. For agent workflows, prefer the MCP server.
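To get a feel for how that overhead adds up over an agent session, here is a rough back-of-the-envelope sketch. It only multiplies the ~5–15 s per-invocation range quoted above by a call count. `cli_overhead_range` is a hypothetical helper for illustration, and real timings will depend on hardware and model sizes.

```shell
# Hypothetical helper: cumulative model-reload overhead for N CLI invocations,
# using the ~5–15 s per-call range quoted above. Pure arithmetic, no qmd calls.
cli_overhead_range() {
  calls="$1"
  echo "$((calls * 5))-$((calls * 15)) s"
}

cli_overhead_range 20   # prints 100-300 s
```

Under these assumptions, a session of 20 CLI queries would spend roughly 100 to 300 seconds on model loading alone, a cost the persistent server pays only once.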
Two transport modes:
| Mode | Command | Best for |
|---|---|---|
| stdio | qmd mcp | Claude Desktop, Claude Code — client spawns and manages the process |
| HTTP daemon | qmd mcp --http --daemon | Cursor, Windsurf, VS Code, multi-client setups — one shared persistent server |
# Start the HTTP daemon (recommended — models stay loaded across all requests)
qmd mcp --http --daemon # default port 8181
qmd mcp --http --daemon --port 8080 # custom port
# Verify server is running
curl http://localhost:8181/health
# Stop the daemon
qmd mcp stop
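Because the daemon loads models at startup, the health endpoint may not answer right away. The sketch below polls it until it responds or the attempts run out. `wait_for_qmd` is a hypothetical helper, not part of the qmd CLI, and it assumes the `/health` endpoint returns a successful HTTP status once the server is ready.

```shell
# wait_for_qmd is a hypothetical helper, not part of the qmd CLI.
# Usage: wait_for_qmd [health-url] [attempts]
wait_for_qmd() {
  url="${1:-http://localhost:8181/health}"
  attempts="${2:-10}"
  i=0
  while [ "$i" -lt "$attempts" ]; do
    # -f: fail on HTTP errors; -s: silent; -o /dev/null: discard the body;
    # --max-time 2: give up on a single probe after two seconds
    if curl -fs -o /dev/null --max-time 2 "$url"; then
      return 0
    fi
    i=$((i + 1))
    sleep 1
  done
  return 1
}
```

A typical use would be `qmd mcp --http --daemon && wait_for_qmd` before pointing a client at the server. Pass a different URL if the daemon was started with `--port`.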
Client Configuration
Cursor — add to .cursor/mcp.json (project) or ~/.cursor/mcp.json (global)
Option A — stdio (Cursor manages the process lifecycle):
{
"mcpServers": {
"qmd": {
"command": "qmd",
"args": ["mcp"]
}
}
}
Option B — HTTP (run qmd mcp --http --daemon first; models stay loaded, faster responses):
{
"mcpServers": {
"qmd": {
"url": "http://localhost:8181/mcp"
}
}
}
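If the daemon was started with `--port`, the URL in Option B has to use the same port. The following sketch prints a matching config for a given port. `qmd_http_config` is a hypothetical helper, and it assumes the `/mcp` path stays the same on a custom port.

```shell
# qmd_http_config is a hypothetical helper, not part of the qmd CLI.
# It prints an Option B style config for the given port (default 8181).
# Assumption: the /mcp path is unchanged when a custom port is used.
qmd_http_config() {
  port="${1:-8181}"
  printf '{\n  "mcpServers": {\n    "qmd": {\n      "url": "http://localhost:%s/mcp"\n    }\n  }\n}\n' "$port"
}
```

For example, `qmd_http_config 8080 > .cursor/mcp.json` would write a project-level config for a daemon started with `--port 8080`, overwriting any existing file at that path.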
Claude Desktop — add to ~/Library/Application Support/Claude/claude_desktop_config.json
{
"mcpServers": {
"qmd": {
"command": "qmd",
"args": ["mcp"]
}
}
}
Claude Code — add to ~/.claude/settings.json or run claude mcp add qmd -- qmd mcp