Use this skill when ingesting or querying documents with MCP local RAG, including `query_documents`, `ingest_file`, `ingest_data`, and CLI bulk ingestion. Covers query refinement, result score interpretation, and source metadata conventions for PDF, HTML, DOCX, TXT, and Markdown. Not for general file operations or SQL/database queries.
From root

npx claudepluginhub brandcast-signage/root --plugin root

This skill uses the workspace's default tool permissions.
- references/cli-ingest.md
- references/html-ingestion.md
- references/query-optimization.md
- references/result-refinement.md
| Tool | Use When |
|---|---|
| `ingest_file` | Local files (PDF, DOCX, TXT, MD) |
| `ingest_data` | Raw content (HTML, text) with source URL |
| `query_documents` | Semantic + keyword hybrid search |
| `delete_file` / `list_files` / `status` | Management |
| `npx mcp-local-rag ingest` | Multiple files or directory (shell) |
Hybrid search combines vector (semantic) search and keyword (BM25) search.
Lower score = better match. Use the score to filter noise.
| Score | Action |
|---|---|
| < 0.3 | Use directly |
| 0.3-0.5 | Include if mentions same concept/entity |
| > 0.5 | Skip unless no better results |
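The thresholds in the table above can be applied mechanically before any judgment call. The following is a minimal sketch, assuming each result is a dict with a numeric `score` field; the `text` field and the bucket names are hypothetical and only serve the illustration.

```python
from typing import Dict, List


def triage(results: List[dict]) -> Dict[str, List[dict]]:
    """Bucket results by score (lower is better), best matches first."""
    buckets: Dict[str, List[dict]] = {"use": [], "review": [], "skip": []}
    for result in sorted(results, key=lambda r: r["score"]):
        if result["score"] < 0.3:
            buckets["use"].append(result)      # use directly
        elif result["score"] <= 0.5:
            buckets["review"].append(result)   # include only if same concept/entity
        else:
            buckets["skip"].append(result)     # fall back to these only if nothing better
    return buckets


if __name__ == "__main__":
    sample = [
        {"text": "unrelated", "score": 0.62},
        {"text": "exact hit", "score": 0.12},
        {"text": "maybe", "score": 0.41},
    ]
    for name, items in triage(sample).items():
        print(name, [item["text"] for item in items])
```

The "review" bucket still needs a manual check that the chunk mentions the same concept or entity as the query.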
| Intent | Limit |
|---|---|
| Specific answer (function, error) | 5 |
| General understanding | 10 |
| Comprehensive survey | 20 |
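One way to keep the limit choice consistent is a small lookup keyed by intent. This is only a sketch: the intent labels are hypothetical, and the argument names `query` and `limit` are assumptions about the `query_documents` input, not confirmed by this document.

```python
# Limits taken from the table above; the intent keys are illustrative labels.
LIMIT_BY_INTENT = {
    "specific": 5,   # specific answer (function, error)
    "general": 10,   # general understanding
    "survey": 20,    # comprehensive survey
}


def build_query_args(query: str, intent: str = "general") -> dict:
    """Build an argument dict for a query_documents call (argument names assumed)."""
    if intent not in LIMIT_BY_INTENT:
        raise ValueError(f"unknown intent: {intent!r}")
    return {"query": query, "limit": LIMIT_BY_INTENT[intent]}


if __name__ == "__main__":
    print(build_query_args("TypeError in parseConfig", intent="specific"))
```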
| Situation | Why Transform | Action |
|---|---|---|
| Specific term mentioned | Keyword search needs exact match | KEEP term |
| Vague query | Vector search needs semantic signal | ADD context |
| Error stack or code block | Long text dilutes relevance | EXTRACT core keywords |
| Multiple distinct topics | Single query conflates results | SPLIT queries |
| Few/poor results | Term mismatch | EXPAND (see below) |
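The KEEP and EXPAND actions in the table above can be combined: the original terms stay in place for the keyword side, and a capped number of related terms is appended. The sketch below assumes a hand-written synonym table (`SYNONYMS` is hypothetical; a real one would be tailored to the corpus), and the cap is there to limit topic drift.

```python
# Hypothetical synonym table for illustration only.
SYNONYMS = {
    "config": ["configuration", "settings", "configure"],
}


def expand_query(query: str, max_extra_per_term: int = 3) -> str:
    """Append a capped number of related terms while keeping the original terms."""
    terms = query.split()
    expanded = list(terms)
    seen = {term.lower() for term in terms}
    for term in terms:
        for alternative in SYNONYMS.get(term.lower(), [])[:max_extra_per_term]:
            if alternative not in seen:
                expanded.append(alternative)
                seen.add(alternative)
    return " ".join(expanded)


if __name__ == "__main__":
    print(expand_query("config"))
```

Terms without an entry in the table pass through unchanged, so an exact identifier in the query is never rewritten.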
When results are few or all score > 0.5, expand query terms:

"config" → "config configuration settings configure"

Avoid over-expansion (causes topic drift).
Decide whether to include or skip a result based on answer quality, not just score.
INCLUDE if:
SKIP if:
Each result includes `fileTitle` (the document title extracted from content). It is null when extraction fails.
| Use | How |
|---|---|
| Disambiguate chunks | Use fileTitle to identify which document the chunk belongs to |
| Group related chunks | Same fileTitle = same document context |
| Deprioritize mismatches | fileTitle unrelated to query AND score > 0.5 → rank lower |
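The grouping and deprioritizing rows above can be sketched as follows. Results are assumed to be dicts with `fileTitle` and `score`; "unrelated to query" is approximated here by a crude word-overlap check, which is an assumption of this sketch rather than anything the tool does. A null title is not treated as a mismatch, since it only means extraction failed.

```python
from collections import defaultdict
from typing import Dict, List


def group_by_title(results: List[dict]) -> Dict[str, List[dict]]:
    """Group chunks that share a fileTitle; untitled chunks share one bucket."""
    groups: Dict[str, List[dict]] = defaultdict(list)
    for result in results:
        groups[result.get("fileTitle") or "(untitled)"].append(result)
    return dict(groups)


def rank(results: List[dict], query: str) -> List[dict]:
    """Sort by score, pushing down weak results whose title shares no word with the query."""
    query_words = set(query.lower().split())

    def is_mismatch(result: dict) -> bool:
        title = result.get("fileTitle")
        if title is None:
            return False  # extraction failed; no evidence of a mismatch
        return result["score"] > 0.5 and not (set(title.lower().split()) & query_words)

    return sorted(results, key=lambda r: (is_mismatch(r), r["score"]))


if __name__ == "__main__":
    sample = [
        {"fileTitle": "Cooking Guide", "score": 0.55},
        {"fileTitle": "API Auth Guide", "score": 0.6},
        {"fileTitle": None, "score": 0.7},
    ]
    print([r["fileTitle"] for r in rank(sample, "auth token")])
```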
ingest_file({ filePath: "/absolute/path/to/document.pdf" })
ingest_data({
content: "<html>...</html>",
metadata: { source: "https://example.com/page", format: "html" }
})
Format selection (match the data you have):
- format: "html"
- format: "markdown"
- format: "text"

Source format:
- https://example.com/page
- {type}://{date} or {type}://{date}/{detail}
- clipboard://2024-12-30, chat://2024-12-30/project-discussion

HTML source options:
Re-ingest same source to update. Use same source in delete_file to remove.
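Because updating and deleting both depend on reusing the exact same source string, it can help to build non-URL sources with one helper. This is a minimal sketch of the `{type}://{date}` and `{type}://{date}/{detail}` patterns; the function name `make_source` is hypothetical.

```python
from datetime import date
from typing import Optional


def make_source(kind: str, day: date, detail: Optional[str] = None) -> str:
    """Build a source identifier such as clipboard://2024-12-30 or chat://2024-12-30/topic."""
    base = f"{kind}://{day.isoformat()}"
    return f"{base}/{detail}" if detail else base


if __name__ == "__main__":
    print(make_source("clipboard", date(2024, 12, 30)))
    print(make_source("chat", date(2024, 12, 30), "project-discussion"))
```

Passing the same arguments later reproduces the same string, which is what re-ingestion and `delete_file` need.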
Use the CLI for multiple files or directory ingestion. Prefer it over repeated ingest_file calls.
| Scenario | Use |
|---|---|
| Single file from user request | ingest_file |
| Multiple files or a directory | npx mcp-local-rag ingest <path> |
| Raw HTML/text content | ingest_data (CLI does not support stdin) |
npx mcp-local-rag ingest [options] <path>

- <path>: file or directory (recursively scans supported formats)
- --help for all options and defaults

Output interpretation:
- SKIPPED (0 chunks): file was empty or too short, counted as success

For edge cases and examples, see the files under references/.