This skill should be used when the user asks to "search documents", "find in my files", "look for documents about", "semantic search", "find related documents", or when searching project files by meaning rather than exact text. Provides hybrid search combining grep (literal) and embeddings (semantic) for comprehensive document discovery.
Slash command: /embeddings-search-skill:document-search
Hybrid document search combining literal text matching (grep) with semantic similarity (embeddings) for comprehensive search across project documents.
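Activate this skill when the user wants to find project documents by meaning rather than exact text, or asks for documents related to a topic.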
The search tool runs grep and semantic search simultaneously and merges the results.
Before searching, ensure documents are indexed. The tool stores a unified index in .embeddings/index.json.
Run the indexer to create embeddings for all documents:
npx tsx ${CLAUDE_PLUGIN_ROOT}/src/cli.ts index .
Choose a chunking strategy based on your use case:
# Sliding window (default) - best for precise retrieval
npx tsx ${CLAUDE_PLUGIN_ROOT}/src/cli.ts index . --strategy sliding
# Paragraph-based - best for structured documents
npx tsx ${CLAUDE_PLUGIN_ROOT}/src/cli.ts index . --strategy paragraph
# Sentence-based - best for natural language queries
npx tsx ${CLAUDE_PLUGIN_ROOT}/src/cli.ts index . --strategy sentence
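To make the default strategy concrete, here is a minimal TypeScript sketch of sliding-window chunking. It uses the documented defaults of a 300-character window and 100-character overlap, and assumes the window advances by window size minus overlap. The function name `slidingChunks` and the `Chunk` shape are hypothetical; the actual indexer may work differently.

```typescript
// Hypothetical chunk shape; the real index format may differ.
interface Chunk {
  start: number; // character offset where the chunk begins
  text: string;
}

// Split text into overlapping windows. Assumes the step between
// windows is (windowSize - overlap) characters.
function slidingChunks(text: string, windowSize = 300, overlap = 100): Chunk[] {
  if (windowSize <= 0 || overlap < 0 || overlap >= windowSize) {
    throw new Error("expected 0 <= overlap < windowSize");
  }
  const step = windowSize - overlap;
  const chunks: Chunk[] = [];
  for (let start = 0; start < text.length; start += step) {
    chunks.push({ start, text: text.slice(start, start + windowSize) });
    // Stop once a window reaches the end of the text.
    if (start + windowSize >= text.length) break;
  }
  return chunks;
}

// A 700-character document yields windows starting at 0, 200, and 400.
const exampleChunks = slidingChunks("a".repeat(700));
console.log(exampleChunks.map((c) => [c.start, c.text.length]));
```

Under these assumptions, each chunk shares 100 characters with its neighbor, so a phrase that straddles a window boundary still appears whole in at least one chunk.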
For PDFs, Word docs, and presentations, extract text first by creating a .txt sidecar:
report.pdf → report.pdf.txt (extracted text)
slides.pptx → slides.pptx.txt (extracted text)
Use an LLM to extract text from binary documents before indexing.
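The sidecar naming rule can be sketched as a small helper. This is an illustration only: `sidecarPath` and `BINARY_EXTENSIONS` are hypothetical names, the extension list is taken from the supported-formats section of this document, and case-insensitive matching is an assumption.

```typescript
// Extensions this document lists as needing text extraction first.
const BINARY_EXTENSIONS = [".pdf", ".doc", ".docx", ".ppt", ".pptx"];

// Return the expected sidecar path for a binary document, or null
// for files that are indexed directly (for example .txt and .md).
function sidecarPath(filePath: string): string | null {
  const lower = filePath.toLowerCase(); // assumed: extension match ignores case
  const isBinary = BINARY_EXTENSIONS.some((ext) => lower.endsWith(ext));
  return isBinary ? `${filePath}.txt` : null;
}

console.log(sidecarPath("report.pdf"));
console.log(sidecarPath("notes.md"));
```

The sidecar keeps the original extension and appends .txt, so report.pdf and report.pdf.txt sit next to each other in the same directory.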
Run hybrid search with a query:
npx tsx ${CLAUDE_PLUGIN_ROOT}/src/cli.ts search "budget concerns"
Filter by metadata extracted from documents:
# Filter by collection (derived from directory names)
npx tsx ${CLAUDE_PLUGIN_ROOT}/src/cli.ts search "love" --collection "jane austin"
# Filter by document type
npx tsx ${CLAUDE_PLUGIN_ROOT}/src/cli.ts search "proposal" --type report
# Filter by author
npx tsx ${CLAUDE_PLUGIN_ROOT}/src/cli.ts search "meeting" --author "John Smith"
# Filter by topic
npx tsx ${CLAUDE_PLUGIN_ROOT}/src/cli.ts search "conflict" --topic marriage
# Filter by date range
npx tsx ${CLAUDE_PLUGIN_ROOT}/src/cli.ts search "quarterly review" --after 2024-01-01 --before 2024-04-01
# Combine multiple filters
npx tsx ${CLAUDE_PLUGIN_ROOT}/src/cli.ts search "proposal" --type report --after 2024-01-01
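One plausible reading of how combined filters behave is sketched below. The `DocMeta` and `Filters` shapes and the `matchesFilters` function are hypothetical. The sketch also assumes that combined flags are ANDed together and that the date bounds are exclusive; none of this is confirmed by the tool.

```typescript
// Hypothetical metadata shape; field names mirror the CLI flags.
interface DocMeta {
  path: string;
  collection?: string;
  type?: string;
  author?: string;
  topics?: string[];
  modified: string; // ISO date, e.g. "2024-02-15"
}

interface Filters {
  collection?: string;
  type?: string;
  author?: string;
  topic?: string;
  after?: string; // ISO date
  before?: string; // ISO date
}

// A document passes only if it satisfies every filter that is set
// (assumed AND semantics when flags are combined).
function matchesFilters(doc: DocMeta, f: Filters): boolean {
  if (f.collection !== undefined && doc.collection !== f.collection) return false;
  if (f.type !== undefined && doc.type !== f.type) return false;
  if (f.author !== undefined && doc.author !== f.author) return false;
  if (f.topic !== undefined && !(doc.topics ?? []).includes(f.topic)) return false;
  // ISO dates in YYYY-MM-DD form compare correctly as plain strings.
  if (f.after !== undefined && !(doc.modified > f.after)) return false;
  if (f.before !== undefined && !(doc.modified < f.before)) return false;
  return true;
}

const sampleDocs: DocMeta[] = [
  { path: "q1-review.md", type: "report", modified: "2024-02-15" },
  { path: "old-plan.md", type: "report", modified: "2023-11-02" },
  { path: "standup.md", type: "notes", modified: "2024-03-01" },
];
const filtered = sampleDocs.filter((d) =>
  matchesFilters(d, { type: "report", after: "2024-01-01" })
);
console.log(filtered.map((d) => d.path));
```

Applying filters before scoring keeps the candidate set small, which is one reason a tool might store metadata alongside chunks in a single index.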
Each result is labeled as grep (literal match) or semantic (meaning-based).
Documents found by both grep AND semantic search receive boosted scores.
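The boosting behavior can be illustrated with a short TypeScript sketch. Everything here is an assumption made for illustration: the `Hit` shape, scores in the range 0 to 1, one hit per path in each list, and a boost factor of 1.5. The tool's actual scoring formula is not documented here.

```typescript
// Hypothetical result shape; scores are assumed to lie in [0, 1].
interface Hit {
  path: string;
  score: number;
}

interface MergedHit extends Hit {
  sources: ("grep" | "semantic")[];
}

// Merge two result lists by path. A document found by both searches
// keeps its best score multiplied by an assumed boost factor.
function mergeResults(grep: Hit[], semantic: Hit[], boost = 1.5): MergedHit[] {
  const merged = new Map<string, MergedHit>();
  for (const h of grep) {
    const entry: MergedHit = { path: h.path, score: h.score, sources: ["grep"] };
    merged.set(h.path, entry);
  }
  for (const h of semantic) {
    const existing = merged.get(h.path);
    if (existing) {
      existing.score = Math.max(existing.score, h.score) * boost;
      existing.sources.push("semantic");
    } else {
      const entry: MergedHit = { path: h.path, score: h.score, sources: ["semantic"] };
      merged.set(h.path, entry);
    }
  }
  // Highest score first.
  return Array.from(merged.values()).sort((a, b) => b.score - a.score);
}

const ranked = mergeResults(
  [{ path: "a.md", score: 1.0 }, { path: "b.md", score: 1.0 }],
  [{ path: "b.md", score: 0.8 }, { path: "c.md", score: 0.6 }]
);
console.log(ranked.map((r) => `${r.path} ${r.score} [${r.sources.join("+")}]`));
```

In this example b.md ranks first because it appears in both lists, even though a.md has an equally strong literal match.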
View all discovered collections, document types, authors, and topics:
npx tsx ${CLAUDE_PLUGIN_ROOT}/src/cli.ts taxonomy .
List indexed files matching filters (without searching):
npx tsx ${CLAUDE_PLUGIN_ROOT}/src/cli.ts files --collection "jane austin" --type novel
View index information:
npx tsx ${CLAUDE_PLUGIN_ROOT}/src/cli.ts stats .
Index is stored in .embeddings/index.json:
project/
├── .embeddings/
│ └── index.json # Unified index with chunks and metadata
├── report.pdf
├── report.pdf.txt # Pre-extracted text (for binary files)
└── notes.md
The index contains the document chunks and their metadata.
When a user asks "Find documents about the Q4 budget review", check for .embeddings/index.json in the project, then run:
# Step 1: Index (if needed)
npx tsx ${CLAUDE_PLUGIN_ROOT}/src/cli.ts index .
# Step 2: Search
npx tsx ${CLAUDE_PLUGIN_ROOT}/src/cli.ts search "Q4 budget review"
Directly indexed (text readable):
.txt - Plain text
.md - Markdown
Requires text extraction first:
.pdf - PDF documents
.doc, .docx - Word documents
.ppt, .pptx - PowerPoint presentations
When searching a directory without an existing index, create .embeddings/index.json first, then run the search.
This ensures a seamless first-use experience.
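The first-use flow can be sketched as a function that decides which commands to run. The `planCommands` name is hypothetical, and the caller is assumed to have already checked whether .embeddings/index.json exists. Quoting the query with JSON.stringify is a simplification that suits plain queries but is not a full shell escaper.

```typescript
// The CLI prefix as it appears in this document; the shell expands
// ${CLAUDE_PLUGIN_ROOT} when the command is run.
const CLI = "npx tsx ${CLAUDE_PLUGIN_ROOT}/src/cli.ts";

// Return the shell commands to run for a query. `indexExists` says
// whether .embeddings/index.json is present in the project.
function planCommands(query: string, indexExists: boolean): string[] {
  const commands: string[] = [];
  if (!indexExists) {
    commands.push(`${CLI} index .`);
  }
  // JSON.stringify wraps the query in double quotes and escapes inner quotes.
  commands.push(`${CLI} search ${JSON.stringify(query)}`);
  return commands;
}

console.log(planCommands("Q4 budget review", false).join("\n"));
```

With an existing index the function returns only the search command, so repeated searches skip the indexing step.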
npx tsx ${CLAUDE_PLUGIN_ROOT}/src/cli.ts index [path] [options]
Index documents in directory (default: current directory)
Options:
--strategy <type> Chunking: sliding (default), paragraph, sentence
--window <size> Window size in chars (default: 300)
--overlap <size> Overlap in chars (default: 100)
npx tsx ${CLAUDE_PLUGIN_ROOT}/src/cli.ts search <query> [path] [options]
Search indexed documents
Options:
--collection <name> Filter by collection
--type <type> Filter by document type
--author <name> Filter by author
--topic <topic> Filter by topic
--path <pattern> Filter by file path pattern
--after DATE Only documents modified after this date
--before DATE Only documents modified before this date
npx tsx ${CLAUDE_PLUGIN_ROOT}/src/cli.ts taxonomy [path]
Show discovered taxonomy (collections, types, authors, topics)
npx tsx ${CLAUDE_PLUGIN_ROOT}/src/cli.ts files [path] [options]
List files matching filters
npx tsx ${CLAUDE_PLUGIN_ROOT}/src/cli.ts stats [path]
Show index statistics
npx claudepluginhub the-focus-ai/claude-marketplace --plugin embeddings-search-skill