This skill should be used when the user asks to "search documents", "find in my files", "look for documents about", "semantic search", "find related documents", or when searching project files by meaning rather than exact text. Provides hybrid search combining grep (literal) and embeddings (semantic) for comprehensive document discovery.
npx claudepluginhub the-focus-ai/claude-marketplace --plugin embeddings-search-skill
This skill uses the workspace's default tool permissions.
Hybrid document search combining literal text matching (grep) with semantic similarity (embeddings) for comprehensive search across project documents.
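Activate this skill when the user asks to search, find, or look for documents, or wants project files matched by meaning rather than exact text.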
The search tool runs grep and semantic search simultaneously and merges the results.
Before searching, ensure documents are indexed. The tool stores a unified index in .embeddings/index.json.
Run the indexer to create embeddings for all documents:
npx tsx ${CLAUDE_PLUGIN_ROOT}/src/cli.ts index .
Choose a chunking strategy based on your use case:
# Sliding window (default) - best for precise retrieval
npx tsx ${CLAUDE_PLUGIN_ROOT}/src/cli.ts index . --strategy sliding
# Paragraph-based - best for structured documents
npx tsx ${CLAUDE_PLUGIN_ROOT}/src/cli.ts index . --strategy paragraph
# Sentence-based - best for natural language queries
npx tsx ${CLAUDE_PLUGIN_ROOT}/src/cli.ts index . --strategy sentence
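As a rough illustration of what the sliding strategy might do, the sketch below splits text into overlapping character windows. The function name `slidingWindowChunks` is hypothetical and the plugin's real chunker may differ; only the defaults (a 300-character window with a 100-character overlap) come from the command reference in this document.

```typescript
// Hypothetical helper, not the plugin's actual implementation:
// split text into fixed-size character windows that overlap.
function slidingWindowChunks(
  text: string,
  windowSize = 300, // documented default for --window
  overlap = 100,    // documented default for --overlap
): string[] {
  if (windowSize <= 0 || overlap < 0 || overlap >= windowSize) {
    throw new Error("require windowSize > 0 and 0 <= overlap < windowSize");
  }
  const step = windowSize - overlap; // how far each window advances
  const chunks: string[] = [];
  for (let start = 0; start < text.length; start += step) {
    chunks.push(text.slice(start, start + windowSize));
    // Stop once a window has reached the end of the text.
    if (start + windowSize >= text.length) break;
  }
  return chunks;
}

// A 500-character text yields two 300-character windows
// covering [0, 300) and [200, 500).
const sampleChunks = slidingWindowChunks("a".repeat(500));
console.log(sampleChunks.map((c) => c.length));
```

A larger overlap repeats more context across neighbouring chunks at the cost of a bigger index, which is presumably why both values are configurable.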
For PDFs, Word docs, and presentations, extract text first by creating a .txt sidecar:
report.pdf → report.pdf.txt (extracted text)
slides.pptx → slides.pptx.txt (extracted text)
Use an LLM to extract text from binary documents before indexing.
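The sidecar convention can be sketched as a small lookup that decides which file the indexer would read for a given document. The function names are hypothetical, and returning `null` for unknown extensions is an assumption; the extension lists and the `.txt` suffix rule are taken from this document.

```typescript
// Extensions listed in this document as directly readable.
const DIRECT_EXTENSIONS = [".txt", ".md"];
// Extensions listed in this document as needing text extraction first.
const EXTRACT_EXTENSIONS = [".pdf", ".doc", ".docx", ".ppt", ".pptx"];

// Lowercased extension including the dot, or "" when there is none.
function extensionOf(file: string): string {
  const dot = file.lastIndexOf(".");
  return dot === -1 ? "" : file.slice(dot).toLowerCase();
}

// Hypothetical helper: the path whose text should be indexed for `file`,
// or null when the file type is not covered (an assumption of this sketch).
function textSourceFor(file: string): string | null {
  const ext = extensionOf(file);
  if (DIRECT_EXTENSIONS.includes(ext)) return file;
  // Binary formats are read through a sidecar: report.pdf -> report.pdf.txt
  if (EXTRACT_EXTENSIONS.includes(ext)) return `${file}.txt`;
  return null;
}

console.log(textSourceFor("report.pdf"));
console.log(textSourceFor("notes.md"));
```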
Run hybrid search with a query:
npx tsx ${CLAUDE_PLUGIN_ROOT}/src/cli.ts search "budget concerns"
Filter by metadata extracted from documents:
# Filter by collection (derived from directory names)
npx tsx ${CLAUDE_PLUGIN_ROOT}/src/cli.ts search "love" --collection "jane austin"
# Filter by document type
npx tsx ${CLAUDE_PLUGIN_ROOT}/src/cli.ts search "proposal" --type report
# Filter by author
npx tsx ${CLAUDE_PLUGIN_ROOT}/src/cli.ts search "meeting" --author "John Smith"
# Filter by topic
npx tsx ${CLAUDE_PLUGIN_ROOT}/src/cli.ts search "conflict" --topic marriage
# Filter by date range
npx tsx ${CLAUDE_PLUGIN_ROOT}/src/cli.ts search "quarterly review" --after 2024-01-01 --before 2024-04-01
# Combine multiple filters
npx tsx ${CLAUDE_PLUGIN_ROOT}/src/cli.ts search "proposal" --type report --after 2024-01-01
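To show how combined filters narrow a result set, here is a minimal sketch of a filter predicate. The `DocMeta` and `SearchFilters` shapes are hypothetical and do not reflect the actual index schema; the sketch also assumes ISO `YYYY-MM-DD` dates and treats `--after` and `--before` as exclusive bounds.

```typescript
// Hypothetical metadata shape, not the plugin's real schema.
interface DocMeta {
  path: string;
  collection?: string;
  type?: string;
  author?: string;
  topics?: string[];
  modified: string; // ISO date, e.g. "2024-02-15"
}

// One optional field per documented filter flag.
interface SearchFilters {
  collection?: string;
  type?: string;
  author?: string;
  topic?: string;
  after?: string;
  before?: string;
}

// A document passes only if it satisfies every filter that is set.
function matchesFilters(doc: DocMeta, f: SearchFilters): boolean {
  if (f.collection !== undefined && doc.collection !== f.collection) return false;
  if (f.type !== undefined && doc.type !== f.type) return false;
  if (f.author !== undefined && doc.author !== f.author) return false;
  if (f.topic !== undefined && !(doc.topics ?? []).includes(f.topic)) return false;
  // ISO dates in YYYY-MM-DD form sort correctly as plain strings.
  if (f.after !== undefined && !(doc.modified > f.after)) return false;
  if (f.before !== undefined && !(doc.modified < f.before)) return false;
  return true;
}

const exampleDocs: DocMeta[] = [
  { path: "reports/q1.md", type: "report", modified: "2024-02-15" },
  { path: "reports/old.md", type: "report", modified: "2023-11-30" },
  { path: "notes/a.md", type: "note", modified: "2024-03-01" },
];

// Mirrors: search "proposal" --type report --after 2024-01-01
// Only reports/q1.md satisfies both filters.
const kept = exampleDocs.filter((d) => matchesFilters(d, { type: "report", after: "2024-01-01" }));
console.log(kept.map((d) => d.path));
```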
Results include the match type: grep (literal match) or semantic (meaning-based).
Documents found by both grep AND semantic search receive boosted scores.
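One way to picture the merge step is the sketch below, which combines two hit lists by file and boosts any file found by both methods. Everything here is an assumption for illustration: the `Hit` shape, keeping the higher of the two scores, and the multiplicative boost of 1.5 are not taken from the plugin's source.

```typescript
type MatchSource = "grep" | "semantic";

interface Hit {
  file: string;
  score: number;
}

interface MergedHit extends Hit {
  sources: MatchSource[];
}

// Hypothetical merge: keep the best score per file, then multiply the score
// of files matched by both methods by `boost` (1.5 is an arbitrary choice).
function mergeHits(grepHits: Hit[], semanticHits: Hit[], boost = 1.5): MergedHit[] {
  const merged = new Map<string, MergedHit>();

  const add = (hit: Hit, source: MatchSource): void => {
    const existing = merged.get(hit.file);
    if (existing === undefined) {
      merged.set(hit.file, { file: hit.file, score: hit.score, sources: [source] });
      return;
    }
    existing.score = Math.max(existing.score, hit.score);
    if (!existing.sources.includes(source)) existing.sources.push(source);
  };

  grepHits.forEach((h) => add(h, "grep"));
  semanticHits.forEach((h) => add(h, "semantic"));

  const results = Array.from(merged.values());
  for (const r of results) {
    if (r.sources.length === 2) r.score *= boost; // found by both methods
  }
  // Highest score first.
  return results.sort((a, b) => b.score - a.score);
}

const mergedExample = mergeHits(
  [{ file: "a.md", score: 0.5 }, { file: "b.md", score: 0.9 }],
  [{ file: "a.md", score: 0.75 }],
);
// a.md is found by both methods, so 0.75 * 1.5 = 1.125 ranks it above b.md (0.9).
console.log(mergedExample);
```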
View all discovered collections, document types, authors, and topics:
npx tsx ${CLAUDE_PLUGIN_ROOT}/src/cli.ts taxonomy .
List indexed files matching filters (without searching):
npx tsx ${CLAUDE_PLUGIN_ROOT}/src/cli.ts files --collection "jane austin" --type novel
View index information:
npx tsx ${CLAUDE_PLUGIN_ROOT}/src/cli.ts stats .
The index is stored in .embeddings/index.json:
project/
├── .embeddings/
│ └── index.json # Unified index with chunks and metadata
├── report.pdf
├── report.pdf.txt # Pre-extracted text (for binary files)
└── notes.md
The index contains the document chunks and their metadata.
When a user asks "Find documents about the Q4 budget review", first check for .embeddings/index.json in the project, then run:
# Step 1: Index (if needed)
npx tsx ${CLAUDE_PLUGIN_ROOT}/src/cli.ts index .
# Step 2: Search
npx tsx ${CLAUDE_PLUGIN_ROOT}/src/cli.ts search "Q4 budget review"
Directly indexed (text readable):
.txt - Plain text
.md - Markdown
Requires text extraction first:
.pdf - PDF documents
.doc, .docx - Word documents
.ppt, .pptx - PowerPoint presentations
When searching a directory without an existing index, run the indexer first to create .embeddings/index.json. This ensures a seamless first-use experience.
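The index-before-search behaviour can be sketched as a small planner that returns the commands to run, adding the index step only when the index file is missing. `planSearch` and its parameters are hypothetical names for illustration; only the index location and the `index` and `search` subcommands come from this document.

```typescript
import { existsSync } from "node:fs";
import { join } from "node:path";

// Location of the unified index inside a project directory.
function indexFileFor(projectDir: string): string {
  return join(projectDir, ".embeddings", "index.json");
}

// Hypothetical planner: returns argv arrays to execute in order.
// `exists` is injectable so the decision can be checked without touching disk.
function planSearch(
  projectDir: string,
  query: string,
  pluginRoot: string,
  exists: (path: string) => boolean = existsSync,
): string[][] {
  const cli = join(pluginRoot, "src", "cli.ts");
  const steps: string[][] = [];
  if (!exists(indexFileFor(projectDir))) {
    // No index yet: build it before the first search.
    steps.push(["npx", "tsx", cli, "index", projectDir]);
  }
  steps.push(["npx", "tsx", cli, "search", query, projectDir]);
  return steps;
}

// Assumes CLAUDE_PLUGIN_ROOT is set in the environment; falls back to ".".
const pluginRoot = process.env.CLAUDE_PLUGIN_ROOT ?? ".";
console.log(planSearch(".", "Q4 budget review", pluginRoot));
```

Returning argument arrays rather than a single command string avoids shell-quoting issues with queries that contain spaces or quotes.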
npx tsx ${CLAUDE_PLUGIN_ROOT}/src/cli.ts index [path] [options]
Index documents in directory (default: current directory)
Options:
--strategy <type> Chunking: sliding (default), paragraph, sentence
--window <size> Window size in chars (default: 300)
--overlap <size> Overlap in chars (default: 100)
npx tsx ${CLAUDE_PLUGIN_ROOT}/src/cli.ts search <query> [path] [options]
Search indexed documents
Options:
--collection <name> Filter by collection
--type <type> Filter by document type
--author <name> Filter by author
--topic <topic> Filter by topic
--path <pattern> Filter by file path pattern
--after DATE Only documents modified after this date
--before DATE Only documents modified before this date
npx tsx ${CLAUDE_PLUGIN_ROOT}/src/cli.ts taxonomy [path]
Show discovered taxonomy (collections, types, authors, topics)
npx tsx ${CLAUDE_PLUGIN_ROOT}/src/cli.ts files [path] [options]
List files matching filters
npx tsx ${CLAUDE_PLUGIN_ROOT}/src/cli.ts stats [path]
Show index statistics