Hybrid search combining semantic and keyword retrieval for RAG pipelines: implement BM25 + dense vector search with fusion strategies for improved retrieval accuracy.
Install via:

```bash
npx claudepluginhub a5c-ai/babysitter
```
Implement hybrid search combining semantic vector retrieval with keyword-based BM25 search for improved RAG pipeline accuracy and recall.
Hybrid search addresses the limitations of pure semantic or pure keyword search: dense vector retrieval captures paraphrases and conceptual matches but can miss exact terms, identifiers, and rare keywords, while BM25 keyword retrieval handles exact matches but misses semantically related wording. Fusing both typically improves recall and precision. A straightforward way to combine the two is LangChain's `EnsembleRetriever`:
```python
from langchain_community.retrievers import BM25Retriever
from langchain_community.vectorstores import Chroma
from langchain.retrievers import EnsembleRetriever
from langchain_openai import OpenAIEmbeddings

# Create documents
docs = [...]  # Your document chunks

# Dense retriever (semantic)
embeddings = OpenAIEmbeddings()
vectorstore = Chroma.from_documents(docs, embeddings)
dense_retriever = vectorstore.as_retriever(search_kwargs={"k": 5})

# Sparse retriever (BM25)
bm25_retriever = BM25Retriever.from_documents(docs)
bm25_retriever.k = 5

# Hybrid ensemble
hybrid_retriever = EnsembleRetriever(
    retrievers=[bm25_retriever, dense_retriever],
    weights=[0.4, 0.6]  # Adjust based on use case
)

# Query
results = hybrid_retriever.invoke("How do I configure the system?")
```
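The hybrid retriever plugs directly into a RAG chain. A minimal sketch, assuming a recent LangChain release where `create_retrieval_chain` and `create_stuff_documents_chain` are available; the prompt text and model name are placeholders:

```python
from langchain_openai import ChatOpenAI
from langchain_core.prompts import ChatPromptTemplate
from langchain.chains import create_retrieval_chain
from langchain.chains.combine_documents import create_stuff_documents_chain

# Placeholder prompt; {context} is filled with the retrieved chunks.
prompt = ChatPromptTemplate.from_messages([
    ("system", "Answer using only the provided context:\n\n{context}"),
    ("human", "{input}"),
])

llm = ChatOpenAI(model="gpt-4o-mini")  # placeholder model; any chat model works
combine_docs = create_stuff_documents_chain(llm, prompt)
rag_chain = create_retrieval_chain(hybrid_retriever, combine_docs)

answer = rag_chain.invoke({"input": "How do I configure the system?"})
print(answer["answer"])
```

`EnsembleRetriever` fuses its retrievers with weighted reciprocal rank fusion under the hood; the standalone helper below makes that fusion step explicit and works with any set of ranked result lists: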
```python
def reciprocal_rank_fusion(results_lists: list, k: int = 60) -> list:
    """
    Combine multiple ranked lists using RRF.
    k is a constant (typically 60) for smoothing.
    """
    fused_scores = {}
    for results in results_lists:
        for rank, doc in enumerate(results):
            doc_id = doc.metadata.get("id", str(doc.page_content[:50]))
            if doc_id not in fused_scores:
                fused_scores[doc_id] = {"doc": doc, "score": 0}
            fused_scores[doc_id]["score"] += 1 / (k + rank + 1)

    # Sort by fused score
    sorted_docs = sorted(
        fused_scores.values(),
        key=lambda x: x["score"],
        reverse=True
    )
    return [item["doc"] for item in sorted_docs]

# Use with multiple retrievers
semantic_results = dense_retriever.invoke(query)
keyword_results = bm25_retriever.invoke(query)
hybrid_results = reciprocal_rank_fusion([semantic_results, keyword_results])
```
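RRF looks only at ranks. When the raw retriever scores are meaningful, weighted score fusion (the `'weighted'` strategy referenced in the task definition below) is an alternative. A minimal sketch, assuming each hit arrives as a `(doc_id, score)` pair and that min-max normalization is acceptable for your score distributions:

```python
def weighted_score_fusion(dense_hits, sparse_hits, dense_weight=0.6):
    """Fuse two scored result lists by normalizing scores to [0, 1]
    and taking a weighted sum. Each hit is a (doc_id, score) tuple."""
    def normalize(hits):
        if not hits:
            return {}
        scores = [s for _, s in hits]
        lo, hi = min(scores), max(scores)
        span = (hi - lo) or 1.0  # avoid division by zero when all scores are equal
        return {doc_id: (s - lo) / span for doc_id, s in hits}

    dense_norm = normalize(dense_hits)
    sparse_norm = normalize(sparse_hits)
    fused = {}
    for doc_id in set(dense_norm) | set(sparse_norm):
        fused[doc_id] = (dense_weight * dense_norm.get(doc_id, 0.0)
                         + (1 - dense_weight) * sparse_norm.get(doc_id, 0.0))
    return sorted(fused.items(), key=lambda kv: kv[1], reverse=True)
```

Some vector stores implement this fusion natively. Pinecone, for example, accepts a dense and a sparse vector in the same query: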
```python
from pinecone import Pinecone
from pinecone_text.sparse import BM25Encoder

# Initialize Pinecone
pc = Pinecone(api_key="your-api-key")
index = pc.Index("hybrid-index")

# Prepare sparse encoder
bm25 = BM25Encoder()
bm25.fit(corpus)  # Fit on your document corpus

def hybrid_query(query: str, alpha: float = 0.5, top_k: int = 10):
    """
    Query with hybrid search.
    alpha: weight for dense vectors (1 - alpha for sparse)
    """
    # Get dense embedding (reuses the OpenAIEmbeddings instance from above)
    dense_embedding = embeddings.embed_query(query)
    # Get sparse embedding
    sparse_embedding = bm25.encode_queries([query])[0]
    # Apply the alpha weighting to both representations
    dense_scaled = [v * alpha for v in dense_embedding]
    sparse_scaled = {
        "indices": sparse_embedding["indices"],
        "values": [v * (1 - alpha) for v in sparse_embedding["values"]],
    }
    # Hybrid query
    results = index.query(
        vector=dense_scaled,
        sparse_vector=sparse_scaled,
        top_k=top_k,
        include_metadata=True
    )
    return results
```
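Hybrid queries only return useful results if documents were indexed with both representations. A minimal upsert sketch, assuming the index was created with the dotproduct metric (which Pinecone requires for sparse-dense vectors) and reusing the `embeddings` and `bm25` objects above; the document list, batch size, and ID scheme are placeholders:

```python
def upsert_hybrid(docs: list[str], batch_size: int = 100):
    """Index documents with both dense and sparse vectors."""
    for start in range(0, len(docs), batch_size):
        batch = docs[start:start + batch_size]
        dense_vecs = embeddings.embed_documents(batch)   # dense embeddings
        sparse_vecs = bm25.encode_documents(batch)        # BM25 sparse vectors
        vectors = [
            {
                "id": f"doc-{start + i}",                 # placeholder ID scheme
                "values": dense_vecs[i],
                "sparse_values": sparse_vecs[i],
                "metadata": {"text": batch[i]},
            }
            for i in range(len(batch))
        ]
        index.upsert(vectors=vectors)
```

Weaviate goes a step further and performs the fusion server-side through a single `with_hybrid` clause: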
```python
import weaviate

client = weaviate.Client("http://localhost:8080")

def weaviate_hybrid_search(query: str, alpha: float = 0.5, limit: int = 10):
    """
    Weaviate native hybrid search.
    alpha: 0 = pure BM25, 1 = pure vector
    """
    result = (
        client.query
        .get("Document", ["content", "title", "metadata"])
        .with_hybrid(
            query=query,
            alpha=alpha,
            properties=["content", "title"]
        )
        .with_limit(limit)
        .do()
    )
    return result["data"]["Get"]["Document"]
```
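The `alpha` parameter is the main tuning knob for native hybrid search, and a small sweep over a handful of labeled queries is usually enough to pick a starting value. A minimal sketch, assuming you supply a `labeled_queries` list of (query, expected document title) pairs:

```python
def sweep_alpha(labeled_queries, alphas=(0.0, 0.25, 0.5, 0.75, 1.0), limit=5):
    """Report hit rate (expected title appears in the top results) per alpha."""
    for alpha in alphas:
        hits = 0
        for query, expected_title in labeled_queries:
            docs = weaviate_hybrid_search(query, alpha=alpha, limit=limit)
            if any(d.get("title") == expected_title for d in docs):
                hits += 1
        print(f"alpha={alpha:.2f}  hit_rate={hits / len(labeled_queries):.2f}")
```

The same setup steps (store connection, dense embeddings, sparse encoding, fusion, evaluation) can be packaged as a plugin task: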
```javascript
const ragHybridSearchTask = defineTask({
  name: 'rag-hybrid-search-setup',
  description: 'Configure hybrid search for RAG pipeline',
  inputs: {
    vectorStore: { type: 'string', required: true }, // 'pinecone', 'weaviate', 'chroma', etc.
    embeddingModel: { type: 'string', default: 'text-embedding-3-small' },
    bm25Params: { type: 'object', default: { k1: 1.5, b: 0.75 } },
    fusionStrategy: { type: 'string', default: 'rrf' }, // 'rrf', 'weighted', 'custom'
    denseWeight: { type: 'number', default: 0.6 },
    topK: { type: 'number', default: 10 }
  },
  outputs: {
    retrieverConfigured: { type: 'boolean' },
    indexStats: { type: 'object' },
    artifacts: { type: 'array' }
  },
  async run(inputs, taskCtx) {
    return {
      kind: 'skill',
      title: `Configure hybrid search with ${inputs.vectorStore}`,
      skill: {
        name: 'rag-hybrid-search',
        context: {
          vectorStore: inputs.vectorStore,
          embeddingModel: inputs.embeddingModel,
          bm25Params: inputs.bm25Params,
          fusionStrategy: inputs.fusionStrategy,
          denseWeight: inputs.denseWeight,
          topK: inputs.topK,
          instructions: [
            'Validate vector store connection and configuration',
            'Set up dense embedding pipeline',
            'Configure BM25/sparse encoding',
            'Implement fusion strategy',
            'Test retrieval quality with sample queries',
            'Document configuration and tuning parameters'
          ]
        }
      },
      io: {
        inputJsonPath: `tasks/${taskCtx.effectId}/input.json`,
        outputJsonPath: `tasks/${taskCtx.effectId}/result.json`
      }
    };
  }
});
```
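The `bm25Params` defaults correspond to the standard Okapi BM25 term-scoring formula, where `k1` controls term-frequency saturation and `b` controls document-length normalization. A minimal reference sketch of the per-term score, useful when tuning these values (assumes you already have the corpus statistics):

```python
import math

def bm25_term_score(tf, doc_len, avg_doc_len, df, n_docs, k1=1.5, b=0.75):
    """Okapi BM25 contribution of a single query term to one document.

    tf: term frequency in the document
    doc_len / avg_doc_len: document length and corpus average length
    df: number of documents containing the term; n_docs: corpus size
    k1: term-frequency saturation (higher = repeated terms keep adding score)
    b:  length normalization (0 = none, 1 = full)
    """
    idf = math.log(1 + (n_docs - df + 0.5) / (df + 0.5))
    norm = k1 * (1 - b + b * doc_len / avg_doc_len)
    return idf * tf * (k1 + 1) / (tf + norm)
```

Lowering `b` helps when chunk lengths are already uniform; raising `k1` lets repeated terms keep contributing to the score.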