RAG system builder for creating production-ready retrieval-augmented generation pipelines.
Builds production-ready RAG pipelines with document processing, vector storage, and retrieval optimization.
/plugin marketplace add pluginagentmarketplace/custom-plugin-ai-engineer
/plugin install pluginagentmarketplace-ai-engineer-plugin@pluginagentmarketplace/custom-plugin-ai-engineer

You are a RAG Systems specialist helping users build production-ready retrieval-augmented generation pipelines.
/rag-builder create a RAG system for documentation Q&A
/rag-builder set up Chroma with semantic chunking
/rag-builder add hybrid search with BM25
/rag-builder implement re-ranking with cross-encoder
/rag-builder evaluate RAG with RAGAS metrics
Basic RAG pipeline:

Documents → Chunking → Embedding → Vector Store
                                       ↓
Query → Embedding → Search → Top-K → LLM → Answer
Advanced RAG pipeline:

Query → Multi-Query Generation
          ↓
      Parallel Search
          ↓
      Re-ranking
          ↓
      Context Compression
          ↓
      LLM Generation
          ↓
      Citation Extraction
Agentic RAG pipeline:

Query → Router → [Simple RAG | Multi-hop | Summary]
          ↓
      Reflection Loop
          ↓
      Quality Check
          ↓
      Final Answer
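The router step above could be sketched with simple keyword heuristics (the routing rules and strategy names here are illustrative assumptions, not the plugin's actual logic):

```python
def route_query(query: str) -> str:
    """Pick a RAG strategy based on simple query heuristics (illustrative)."""
    q = query.lower()
    # Summary route: the user asks for a condensed view of many documents.
    if any(w in q for w in ("summarize", "overview", "tl;dr")):
        return "summary"
    # Multi-hop route: the question chains sub-questions or compares entities.
    if any(w in q for w in ("compare", " versus ", " vs ", " and then ")):
        return "multi-hop"
    # Default: a single retrieval pass is enough.
    return "simple-rag"
```

In production this classification is usually delegated to a small LLM call, but a heuristic router like this is a cheap first pass.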
import chromadb
from chromadb.utils import embedding_functions

# Persistent client stores the index on disk at ./db
client = chromadb.PersistentClient(path="./db")

# Local sentence-transformers model for embeddings
embedding_fn = embedding_functions.SentenceTransformerEmbeddingFunction(
    model_name="all-MiniLM-L6-v2"
)

collection = client.create_collection(
    name="documents",
    embedding_function=embedding_fn
)
chunk_config = {
    "strategy": "recursive",      # recursive character splitting
    "chunk_size": 1000,           # maximum characters per chunk
    "chunk_overlap": 200,         # characters shared between adjacent chunks
    "separators": ["\n\n", "\n", ". ", " "]  # tried coarsest-first
}
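A recursive splitter matching this config might work roughly as follows — try the coarsest separator first, fall back to finer ones, and carry a character overlap between chunks. This is a minimal illustrative sketch, not the plugin's actual implementation:

```python
def recursive_split(text, chunk_size=1000, chunk_overlap=200,
                    separators=("\n\n", "\n", ". ", " ")):
    """Split text on the coarsest separator present, keeping chunks
    under chunk_size and carrying chunk_overlap characters forward."""
    if len(text) <= chunk_size:
        return [text]
    for sep in separators:
        if sep not in text:
            continue  # fall back to the next, finer separator
        chunks, current = [], ""
        for part in text.split(sep):
            piece = part + sep
            if current and len(current) + len(piece) > chunk_size:
                chunks.append(current.strip())
                current = current[-chunk_overlap:]  # keep overlap as context
            current += piece
        if current.strip():
            chunks.append(current.strip())
        return chunks
    # No separator present at all: hard cut with overlap.
    step = max(1, chunk_size - chunk_overlap)
    return [text[i:i + chunk_size] for i in range(0, len(text), step)]
```

In practice you would use a battle-tested splitter (e.g. LangChain's `RecursiveCharacterTextSplitter` takes the same `chunk_size`/`chunk_overlap`/`separators` parameters), but the logic is essentially the above.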
retrieval_config = {
    "search_type": "hybrid",      # dense (vector) + sparse (BM25)
    "dense_weight": 0.7,
    "sparse_weight": 0.3,
    "top_k": 5,                   # candidates passed to the LLM
    "rerank": True,
    "rerank_model": "cross-encoder/ms-marco-MiniLM-L-6-v2"
}
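One common way to apply these weights is min-max-normalized score fusion: normalize dense and sparse scores to [0, 1] per query, then combine them per document. The function and the per-document score dicts below are illustrative assumptions; the plugin's actual fusion may differ:

```python
def hybrid_score(dense_scores, sparse_scores,
                 dense_weight=0.7, sparse_weight=0.3):
    """Fuse dense (vector) and sparse (BM25) scores keyed by document id."""
    def normalize(scores):
        # Min-max normalize so dense and BM25 scores share a [0, 1] scale.
        if not scores:
            return {}
        lo, hi = min(scores.values()), max(scores.values())
        span = (hi - lo) or 1.0
        return {d: (s - lo) / span for d, s in scores.items()}

    dense = normalize(dense_scores)
    sparse = normalize(sparse_scores)
    docs = set(dense) | set(sparse)
    # Documents missing from one retriever contribute 0 from that side.
    return {d: dense_weight * dense.get(d, 0.0) + sparse_weight * sparse.get(d, 0.0)
            for d in docs}
```

The fused scores would then be sorted, the `top_k` candidates kept, and (since `rerank` is enabled) re-scored by the cross-encoder before generation.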