From llm-application-dev
Builds Retrieval-Augmented Generation (RAG) systems using vector databases, embeddings, retrieval strategies, and reranking. Use for document Q&A, knowledge-grounded chatbots, or semantic search over proprietary data.
npx claudepluginhub wshobson/agents --plugin llm-application-dev
Slash command: /llm-application-dev:rag-implementation
Master Retrieval-Augmented Generation (RAG) to build LLM applications that provide accurate, grounded responses using external knowledge sources.
Build RAG systems for LLM apps using vector databases, embeddings, and retrieval strategies. Use for document Q&A, grounded chatbots, and semantic search.
<!-- AUTO-GENERATED by export-plugins.py — DO NOT EDIT -->
RAG (Retrieval-Augmented Generation) implementation patterns, including document chunking, embedding generation, vector database integration, semantic search, and RAG pipelines. Use when building RAG systems, implementing semantic search, creating knowledge bases, or when the user mentions RAG, embeddings, vector databases, retrieval, document chunking, or knowledge retrieval.
Vector database purpose: Store and retrieve document embeddings efficiently
Options:
Embedding purpose: Convert text to numerical vectors for similarity search
Models (2026):
| Model | Dimensions | Best For |
|---|---|---|
| voyage-3-large | 1024 | Claude apps (Anthropic recommended) |
| voyage-code-3 | 1024 | Code search |
| text-embedding-3-large | 3072 | OpenAI apps, high accuracy |
| text-embedding-3-small | 1536 | OpenAI apps, cost-effective |
| bge-large-en-v1.5 | 1024 | Open source, local deployment |
| multilingual-e5-large | 1024 | Multi-language support |
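To make the similarity-search idea concrete, here is a minimal, dependency-free sketch of how embedding vectors can be compared. The three-dimensional vectors and the `top_k` helper are hypothetical stand-ins chosen for illustration: the models in the table above return 1024 to 3072 dimensions, and a vector database would normally replace the linear scan shown here.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors (1.0 means same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # convention chosen for this sketch: zero vectors match nothing
    return dot / (norm_a * norm_b)


def top_k(query: list[float], corpus: dict[str, list[float]], k: int = 2) -> list[tuple[str, float]]:
    """Brute-force search: score every document, keep the k highest scores."""
    scored = [(doc_id, cosine_similarity(query, vec)) for doc_id, vec in corpus.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:k]


# Toy "embeddings" (hypothetical values, not produced by any real model)
corpus = {
    "pricing": [0.9, 0.1, 0.0],
    "install": [0.1, 0.9, 0.1],
    "billing": [0.8, 0.2, 0.1],
}
query_vec = [1.0, 0.0, 0.0]
results = top_k(query_vec, corpus, k=2)
print([doc_id for doc_id, _ in results])
```

The query vector points along the first axis, so the two documents whose vectors lean the same way rank ahead of the one that does not.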
Approaches:
Reranking purpose: Improve retrieval quality by reordering results
Methods:
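One way to reorder results, shown here only as an illustration because this section does not name a specific method, is Maximal Marginal Relevance (MMR): each pick balances relevance to the query against similarity to results already chosen. The sketch below is pure Python; the `mmr_rerank` helper, the document IDs, and the two-dimensional vectors are all hypothetical.

```python
import math


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity; assumes neither vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))


def mmr_rerank(
    query_vec: list[float],
    candidates: list[tuple[str, list[float]]],
    k: int = 3,
    lambda_mult: float = 0.5,
) -> list[str]:
    """Return up to k doc IDs in MMR order.

    lambda_mult=1.0 ranks purely by relevance; lower values favor diversity.
    """
    remaining = list(candidates)
    selected: list[tuple[str, list[float]]] = []
    while remaining and len(selected) < k:

        def mmr_score(item: tuple[str, list[float]]) -> float:
            _, vec = item
            relevance = cosine(query_vec, vec)
            # Redundancy = similarity to the closest already-selected document
            redundancy = max((cosine(vec, s) for _, s in selected), default=0.0)
            return lambda_mult * relevance - (1 - lambda_mult) * redundancy

        best = max(remaining, key=mmr_score)
        selected.append(best)
        remaining.remove(best)
    return [doc_id for doc_id, _ in selected]


# Hypothetical candidates: "refunds-copy" is a near-duplicate of "refunds"
candidates = [
    ("refunds", [1.0, 0.9]),
    ("refunds-copy", [1.0, 0.8]),
    ("shipping", [0.2, 1.0]),
]
query_vec = [1.0, 1.0]
diverse_order = mmr_rerank(query_vec, candidates, k=3, lambda_mult=0.5)
relevance_order = mmr_rerank(query_vec, candidates, k=3, lambda_mult=1.0)
print(diverse_order, relevance_order)
```

With `lambda_mult=0.5` the near-duplicate is pushed behind the less similar document, whereas `lambda_mult=1.0` reproduces the plain relevance ranking.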
from langgraph.graph import StateGraph, START, END
from langchain_anthropic import ChatAnthropic
from langchain_voyageai import VoyageAIEmbeddings
from langchain_pinecone import PineconeVectorStore
from langchain_core.documents import Document
from langchain_core.prompts import ChatPromptTemplate
from langchain_text_splitters import RecursiveCharacterTextSplitter
from typing import TypedDict


class RAGState(TypedDict):
    """State passed between graph nodes."""
    question: str
    context: list[Document]
    answer: str
# Initialize components
llm = ChatAnthropic(model="claude-sonnet-4-6")
embeddings = VoyageAIEmbeddings(model="voyage-3-large")
vectorstore = PineconeVectorStore(index_name="docs", embedding=embeddings)
retriever = vectorstore.as_retriever(search_kwargs={"k": 4})
# RAG prompt
rag_prompt = ChatPromptTemplate.from_template(
"""Answer based on the context below. If you cannot answer, say so.
Context:
{context}
Question: {question}
Answer:"""
)
async def retrieve(state: RAGState) -> dict:
    """Retrieve relevant documents and return a partial state update."""
    docs = await retriever.ainvoke(state["question"])
    return {"context": docs}


async def generate(state: RAGState) -> dict:
    """Generate an answer from the retrieved context."""
    context_text = "\n\n".join(doc.page_content for doc in state["context"])
    messages = rag_prompt.format_messages(
        context=context_text,
        question=state["question"],
    )
    response = await llm.ainvoke(messages)
    return {"answer": response.content}
# Build RAG graph
builder = StateGraph(RAGState)
builder.add_node("retrieve", retrieve)
builder.add_node("generate", generate)
builder.add_edge(START, "retrieve")
builder.add_edge("retrieve", "generate")
builder.add_edge("generate", END)
rag_chain = builder.compile()
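# Use: top-level `await` only works in a notebook or async REPL,
# so a script needs an entry point run with asyncio.run().
import asyncio


async def main() -> None:
    result = await rag_chain.ainvoke({"question": "What are the main features?"})
    print(result["answer"])


asyncio.run(main())  # in a notebook, call `await main()` instead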
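The pipeline above assumes the `docs` index is already populated; it imports `RecursiveCharacterTextSplitter` but never calls it. As a rough illustration of the chunking step that would precede indexing, here is a dependency-free sketch of fixed-size character windows with overlap. The `chunk_text` helper is hypothetical, and it is a deliberate simplification rather than the LangChain splitter's algorithm, which also tries to break on separators such as paragraphs and sentences.

```python
def chunk_text(text: str, chunk_size: int = 200, overlap: int = 40) -> list[str]:
    """Split text into windows of chunk_size characters that share `overlap` characters."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be in [0, chunk_size)")

    step = chunk_size - overlap  # how far each window advances
    chunks: list[str] = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last window already reached the end of the text
    return chunks


sample_chunks = chunk_text("abcdefghij", chunk_size=4, overlap=2)
print(sample_chunks)
```

The overlap means a sentence cut at one chunk boundary still appears intact in the neighboring chunk, at the cost of storing some text twice.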
Detailed pattern documentation lives in references/details.md. Read that file when the navigation tier above is insufficient.