From developer-kit-ai
Implements RAG pipelines with document chunking, embedding generation, vector storage, and retrieval. Use for Q&A systems over documents, chatbots with knowledge bases, and reducing AI hallucinations.
npx claudepluginhub giuseppe-trisciuoglio/developer-kit --plugin developer-kit-ai
Slash command: /developer-kit-ai:rag
Build Retrieval-Augmented Generation systems that extend AI capabilities with external knowledge sources.
This skill covers: document processing, embedding generation, vector storage, retrieval configuration, and RAG pipeline implementation.
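Document processing, the first of these stages, usually means splitting text into overlapping chunks before embedding. The sketch below is a minimal, library-free illustration. The class name `ChunkingSketch`, the method `chunk`, and the character-based sizes are assumptions made for this example, not part of this skill or of any library API; real pipelines often split on sentence or token boundaries instead.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: fixed-size character chunking with overlap.
final class ChunkingSketch {

    // Splits text into chunks of at most maxChars characters. Consecutive
    // chunks share `overlap` characters so context is not lost at a boundary.
    static List<String> chunk(String text, int maxChars, int overlap) {
        if (maxChars <= 0 || overlap < 0 || overlap >= maxChars) {
            throw new IllegalArgumentException("require 0 <= overlap < maxChars");
        }
        List<String> chunks = new ArrayList<>();
        int step = maxChars - overlap; // how far the window advances each time
        for (int start = 0; start < text.length(); start += step) {
            int end = Math.min(start + maxChars, text.length());
            chunks.add(text.substring(start, end));
            if (end == text.length()) {
                break; // the last chunk reached the end of the text
            }
        }
        return chunks;
    }
}
```

For example, `ChunkingSketch.chunk("abcdefghij", 4, 1)` yields the chunks `abcd`, `defg`, and `ghij`, each sharing one character with its neighbor.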
Select a vector database based on your requirements:

| Requirement | Recommended |
|---|---|
| Production scalability | Pinecone, Milvus |
| Open-source | Weaviate, Qdrant |
| Local development | Chroma, FAISS |
| Hybrid search | Weaviate with BM25 |

Select an embedding model based on your use case:

| Use Case | Model |
|---|---|
| General purpose | text-embedding-ada-002 |
| Fast and lightweight | all-MiniLM-L6-v2 |
| Multilingual | multilingual-e5-large |
| Best performance | bge-large-en-v1.5 |
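
Whichever model is chosen, semantic search works by comparing the query embedding with stored embeddings, most commonly through cosine similarity. The following is a minimal sketch that assumes embeddings are available as plain `float[]` arrays; `CosineSketch` and `cosine` are illustrative names, not a library API.

```java
// Hypothetical helper: cosine similarity between two embedding vectors.
final class CosineSketch {

    static double cosine(float[] a, float[] b) {
        if (a.length != b.length) {
            // Vectors from different models (or dimensions) are not comparable
            throw new IllegalArgumentException("dimension mismatch");
        }
        double dot = 0.0, normA = 0.0, normB = 0.0;
        for (int i = 0; i < a.length; i++) {
            dot += (double) a[i] * b[i];
            normA += (double) a[i] * a[i];
            normB += (double) b[i] * b[i];
        }
        if (normA == 0.0 || normB == 0.0) {
            return 0.0; // treat a zero vector as having no similarity
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }
}
```

The dimension check is the reason a store should only ever hold vectors from one embedding model: switching models generally requires re-embedding the corpus.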
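Validation: Verify that embeddings were generated successfully:

```java
import dev.langchain4j.data.embedding.Embedding;

// embedAll returns a Response wrapper; content() yields the embeddings
List<Embedding> embeddings = embeddingModel.embedAll(segments).content();
// expectedDim is the vector dimension your store was configured with
if (embeddings.isEmpty() || embeddings.get(0).dimension() != expectedDim) {
    throw new IllegalStateException("Embedding generation failed");
}
```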
Choose the appropriate strategy:
Validation: Test with known queries to verify context injection works correctly.
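
One way to run this check is a small smoke test that sends known queries through the retriever and confirms that an expected phrase appears in the returned context. The sketch below assumes the retriever can be adapted to a `Function<String, List<String>>`; `RetrievalSmokeTest` and `failingQueries` are hypothetical names used only for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

// Hypothetical smoke test for retrieval quality.
final class RetrievalSmokeTest {

    // Returns the queries whose retrieved context does NOT contain the
    // expected phrase (case-insensitive). An empty result means all passed.
    static List<String> failingQueries(Function<String, List<String>> retriever,
                                       Map<String, String> expectedPhraseByQuery) {
        List<String> failures = new ArrayList<>();
        for (Map.Entry<String, String> entry : expectedPhraseByQuery.entrySet()) {
            String expected = entry.getValue().toLowerCase();
            boolean found = retriever.apply(entry.getKey()).stream()
                    .anyMatch(chunk -> chunk.toLowerCase().contains(expected));
            if (!found) {
                failures.add(entry.getKey());
            }
        }
        return failures;
    }
}
```

A handful of query and phrase pairs drawn from documents you know are in the corpus is usually enough to catch a misconfigured store or an embedding model mismatch.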
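Error Handling: For batch ingestion, wrap each embedding call in retry logic:

```java
for (Document doc : documents) {
    // embed() accepts a TextSegment (or String), not a Document
    TextSegment segment = doc.toTextSegment();
    int attempts = 0;
    while (attempts < 3) {
        try {
            store.add(embeddingModel.embed(segment).content(), segment);
            break; // success, move on to the next document
        } catch (RuntimeException e) { // narrow this if your provider defines a specific type
            attempts++;
            if (attempts == 3) throw new RuntimeException("Failed after 3 retries", e);
        }
    }
}
```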
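
The fixed retry loop can be generalized into a helper that waits longer between attempts, which tends to help with rate-limited embedding APIs. This is a sketch under assumptions: `RetrySketch`, `withRetry`, and the delay values are hypothetical and not part of LangChain4j.

```java
import java.util.function.Supplier;

// Hypothetical helper: retry a task with exponential backoff.
final class RetrySketch {

    static <T> T withRetry(Supplier<T> task, int maxAttempts, long initialDelayMs) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        long delayMs = initialDelayMs;
        RuntimeException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return task.get();
            } catch (RuntimeException e) {
                last = e; // remember the failure and fall through to the backoff
            }
            if (attempt < maxAttempts) {
                try {
                    Thread.sleep(delayMs);
                } catch (InterruptedException ie) {
                    Thread.currentThread().interrupt(); // preserve interrupt status
                    throw new IllegalStateException("Interrupted while retrying", ie);
                }
                delayMs *= 2; // double the wait before the next attempt
            }
        }
        throw new IllegalStateException("Failed after " + maxAttempts + " attempts", last);
    }
}
```

A call such as `RetrySketch.withRetry(() -> embeddingModel.embed(segment).content(), 3, 500)` would then replace the inner `while` loop, assuming `embeddingModel` and `segment` are defined as in the ingestion code.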
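Basic document Q&A:

```java
// Load documents, ingest them into an in-memory store, and wire up an assistant.
// The static ingest(...) and from(...) shortcuts expect an embedding model
// to be discoverable on the classpath.
List<Document> documents = FileSystemDocumentLoader.loadDocuments("/docs");
InMemoryEmbeddingStore<TextSegment> store = new InMemoryEmbeddingStore<>();
EmbeddingStoreIngestor.ingest(documents, store);

DocumentAssistant assistant = AiServices.builder(DocumentAssistant.class)
        .chatModel(chatModel)
        .contentRetriever(EmbeddingStoreContentRetriever.from(store))
        .build();

String answer = assistant.answer("What is the company policy on remote work?");
```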
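
Retriever with a score threshold and metadata filter:

```java
// Return at most 5 segments scoring at least 0.7, restricted to "technical" documents
EmbeddingStoreContentRetriever retriever = EmbeddingStoreContentRetriever.builder()
        .embeddingStore(store)
        .embeddingModel(embeddingModel)
        .maxResults(5)
        .minScore(0.7)
        // metadataKey is a static import from MetadataFilterBuilder
        .filter(metadataKey("category").isEqualTo("technical"))
        .build();
```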
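
To make the effect of `maxResults` and `minScore` concrete, the sketch below applies the same two rules to a list of scored candidates: drop anything below the threshold, then keep the highest-scoring entries. `TopKSketch`, `Scored`, and `topK` are illustrative names, and this is not a description of how the library implements retrieval internally.

```java
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical illustration of score-threshold plus top-k selection.
final class TopKSketch {

    // A candidate text with its similarity score.
    static final class Scored {
        final String text;
        final double score;

        Scored(String text, double score) {
            this.text = text;
            this.score = score;
        }
    }

    static List<String> topK(List<Scored> candidates, int maxResults, double minScore) {
        return candidates.stream()
                .filter(c -> c.score >= minScore)                      // threshold first
                .sorted(Comparator.comparingDouble((Scored c) -> c.score).reversed())
                .limit(maxResults)                                     // then cap the count
                .map(c -> c.text)
                .collect(Collectors.toList());
    }
}
```

A high `minScore` can return fewer than `maxResults` items, or none at all, so callers should handle an empty context rather than assume a fixed count.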
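
Multi-source retrieval with re-ranking:

```java
ContentRetriever webRetriever = EmbeddingStoreContentRetriever.from(webStore);
ContentRetriever docRetriever = EmbeddingStoreContentRetriever.from(docStore);

// Collect candidates from both sources
List<Content> results = new ArrayList<>();
results.addAll(webRetriever.retrieve(query));
results.addAll(docRetriever.retrieve(query));

// reranker is an application-defined component; cap the slice so that
// fewer than 5 results do not cause an IndexOutOfBoundsException
List<Content> ranked = reranker.reorder(query, results);
List<Content> topResults = ranked.subList(0, Math.min(5, ranked.size()));
```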
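
When results from several retrievers are merged, duplicates and short result lists are both common. The sketch below removes duplicates by text while preserving first-seen order, then truncates safely. It treats results as plain strings for simplicity; `MergeSketch` and `mergeAndLimit` are hypothetical names.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper: merge result lists, drop duplicates, cap the size.
final class MergeSketch {

    @SafeVarargs
    static List<String> mergeAndLimit(int limit, List<String>... sources) {
        // LinkedHashSet keeps insertion order while discarding repeats
        Set<String> unique = new LinkedHashSet<>();
        for (List<String> source : sources) {
            unique.addAll(source);
        }
        List<String> merged = new ArrayList<>(unique);
        // Math.min guards against asking for more items than exist
        return new ArrayList<>(merged.subList(0, Math.min(limit, merged.size())));
    }
}
```

Deduplicating before re-ranking also avoids spending re-ranker calls on the same passage twice.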

Conversational RAG with chat memory:

```java
// Keep the last 10 messages so follow-up questions can refer to earlier turns
Assistant assistant = AiServices.builder(Assistant.class)
        .chatModel(chatModel)
        .chatMemory(MessageWindowChatMemory.withMaxMessages(10))
        .contentRetriever(retriever)
        .build();

assistant.chat("Tell me about the product features");
assistant.chat("What about pricing for those features?"); // Maintains context
```
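
A message-window memory bounds the prompt size by keeping only the most recent messages. The sketch below illustrates just that eviction idea with plain strings; `WindowMemorySketch` is a hypothetical class, and the library's own memory implementation may behave differently in ways not modeled here.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Hypothetical illustration of a sliding window over chat messages.
final class WindowMemorySketch {

    private final int maxMessages;
    private final Deque<String> messages = new ArrayDeque<>();

    WindowMemorySketch(int maxMessages) {
        if (maxMessages < 1) {
            throw new IllegalArgumentException("maxMessages must be >= 1");
        }
        this.maxMessages = maxMessages;
    }

    // Appends a message and evicts the oldest ones beyond the window size.
    void add(String message) {
        messages.addLast(message);
        while (messages.size() > maxMessages) {
            messages.removeFirst();
        }
    }

    // Returns a copy of the retained messages, oldest first.
    List<String> messages() {
        return new ArrayList<>(messages);
    }
}
```

The window size is a trade-off: a larger window preserves more conversational context but leaves less room in the prompt for retrieved documents.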