Implements RAG pipelines with document chunking, embedding generation, vector storage, and retrieval. Use for Q&A systems over documents, chatbots with knowledge bases, and reducing AI hallucinations.
From the developer-kit plugin (giuseppe-trisciuoglio/developer-kit).

Bundled resources:

- assets/retriever-pipeline.java
- assets/vector-store-config.yaml
- references/document-chunking.md
- references/embedding-models.md
- references/langchain4j-rag-guide.md
- references/retrieval-strategies.md
- references/vector-databases.md
Build Retrieval-Augmented Generation systems that extend AI capabilities with external knowledge sources.
This skill covers: document processing, embedding generation, vector storage, retrieval configuration, and RAG pipeline implementation.
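As a sketch of the chunking stage, assuming LangChain4j's `DocumentSplitters` utility (the segment and overlap sizes are illustrative, not recommendations):

```java
import dev.langchain4j.data.document.Document;
import dev.langchain4j.data.document.DocumentSplitter;
import dev.langchain4j.data.document.splitter.DocumentSplitters;
import dev.langchain4j.data.segment.TextSegment;
import java.util.List;

public class ChunkingSketch {

    // Split raw text into overlapping segments suitable for embedding
    static List<TextSegment> split(String text) {
        // Recursive splitter: ~300-character segments with 30-character overlap (tune per corpus)
        DocumentSplitter splitter = DocumentSplitters.recursive(300, 30);
        return splitter.split(Document.from(text));
    }

    public static void main(String[] args) {
        List<TextSegment> segments = split("Remote work policy. Employees may work remotely up to three days per week, subject to manager approval.");
        System.out.println(segments.size() + " segment(s) produced");
    }
}
```

Smaller segments improve retrieval precision; larger ones preserve more context per hit. The overlap keeps sentences that straddle a boundary retrievable from either side.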
Select based on your requirements:
| Requirement | Recommended |
|---|---|
| Production scalability | Pinecone, Milvus |
| Open-source | Weaviate, Qdrant |
| Local development | Chroma, FAISS |
| Hybrid search | Weaviate with BM25 |
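The in-memory store used in the examples below can be swapped for one of the databases above without changing the rest of the pipeline. A configuration sketch assuming the langchain4j-chroma module and a locally running Chroma server (the URL and collection name are placeholders):

```java
import dev.langchain4j.data.segment.TextSegment;
import dev.langchain4j.store.embedding.EmbeddingStore;
import dev.langchain4j.store.embedding.chroma.ChromaEmbeddingStore;

public class StoreConfigSketch {
    public static void main(String[] args) {
        // Requires a Chroma instance listening on the given URL (default port 8000)
        EmbeddingStore<TextSegment> store = ChromaEmbeddingStore.builder()
                .baseUrl("http://localhost:8000")  // placeholder URL
                .collectionName("rag-docs")        // placeholder collection name
                .build();
        // Ingestor and retriever code below works against this store unchanged
    }
}
```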
Select an embedding model for your use case:

| Use Case | Model |
|---|---|
| General purpose | text-embedding-ada-002 |
| Fast and lightweight | all-MiniLM-L6-v2 |
| Multilingual | e5-large-v2 |
| Best performance | bge-large-en-v1.5 |
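To make the table concrete, a sketch using the local all-MiniLM-L6-v2 model via LangChain4j's ONNX embeddings module (the package path has moved between versions, so treat the import as an assumption for your version):

```java
import dev.langchain4j.data.embedding.Embedding;
import dev.langchain4j.model.embedding.EmbeddingModel;
import dev.langchain4j.model.embedding.onnx.allminilml6v2.AllMiniLmL6V2EmbeddingModel;

public class EmbeddingSketch {
    public static void main(String[] args) {
        // Runs fully locally; no API key required
        EmbeddingModel model = new AllMiniLmL6V2EmbeddingModel();
        Embedding embedding = model.embed("What is the remote work policy?").content();
        // all-MiniLM-L6-v2 produces 384-dimensional vectors
        System.out.println("dimension = " + embedding.dimension());
    }
}
```

Whichever model you pick, use the same model for ingestion and querying; vectors from different models are not comparable.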
Validation: Verify embeddings were generated successfully:

```java
List<Embedding> embeddings = embeddingModel.embedAll(segments);
if (embeddings.isEmpty() || embeddings.get(0).dimension() != expectedDim) {
    throw new IllegalStateException("Embedding generation failed");
}
```
Choose the retrieval strategy appropriate to your use case; see references/retrieval-strategies.md for the trade-offs.
Validation: Test with known queries to verify context injection works correctly.
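One such strategy is query compression for multi-turn conversations: rewrite a follow-up question into a standalone query before retrieval. A sketch assuming LangChain4j's `DefaultRetrievalAugmentor` and `CompressingQueryTransformer` (`chatModel` and `retriever` are assumed to come from the surrounding setup):

```java
import dev.langchain4j.rag.DefaultRetrievalAugmentor;
import dev.langchain4j.rag.RetrievalAugmentor;
import dev.langchain4j.rag.query.transformer.CompressingQueryTransformer;

// Uses the chat model to rewrite follow-ups ("what about pricing?") into
// standalone queries before they hit the vector store
RetrievalAugmentor augmentor = DefaultRetrievalAugmentor.builder()
        .queryTransformer(new CompressingQueryTransformer(chatModel))
        .contentRetriever(retriever)
        .build();

// Wire it into the assistant in place of a bare contentRetriever:
// AiServices.builder(Assistant.class).retrievalAugmentor(augmentor)...
```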
Error Handling: For batch ingestion, wrap each document in retry logic:

```java
for (Document doc : documents) {
    TextSegment segment = doc.toTextSegment();
    int attempts = 0;
    while (attempts < 3) {
        try {
            // Embed the segment and store vector + text together
            store.add(embeddingModel.embed(segment).content(), segment);
            break;
        } catch (RuntimeException e) { // transient embedding/store failures
            attempts++;
            if (attempts == 3) throw new RuntimeException("Failed after 3 retries", e);
        }
    }
}
```
```java
// Minimal assistant interface; LangChain4j generates the implementation via AiServices
interface DocumentAssistant {
    String answer(String question);
}

// Load, embed, and store the documents, then wire up a retrieval-backed assistant
List<Document> documents = FileSystemDocumentLoader.loadDocuments("/docs");
InMemoryEmbeddingStore<TextSegment> store = new InMemoryEmbeddingStore<>();
EmbeddingStoreIngestor.ingest(documents, store);

DocumentAssistant assistant = AiServices.builder(DocumentAssistant.class)
        .chatModel(chatModel)
        .contentRetriever(EmbeddingStoreContentRetriever.from(store))
        .build();

String answer = assistant.answer("What is the company policy on remote work?");
```
```java
// Requires: import static dev.langchain4j.store.embedding.filter.MetadataFilterBuilder.metadataKey;
EmbeddingStoreContentRetriever retriever = EmbeddingStoreContentRetriever.builder()
        .embeddingStore(store)
        .embeddingModel(embeddingModel)
        .maxResults(5)   // return at most 5 segments per query
        .minScore(0.7)   // drop low-similarity matches
        .filter(metadataKey("category").isEqualTo("technical"))
        .build();
```
```java
// Query two stores and merge the candidates before reranking
ContentRetriever webRetriever = EmbeddingStoreContentRetriever.from(webStore);
ContentRetriever docRetriever = EmbeddingStoreContentRetriever.from(docStore);

List<Content> results = new ArrayList<>();
results.addAll(webRetriever.retrieve(query));
results.addAll(docRetriever.retrieve(query));

// Rerank against the query and keep the top results (guard against fewer than 5 hits)
List<Content> reranked = reranker.reorder(query, results);
List<Content> topResults = reranked.subList(0, Math.min(5, reranked.size()));
```
```java
// Combine chat memory with retrieval for multi-turn, context-aware answers
Assistant assistant = AiServices.builder(Assistant.class)
        .chatModel(chatModel)
        .chatMemory(MessageWindowChatMemory.withMaxMessages(10))
        .contentRetriever(retriever)
        .build();

assistant.chat("Tell me about the product features");
assistant.chat("What about pricing for those features?"); // follow-up resolves against memory + retrieved context
```