Skill

rag-design

Design a RAG architecture for a use case

From systems-design
Install

Run in your terminal:

$ npx claudepluginhub melodic-software/claude-code-plugins --plugin systems-design

Tool Access

This skill is limited to using the following tools:

Read, Glob, Grep, Skill, Task

Skill Content

Design RAG Architecture

Design a Retrieval-Augmented Generation system for a given use case.

Arguments

$ARGUMENTS - The RAG use case to design for (e.g., "customer support chatbot", "documentation Q&A", "legal document search", "code assistant")

Workflow

  1. Clarify requirements by understanding:

    • What type of questions will be asked?
    • What is the document corpus size and type?
    • What is the required accuracy/faithfulness?
    • What is the latency budget?
    • Are there multi-turn conversation requirements?
  2. Load relevant skills based on the use case:

    • RAG patterns → rag-architecture
    • Vector store selection → vector-databases
    • LLM serving → llm-serving-patterns
    • Inference optimization → ml-inference-optimization
  3. Spawn the rag-architect agent for comprehensive design:

    • Use the Task tool with subagent_type="rag-architect"
    • Provide full use case context and requirements
    • Request end-to-end RAG architecture
  4. Design the ingestion pipeline:

    • Document extraction (PDF, HTML, code)
    • Chunking strategy selection
    • Embedding model selection
    • Vector database configuration
    • Metadata extraction and indexing
  5. Design the retrieval pipeline:

    • Query processing (expansion, HyDE)
    • Retrieval strategy (dense, sparse, hybrid)
    • Reranking approach
    • Context assembly
    • Prompt engineering
  6. Address quality and scale:

    • Retrieval accuracy (recall@k, MRR)
    • Answer faithfulness (grounding)
    • Latency budget allocation
    • Cost optimization
    • Scaling strategy
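
The retrieval accuracy metrics named in step 6 (recall@k, MRR) can be computed with a few lines of code. The following is a minimal sketch, assuming each query has a ranked list of retrieved document IDs and a hand-labeled set of relevant IDs; the function names and the ID format are illustrative and are not part of the skill.

```python
from typing import Iterable, Sequence, Set, Tuple


def recall_at_k(retrieved: Sequence[str], relevant: Set[str], k: int) -> float:
    """Fraction of the relevant documents that appear in the top-k results."""
    if not relevant:
        return 0.0
    hits = len(set(retrieved[:k]) & relevant)
    return hits / len(relevant)


def reciprocal_rank(retrieved: Sequence[str], relevant: Set[str]) -> float:
    """1 / rank of the first relevant result, or 0.0 if none was retrieved."""
    for rank, doc_id in enumerate(retrieved, start=1):
        if doc_id in relevant:
            return 1.0 / rank
    return 0.0


def mean_reciprocal_rank(runs: Iterable[Tuple[Sequence[str], Set[str]]]) -> float:
    """Average reciprocal rank over (retrieved, relevant) pairs, one per query."""
    runs = list(runs)
    if not runs:
        return 0.0
    return sum(reciprocal_rank(r, rel) for r, rel in runs) / len(runs)
```

Running these over a fixed evaluation set before and after a change (for example, adding a reranker) gives a like-for-like comparison of retrieval quality.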

Example Usage

/sd:rag-design customer support chatbot with 10K FAQ documents
/sd:rag-design internal documentation Q&A for engineering team
/sd:rag-design legal document search for contract review
/sd:rag-design code assistant for enterprise codebase
/sd:rag-design research paper Q&A with 100K papers
/sd:rag-design product catalog search with structured data
/sd:rag-design multi-lingual knowledge base

Use Case Categories

| Category | Key Considerations |
|----------|--------------------|
| Customer Support | FAQ coverage, escalation, tone consistency |
| Documentation | Technical accuracy, code examples, versioning |
| Legal/Compliance | Citation accuracy, audit trails, access control |
| Code Assistance | AST-aware chunking, context relevance, IDE integration |
| Research/Academic | Multi-document reasoning, citation, long-form answers |
| E-commerce | Product attributes, inventory awareness, personalization |
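
The "AST-aware chunking" consideration for Code Assistance can be illustrated with Python's standard `ast` module. This is a minimal sketch, assuming Python source files and one chunk per top-level function or class; the `chunk_python_source` name and the chunk dictionary layout are hypothetical, and a real code assistant would likely need a multi-language parser.

```python
import ast
from typing import Dict, List


def chunk_python_source(source: str) -> List[Dict[str, object]]:
    """Split a Python module into one chunk per top-level function or class.

    Splitting on syntax boundaries keeps each definition intact, unlike
    fixed-size windows that may cut a function in half.
    Limitation of this sketch: decorators and module-level statements
    (imports, constants) are not included in any chunk.
    """
    tree = ast.parse(source)
    chunks: List[Dict[str, object]] = []
    for node in tree.body:
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)):
            chunks.append({
                "name": node.name,
                "kind": type(node).__name__,
                "start_line": node.lineno,
                "end_line": node.end_lineno,
                # Exact source text of the definition (Python 3.8+).
                "text": ast.get_source_segment(source, node),
            })
    return chunks
```

The line numbers stored with each chunk can serve as metadata for pointing an answer back to its location in the file.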

RAG Pattern Selection Guide

| Complexity | Pattern | When to Use |
|------------|---------|-------------|
| Low | Basic RAG | Simple Q&A, small corpus |
| Medium | RAG + Reranking | Higher accuracy needed |
| Medium | Hybrid Search | Mixed keyword + semantic queries |
| High | Query-Transformed | Vague or complex queries |
| High | Agentic RAG | Multi-hop reasoning, tool use |
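
The selection guide can be read as a small decision procedure. The sketch below is one possible encoding, not the skill's own logic: the `UseCaseTraits` fields are hypothetical names for the "When to Use" conditions, and checking the higher-complexity patterns first is an assumption about how to break ties when several conditions hold.

```python
from dataclasses import dataclass


@dataclass
class UseCaseTraits:
    """Hypothetical flags mirroring the 'When to Use' column."""
    multi_hop_or_tool_use: bool = False
    vague_or_complex_queries: bool = False
    mixed_keyword_semantic: bool = False
    higher_accuracy_needed: bool = False


def suggest_pattern(traits: UseCaseTraits) -> str:
    """Return the first matching pattern, checking high complexity first."""
    if traits.multi_hop_or_tool_use:
        return "Agentic RAG"
    if traits.vague_or_complex_queries:
        return "Query-Transformed"
    if traits.mixed_keyword_semantic:
        return "Hybrid Search"
    if traits.higher_accuracy_needed:
        return "RAG + Reranking"
    # Simple Q&A over a small corpus.
    return "Basic RAG"
```

In practice the patterns combine (for instance, hybrid search followed by reranking), so a single label is best treated as a starting point for the design discussion.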

Output

A comprehensive RAG system architecture including:

  • Ingestion pipeline (documents → vectors)
  • Retrieval pipeline (query → context)
  • Technology stack (embedding model, vector DB, LLM)
  • Quality targets (recall, faithfulness, latency)
  • Trade-offs and alternatives
  • Cost estimate (per-query and monthly)
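
The per-query and monthly cost estimate is simple arithmetic once token counts and prices are known. The following is a rough sketch under stated assumptions: all prices, token counts, and traffic figures are placeholder inputs supplied by the caller, not real vendor rates, and the function names are illustrative.

```python
def per_query_cost(
    prompt_tokens: int,
    completion_tokens: int,
    input_price_per_1k: float,
    output_price_per_1k: float,
    retrieval_cost: float = 0.0,
) -> float:
    """LLM token cost plus any per-query retrieval cost (embedding, reranking)."""
    llm_cost = (
        prompt_tokens / 1000 * input_price_per_1k
        + completion_tokens / 1000 * output_price_per_1k
    )
    return llm_cost + retrieval_cost


def monthly_cost(
    cost_per_query: float,
    queries_per_day: int,
    days: int = 30,
    fixed_monthly: float = 0.0,
) -> float:
    """Variable query cost over the month plus fixed costs (e.g. vector DB hosting)."""
    return cost_per_query * queries_per_day * days + fixed_monthly
```

Because retrieved context dominates the prompt, `prompt_tokens` grows with the number and size of chunks passed to the LLM, which ties the cost estimate directly to the chunking and context assembly choices.
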

Stats

Parent Repo Stars: 40
Parent Repo Forks: 6
Last Commit: Feb 15, 2026