From antigravity-awesome-skills
Builds and optimizes Retrieval-Augmented Generation (RAG) systems, covering embedding models, vector databases, chunking strategies, retrieval, and hybrid search for LLM apps.
npx claudepluginhub sickn33/antigravity-awesome-skills
This skill uses the workspace's default tool permissions.
Designs and implements production-grade RAG systems: chunking documents, generating embeddings, configuring vector stores, hybrid search pipelines, reranking, and retrieval evaluation. For vector databases and semantic search in knowledge-grounded AI apps.
Expert in building Retrieval-Augmented Generation systems. Masters embedding models, vector databases, chunking strategies, and retrieval optimization for LLM applications.
Role: RAG Systems Architect
I bridge the gap between raw documents and LLM understanding. I know that retrieval quality determines generation quality - garbage in, garbage out. I obsess over chunking boundaries, embedding dimensions, and similarity metrics because they make the difference between a helpful answer and a hallucinated one.
Chunk by meaning, not arbitrary token counts
When to use: Processing documents with natural sections
Multi-level retrieval for better precision
When to use: Large document collections with varied granularity
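One concrete form is small-to-big retrieval, sketched below against a hypothetical child-chunk index: match the query against fine-grained child chunks for precision, then hand the model their larger parent sections for context.

```python
def small_to_big_search(query_embedding, child_index, parents, k: int = 5):
    """Search small child chunks; return their larger parent sections.

    `child_index.search` (returning hits that carry a .parent_id) and
    `parents` (a dict of parent_id -> full section text) are hypothetical
    stand-ins for your vector store and document store."""
    hits = child_index.search(query_embedding, k=k)
    seen, sections = set(), []
    for hit in hits:
        if hit.parent_id not in seen:  # collapse sibling hits into one section
            seen.add(hit.parent_id)
            sections.append(parents[hit.parent_id])
    return sections
```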
Combine semantic and keyword search
When to use: Queries may be keyword-heavy or semantic
Expand queries to improve recall
When to use: User queries are short or ambiguous
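A minimal sketch, assuming a hypothetical `llm_complete` call that returns one completion string: generate paraphrases of the query, retrieve with all of them, and merge the result lists.

```python
def expand_query(query: str, llm_complete) -> list[str]:
    # Ask the LLM for paraphrases; short or ambiguous queries gain the
    # most from the extra phrasings.
    prompt = ("Rewrite the search query below in 3 different ways, "
              f"one per line, preserving its meaning:\n\n{query}")
    variants = [line.strip() for line in llm_complete(prompt).splitlines()
                if line.strip()]
    return [query] + variants  # always keep the original query too
```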
Compress retrieved context to fit window
When to use: Retrieved chunks exceed context limits
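A minimal extractive sketch, again assuming a hypothetical `llm_complete` call: keep only the sentences that bear on the query before building the final prompt.

```python
def compress_context(query: str, chunks: list[str], llm_complete) -> str:
    compressed = []
    for chunk in chunks:
        prompt = ("Extract only the sentences from the passage that help "
                  "answer the question. Reply NONE if nothing is relevant.\n\n"
                  f"Question: {query}\n\nPassage:\n{chunk}")
        extract = llm_complete(prompt).strip()
        if extract and extract.upper() != "NONE":
            compressed.append(extract)
    return "\n\n".join(compressed)  # fits the window, keeps the signal
```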
Pre-filter by metadata before semantic search
When to use: Documents have structured metadata
Severity: HIGH
Situation: Using fixed token/character limits for chunking
Symptoms: retrieved chunks begin or end mid-sentence; answers quote fragments that lack surrounding context.
Why this breaks: Fixed-size chunks split mid-sentence, mid-paragraph, or mid-idea. The resulting embeddings represent incomplete thoughts, leading to poor retrieval quality. Users search for concepts but get fragments.
Recommended fix:
Use semantic chunking that respects document structure:
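A minimal sketch in plain Python, assuming prose where blank lines separate paragraphs; the 1500-character cap is an illustrative default to tune, not a recommendation:

```python
import re

def semantic_chunks(text: str, max_chars: int = 1500) -> list[str]:
    """Chunk on paragraph boundaries instead of at a fixed offset."""
    # Blank lines delimit complete paragraphs, so no unit starts mid-idea.
    paragraphs = [p.strip() for p in re.split(r"\n\s*\n", text) if p.strip()]
    chunks, current = [], ""
    for para in paragraphs:
        # Roll over to a new chunk only at a paragraph boundary.
        if current and len(current) + len(para) + 2 > max_chars:
            chunks.append(current)
            current = para
        else:
            current = f"{current}\n\n{para}" if current else para
    if current:
        chunks.append(current)
    return chunks
```

The same idea extends to headings, code fences, or list boundaries: split on the strongest structural marker the document offers, then merge upward to size.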
Severity: MEDIUM
Situation: Only using vector similarity, ignoring metadata
Symptoms: outdated or wrong-category documents outrank relevant ones; queries for recent content surface old results.
Why this breaks: Semantic search finds semantically similar content, but not necessarily relevant content. Without metadata filtering, you return old documents when the user wants recent ones, the wrong categories, or content that doesn't apply.
Recommended fix:
Implement hybrid filtering:
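A sketch of the shape this takes; `index.search` and the Mongo-style filter syntax are placeholders, but most vector stores expose an equivalent where/filter clause:

```python
from datetime import datetime, timedelta

def filtered_search(index, query_embedding, category: str,
                    max_age_days: int = 365, k: int = 10):
    # Structured constraints first: wrong-category and stale documents
    # are excluded before any similarity ranking happens.
    cutoff = (datetime.now() - timedelta(days=max_age_days)).isoformat()
    metadata_filter = {
        "category": category,
        "updated_at": {"$gte": cutoff},
    }
    return index.search(query_embedding, filter=metadata_filter, k=k)
```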
Severity: MEDIUM
Situation: One embedding model for code, docs, and structured data
Symptoms: retrieval quality is uneven across content types; code queries match prose (and vice versa) despite plausible-looking similarity scores.
Why this breaks: Embedding models are trained on specific content types. Using a text embedding model for code, or a general model for domain-specific content, produces poor similarity matches.
Recommended fix:
Evaluate embeddings per content type:
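One way to make the evaluation concrete, sketched with numpy and a hypothetical `model.embed` client: measure recall@k per content type on a small labeled set and let the numbers pick the model.

```python
import numpy as np

def recall_at_k(model, eval_pairs, corpus, k: int = 5) -> float:
    """Fraction of queries whose gold document lands in the top-k.

    `model.embed` is a hypothetical embedding client; `eval_pairs` is a
    list of (query, index-of-relevant-doc-in-corpus) pairs."""
    doc_vecs = np.array([model.embed(doc) for doc in corpus])
    doc_norms = np.linalg.norm(doc_vecs, axis=1)
    hits = 0
    for query, gold_idx in eval_pairs:
        q = np.array(model.embed(query))
        scores = doc_vecs @ q / (doc_norms * np.linalg.norm(q))  # cosine
        if gold_idx in np.argsort(scores)[::-1][:k]:
            hits += 1
    return hits / len(eval_pairs)

# Run the same labeled set through each candidate, per content type:
# for content_type, (pairs, corpus) in eval_sets.items():
#     for model in candidates:
#         print(content_type, model.name, recall_at_k(model, pairs, corpus))
```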
Severity: MEDIUM
Situation: Taking top-K from vector search without reranking
Symptoms: the best chunk sits several positions down the list; answers cite near-miss passages instead of the most relevant one.
Why this breaks: First-stage retrieval (vector search) optimizes for recall, not precision. The top results by embedding similarity may not be the most relevant for the specific query. Cross-encoder reranking dramatically improves precision for the final results.
Recommended fix:
Add a reranking step:
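A sketch using sentence-transformers' CrossEncoder (the checkpoint name is one common public choice, not the only option); `vector_search` stands in for your first-stage retriever:

```python
from sentence_transformers import CrossEncoder

reranker = CrossEncoder("cross-encoder/ms-marco-MiniLM-L-6-v2")

def retrieve_and_rerank(query: str, vector_search,
                        k_retrieve: int = 50, k_final: int = 5):
    # Stage 1: cast a wide net (optimize recall).
    candidates = vector_search(query, k=k_retrieve)
    # Stage 2: score each (query, chunk) pair jointly (optimize precision).
    scores = reranker.predict([(query, chunk) for chunk in candidates])
    ranked = sorted(zip(candidates, scores), key=lambda p: p[1], reverse=True)
    return [chunk for chunk, _ in ranked[:k_final]]
```

Retrieving roughly 10x more candidates than you keep gives the cross-encoder room to promote passages the first stage underrated.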
Severity: MEDIUM
Situation: Using all retrieved context regardless of relevance
Symptoms: answers drift toward tangential retrieved material; latency and token costs grow with every extra chunk.
Why this breaks: More context isn't always better. Irrelevant context confuses the LLM, increases latency and cost, and can cause the model to ignore the most relevant information. Models have attention limits.
Recommended fix:
Use relevance thresholds:
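A minimal sketch; it expects (chunk, score) pairs with scores normalized to [0, 1], and the 0.75 floor is an illustrative starting point to tune on your own data:

```python
def select_context(scored_chunks, min_score: float = 0.75, max_chunks: int = 5):
    """Admit chunks by relevance floor, not by a fixed top-k count."""
    kept = sorted((p for p in scored_chunks if p[1] >= min_score),
                  key=lambda pair: pair[1], reverse=True)
    # An empty result is a feature: better to let the model say "I don't
    # know" than to pad the prompt with near-miss chunks.
    return [chunk for chunk, _ in kept[:max_chunks]]
```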
Severity: HIGH
Situation: Only evaluating end-to-end RAG quality
Symptoms: answer quality regresses and nobody can say whether retrieval or generation broke; prompt tweaks don't move the needle.
Why this breaks: If answers are wrong, you can't tell if retrieval failed or generation failed. This makes debugging impossible and leads to wrong fixes (tuning prompts when retrieval is the problem).
Recommended fix:
Separate retrieval evaluation:
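A sketch of a retrieval-only harness; `retriever` is a hypothetical callable returning ranked doc ids, and `labeled_queries` maps each query to its set of relevant doc ids:

```python
def evaluate_retrieval(retriever, labeled_queries, k: int = 10) -> dict:
    """Score retrieval in isolation, before any generation runs."""
    recalls, mrrs = [], []
    for query, relevant_ids in labeled_queries.items():
        results = retriever(query, k=k)
        hit_ranks = [i for i, doc_id in enumerate(results)
                     if doc_id in relevant_ids]
        recalls.append(len(hit_ranks) / len(relevant_ids))
        # Reciprocal rank of the first relevant hit, 0 if none retrieved.
        mrrs.append(1.0 / (hit_ranks[0] + 1) if hit_ranks else 0.0)
    return {"recall@k": sum(recalls) / len(recalls),
            "mrr": sum(mrrs) / len(mrrs)}
```

If recall is high but answers are still wrong, the failure is in generation; if recall is low, no amount of prompt tuning will fix it.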
Severity: MEDIUM
Situation: Embeddings generated once, never refreshed
Symptoms: search surfaces content that has since been edited or deleted; answers contradict the current documentation.
Why this breaks: Documents change but embeddings don't. Users retrieve outdated content or, worse, content that no longer exists. This erodes trust in the system.
Recommended fix:
Implement embedding refresh:
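A content-hash sketch; `store` (a vector store keyed by doc id that tracks a content hash) and `embed` (an embedding client) are hypothetical interfaces:

```python
import hashlib

def refresh_embeddings(docs, store, embed):
    """Re-embed only changed documents; remove vectors for deleted ones."""
    seen = set()
    for doc in docs:
        digest = hashlib.sha256(doc.text.encode()).hexdigest()
        seen.add(doc.id)
        if store.get_hash(doc.id) != digest:  # new or edited content
            store.upsert(doc.id, embed(doc.text), content_hash=digest)
    for stale_id in set(store.all_ids()) - seen:  # source doc was deleted
        store.delete(stale_id)
```

Hashing makes the refresh idempotent and cheap to schedule: unchanged documents cost one hash comparison, not one embedding call.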
Severity: MEDIUM
Situation: Using pure semantic search for keyword-heavy queries
Symptoms: exact identifiers, error codes, or product names fail to match; paraphrased concept queries work fine.
Why this breaks: Some queries are keyword-oriented (looking for specific terms) while others are semantic (looking for concepts). Pure semantic search fails on exact matches; pure keyword search fails on paraphrases.
Recommended fix:
Implement hybrid search:
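A minimal sketch that merges the two result lists with Reciprocal Rank Fusion (RRF); `bm25_search` and `vector_search` are stand-ins for your keyword and semantic retrievers:

```python
def hybrid_search(query: str, bm25_search, vector_search,
                  k: int = 10, c: int = 60):
    """Fuse keyword and semantic rankings with Reciprocal Rank Fusion.

    RRF works on ranks alone, so BM25 scores and cosine similarities
    never need to be put on a common scale; c=60 is the conventional
    damping constant from the original RRF paper."""
    fused: dict = {}
    for results in (bm25_search(query, k=3 * k), vector_search(query, k=3 * k)):
        for rank, doc_id in enumerate(results):
            fused[doc_id] = fused.get(doc_id, 0.0) + 1.0 / (c + rank + 1)
    return sorted(fused, key=fused.get, reverse=True)[:k]
```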
Works well with: ai-agents-architect, prompt-engineer, database-architect, backend