From atum-ai-ml
Corrective RAG (CRAG) pattern library: an implementation of the Corrective Retrieval Augmented Generation paradigm by Yan et al. 2024 ("Corrective Retrieval Augmented Generation", ICLR 2024), which improves classical RAG by adding a retrieval evaluator that grades the relevance of retrieved documents and triggers fallback mechanisms when retrieval is judged insufficient. Covers:
- The core CRAG flow: retrieve documents from a vector store; grade each document via a lightweight T5 evaluator or LLM-as-judge into Correct/Incorrect/Ambiguous; when Correct, use as-is; when Ambiguous, combine knowledge refinement with web search; when Incorrect, discard and rely on web search.
- The knowledge refinement step: decompose retrieved documents into strips, filter strips by relevance, re-compose into clean context.
- The web search fallback: typically Google Search API, Brave Search, Tavily, Firecrawl, or Exa, to fetch fresh sources when the internal knowledge base fails.
- Benchmark gains reported in the paper: PopQA +20%, Biography +25%, PubHealth +10% over standard RAG.
- Comparison with alternative RAG variants: HyDE (hypothetical document embeddings), Self-RAG (self-reflection tokens), Adaptive RAG (decides when to retrieve).
- Implementation strategies: lightweight evaluator vs LLM-as-judge trade-off, web fallback cost management, hybrid local+web context fusion.
- Production considerations: latency added by the evaluator step, web API costs, hallucination risk reduction, compliance for web fetching.
- Use cases where CRAG dominates: open-domain QA with a risk of stale knowledge bases, fact-checking applications, customer support with both a static KB and dynamic web sources.
- Limitations: evaluator overhead, dependency on web search quality, added complexity vs simple RAG.

Use when standard RAG hallucinates due to poor retrieval, when knowledge base coverage is incomplete and web augmentation is acceptable, or when you need a robust fallback mechanism.
Differentiates from generic RAG by deep focus on retrieval quality grading and fallback orchestration.
npx claudepluginhub arnwaldn/atum-plugins-collection --plugin atum-ai-ml

This skill uses the workspace's default tool permissions.
Pattern published by **Yan et al. 2024** (University of Science and Technology of China, ICLR 2024). "Corrective Retrieval Augmented Generation" solves the main problem of classical RAG: **what to do when the retrieved documents are bad?**
Question → [Retrieve top-k docs] → [LLM Generate] → Answer
If the retrieved documents are irrelevant or incorrect, the LLM will hallucinate or confidently generate wrong answers.

CRAG solves this by evaluating the quality of the retrieval and adapting the strategy accordingly.
```
[QUESTION]
     │
     ▼
┌──────────────────┐
│     RETRIEVE     │ ← Vector store / BM25 / hybrid
└────────┬─────────┘
         │ top-k documents
         ▼
┌──────────────────┐
│    GRADE DOCS    │ ← Lightweight evaluator (T5) or LLM
│  (Correct/Inc.   │
│   /Ambiguous)    │
└────────┬─────────┘
         │
    ┌────┴────┬──────────┐
    │         │          │
    ▼         ▼          ▼
 CORRECT  AMBIGUOUS  INCORRECT
    │         │          │
    │         ▼          ▼
    │  ┌──────────┐ ┌──────────┐
    │  │ KNOWLEDGE│ │   WEB    │
    │  │ REFINE + │ │  SEARCH  │
    │  │WEB SEARCH│ │ FALLBACK │
    │  └────┬─────┘ └────┬─────┘
    │       │            │
    └───────┴────────────┘
            │
            ▼
       [GENERATE]
            │
            ▼
        [ANSWER]
```
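The flow above can be sketched as plain Python control flow. All helpers (`retrieve`, `grade_doc`, `refine`, `web_search`, `generate`) are hypothetical and injected as parameters; this is a sketch of the routing logic only, not a definitive implementation.

```python
def crag_answer(question, retrieve, grade_doc, refine, web_search, generate):
    """Route between local KB, refinement, and web fallback per the CRAG flow."""
    docs = retrieve(question)                       # top-k from the vector store
    grades = [grade_doc(question, d) for d in docs]

    if docs and all(g == "incorrect" for g in grades):
        # Internal KB failed entirely: rely on web search only
        context = web_search(question)
    elif any(g in ("ambiguous", "incorrect") for g in grades):
        # Mixed signal: refine what is usable and augment with the web
        kept = [d for d, g in zip(docs, grades) if g != "incorrect"]
        context = refine("\n\n".join(kept), question) + "\n\n" + web_search(question)
    else:
        # Every document judged correct: use the local context directly
        context = "\n\n".join(docs)

    return generate(question, context)
```

Keeping the helpers injectable makes each branch testable with stubs before wiring in a real vector store, evaluator, and search API.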
Grade each retrieved document as Correct / Ambiguous / Incorrect.
Option A: lightweight T5 evaluator (from the paper).

Option B: LLM-as-judge, with a prompt like:
```
Question: {question}
Document: {document}
Rate this document on a scale of 0 to 10 for its relevance
to answering the question. Return JSON:
{"score": 0-10, "verdict": "correct"|"ambiguous"|"incorrect", "reason": "..."}
```
Typical thresholds: for example, score ≥ 7 → correct, 3-6 → ambiguous, < 3 → incorrect (tune on your own corpus).
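A minimal sketch of the LLM-as-judge option. `call_llm` is an assumed callable that sends a prompt to whatever model you use and returns its text completion; the threshold fallback values are illustrative, not prescribed by the paper.

```python
import json

JUDGE_PROMPT = """Question: {question}
Document: {document}
Rate this document on a scale of 0 to 10 for its relevance
to answering the question. Return JSON:
{{"score": 0-10, "verdict": "correct"|"ambiguous"|"incorrect", "reason": "..."}}"""

def grade_document(question, document, call_llm):
    """Return 'correct', 'ambiguous', or 'incorrect' for one document."""
    raw = call_llm(JUDGE_PROMPT.format(question=question, document=document))
    result = json.loads(raw)
    # If the model omitted the verdict, derive it from the numeric score
    if "verdict" not in result:
        s = result["score"]
        result["verdict"] = "correct" if s >= 7 else "ambiguous" if s >= 3 else "incorrect"
    return result["verdict"]
```

In production you would also guard against malformed JSON (retry or default to "ambiguous"), since the judge output is untrusted model text.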
The raw document is decomposed into strips (paragraphs or sentences), each strip is filtered individually, then the surviving strips are recomposed.
```python
def refine_knowledge(document, question):
    strips = split_into_strips(document)  # split by paragraph or sentence
    filtered = []
    for strip in strips:
        score = evaluator(question, strip)  # same grader as the doc-level step
        if score >= threshold:
            filtered.append(strip)
    return "\n\n".join(filtered)
```
This eliminates noise in long documents where only part of the content is relevant.
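A toy, self-contained demonstration of strip-level filtering: a simple word-overlap score stands in for the real T5/LLM evaluator, and the names (`overlap_score`, `refine`, the 0.3 threshold) are illustrative choices, not part of CRAG itself.

```python
def overlap_score(question, strip):
    """Fraction of question words that appear in the strip (toy evaluator)."""
    q = set(question.lower().split())
    s = set(strip.lower().split())
    return len(q & s) / max(len(q), 1)

def refine(document, question, threshold=0.3):
    """Keep only paragraphs whose overlap with the question clears the threshold."""
    strips = [p for p in document.split("\n\n") if p.strip()]
    return "\n\n".join(s for s in strips if overlap_score(question, s) >= threshold)

doc = "Vietnam GDP reached 476 billion in 2024.\n\nPho is a popular noodle soup."
print(refine(doc, "What is Vietnam GDP in 2024"))
# → Vietnam GDP reached 476 billion in 2024.
```

The off-topic paragraph is dropped; in a real deployment the lexical score would be replaced by the trained evaluator, but the split-filter-rejoin shape stays the same.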
If the local KB fails, the context is enriched via web search.

Common APIs:
| API | When to use | Cost |
|---|---|---|
| Tavily | LLM-friendly, filtered by default | $0.04 / search |
| Brave Search | Privacy-friendly, fallback | $0.005 / query |
| Google Custom Search | Google stack | $5 / 1k queries |
| Bing Search | Microsoft stack | $7 / 1k queries |
| Firecrawl | Scraping specific pages | $0.001 / page |
| Exa.ai | LLM-native semantic search | $0.005 / search |
| SerpAPI | Versatile, multi-engine | $50 / month |
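For cost management, the providers above can be chained so a failure (quota, outage) falls through to the next option. A sketch under the assumption that each provider is wrapped in a uniform `search_fn(question, max_results)` callable; the wrapper names are hypothetical, not vendor SDK calls.

```python
def search_with_fallback(question, providers, max_results=5):
    """Try each (name, search_fn) pair in order; return the first usable result."""
    errors = {}
    for name, search_fn in providers:
        try:
            results = search_fn(question, max_results)
            if results:
                return name, results
        except Exception as exc:
            errors[name] = exc  # record and fall through to the next provider
    raise RuntimeError(f"All search providers failed: {errors}")
```

Ordering the list by cost (cheapest first) or by quality (best first, cheap fallback) is the main tuning knob.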
```python
def web_search_fallback(question):
    results = tavily.search(question, max_results=5, include_raw_content=True)
    docs = [r["content"] for r in results["results"]]
    return refine_knowledge("\n\n".join(docs), question)
```
A final LLM generates the answer from the refined context (local + web).
```
Question: "What is Vietnam's GDP in 2025?"

Step 1: Retrieve from internal KB
  Doc1: "In 2020, Vietnam had a GDP of 271 billion USD..."
  Doc2: "The Vietnamese economy is mainly..."

Step 2: Grade docs
  Doc1: ambiguous (data from 2020, not 2025)
  Doc2: incorrect (no figures)

Step 3: Global verdict = "ambiguous + incorrect"
  → Trigger web search fallback

Step 4: Web search "Vietnam GDP 2025"
  Result: "Vietnam's GDP reached $476 billion in 2024, with the World Bank
  projecting $510 billion for 2025..."

Step 5: Knowledge refinement
  Filtered: "Vietnam's GDP reached $510 billion projected for 2025"

Step 6: Generate final answer with refined context
  "According to World Bank projections, Vietnam's GDP in 2025 is
  estimated at around $510 billion."
```
| Benchmark | Standard RAG | Self-RAG | CRAG |
|---|---|---|---|
| PopQA (open-domain QA) | 38.7% | 54.9% | 59.8% (+21pts vs RAG) |
| Biography (long-form) | 64.0% | 81.2% | 86.0% (+22pts) |
| PubHealth (fact-checking) | 65.0% | 75.6% | 80.6% (+15pts) |
Typical benefit: ~95% retrieval relevance (vs ~70% for classic RAG).
| Pattern | Difference |
|---|---|
| Classic RAG | No evaluation, no fallback |
| HyDE (Hypothetical Document Embeddings) | Generates a hypothetical document to improve the query, but no fallback |
| Self-RAG (Asai 2023) | The LLM learns special tokens to decide when to retrieve; requires fine-tuning |
| Adaptive RAG (Jeong 2024) | Chooses between no-retrieval / single-step / multi-step based on query complexity |
| CRAG | Plug-and-play, no fine-tuning, automatic web fallback |
```python
from typing import TypedDict

from langgraph.graph import StateGraph, END

class CRAGState(TypedDict):
    question: str
    documents: list
    grade: str
    web_search_done: bool
    final_answer: str

workflow = StateGraph(CRAGState)
workflow.add_node("retrieve", retrieve_docs)
workflow.add_node("grade", grade_docs)
workflow.add_node("refine", refine_knowledge)
workflow.add_node("web_search", web_search_fallback)
workflow.add_node("generate", generate_answer)

workflow.set_entry_point("retrieve")
workflow.add_edge("retrieve", "grade")
workflow.add_conditional_edges("grade", route_by_grade, {
    "correct": "refine",
    "ambiguous": "web_search",
    "incorrect": "web_search",
})
workflow.add_edge("web_search", "refine")
workflow.add_edge("refine", "generate")
workflow.add_edge("generate", END)

app = workflow.compile()
```
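The `route_by_grade` function used in the conditional edges is not shown above; a minimal sketch, assuming the grading node stores a per-document verdict list under a `grades` key in the state (a convention chosen here, not a LangGraph requirement):

```python
def route_by_grade(state):
    """Collapse per-document verdicts into one branch label for the graph."""
    grades = state.get("grades", [])
    if grades and all(g == "correct" for g in grades):
        return "correct"      # all docs usable: refine locally, skip the web
    if any(g in ("correct", "ambiguous") for g in grades):
        return "ambiguous"    # mixed signal: web search, then refine
    return "incorrect"        # nothing usable: web search fallback only
```

The aggregation policy (strict `all` for the happy path, pessimistic default) is a design choice; a looser policy trades fewer web calls for more hallucination risk.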
| Scenario | Recommendation |
|---|---|
| Stable KB, closed domain, hallucination tolerated | Simple RAG is enough |
| Partial KB, web augmentation acceptable | CRAG |
| Volatile KB, frequently changing information | CRAG |
| Compliance forbids web search | Simple RAG, knowledge refinement only |
| Tight budget (latency + cost) | Simple RAG or Adaptive RAG |
| Fact-checking, medical, legal | CRAG (robustness needed) |
- rag-architect (this plugin)
- pinecone-patterns / weaviate-patterns / qdrant-patterns (this plugin)
- react-pattern (this plugin)
- reflexion-pattern (this plugin)
- graphrag-pattern (this plugin)