universal-ma-codebook
Universal Meta-Analysis Codebook v2.2 - AI-Human collaboration for meta-analysis data extraction. 4-layer design: Identifiers, Statistics, AI Provenance, Human Verification. Integrates with C5/C6/C7 agents and Category I systematic review pipeline. Triggers: meta-analysis, codebook, data extraction, Hedges g, effect size
From diverganpx claudepluginhub hosungyou/diverga --plugin divergaThis skill uses the workspace's default tool permissions.
Universal Meta-Analysis Codebook
Version: 2.2 Status: Production Codex Review: APPROVE WITH MINOR CHANGES (2026-01-26) Update: Context-specific extensions (2026-01-26)
Purpose
A universal, AI-Human collaboration codebook for meta-analysis that enables:
- AI extraction from PDFs (RAG/OCR) with confidence tracking
- Human verification of AI-extracted values
- 100% human-verified data through structured workflow
- Integration with Diverga C5/C6/C7 agents and Category I pipeline
- Context-specific extensions for domain-specific moderator variables
Context-Specific Extensions
The Universal Codebook supports project-specific moderator layers that extend the base 4-layer structure. Each meta-analysis context may have unique moderator variables.
Extension Architecture
┌─────────────────────────────────────────────────────────────────────┐
│ UNIVERSAL CODEBOOK WITH CONTEXT EXTENSION │
├─────────────────────────────────────────────────────────────────────┤
│ │
│ LAYER 1: IDENTIFIERS + METADATA (10 fields) ← Universal │
│ LAYER 2: CORE STATISTICAL VALUES (18 fields) ← Universal │
│ LAYER 3: CONTEXT-SPECIFIC MODERATORS ← Project Extension │
│ LAYER 4: AI EXTRACTION PROVENANCE ← Universal │
│ LAYER 5: HUMAN VERIFICATION (8 fields) ← Universal │
│ │
└─────────────────────────────────────────────────────────────────────┘
Available Context Extensions
| Context | Extension File | Moderator Count |
|---|---|---|
| GenAI-HE | GENAI_HE_CODEBOOK.md | 15 moderators |
| Clinical Trials | CLINICAL_CODEBOOK.md | TBD |
| Educational Tech | EDTECH_CODEBOOK.md | TBD |
Creating a Context Extension
- Define moderator variables specific to your research domain
- Create classification rules for categorical moderators
- Write AI extraction prompts for each moderator
- Configure C6 agent with the extension schema
# Example: Configure C6 for GenAI-HE context
c6.configure_extension(
context="genai_he",
moderators=[
{"name": "genai_tool", "type": "categorical", "values": ["ChatGPT", "Claude", ...]},
{"name": "blooms_level", "type": "ordinal", "values": ["remember", "understand", ...]},
{"name": "study_design", "type": "categorical", "values": ["RCT", "quasi", ...]},
],
extraction_prompts=GENAI_HE_PROMPTS
)
GenAI-HE Extension (Example)
Layer 3: GenAI-HE Moderator Variables (15 fields)
| Category | Fields |
|---|---|
| GenAI Tool | genai_tool, genai_tool_version, genai_access_type |
| Educational Outcome | blooms_level, outcome_dimension, learning_domain |
| Study Design | study_design, intervention_duration, intervention_type, control_condition |
| Context | education_level, discipline, country, sample_size_total, publication_type |
See: GenAI-HE-Review-AIMC/docs/GENAI_HE_CODEBOOK.md for full specification
Architecture: Four-Layer Design
┌─────────────────────────────────────────────────────────────────────┐
│ UNIVERSAL META-ANALYSIS CODEBOOK v2.1 │
├─────────────────────────────────────────────────────────────────────┤
│ │
│ LAYER 1: IDENTIFIERS + METADATA (10 fields) │
│ study_id, es_id, citation, doi, year, design_type, │
│ timepoint, arm_label_treat, arm_label_control, unit_of_analysis │
│ │
│ LAYER 2: CORE STATISTICAL VALUES (18 fields) │
│ Primary: outcome_name → se_g (12) │
│ Change-score: pre_mean_treat, pre_sd_treat, pre_post_corr (3) │
│ Cluster: cluster_size, icc, n_clusters (3) │
│ │
│ LAYER 3: AI EXTRACTION PROVENANCE │
│ Per-value: ai_value, source, method, confidence, derived_from │
│ Stored as: ai_extraction_json │
│ │
│ LAYER 4: HUMAN VERIFICATION (8 fields) │
│ verified_status, verified_by, verified_date, corrections_json, │
│ disagreement_resolved, final_values_json, verification_notes, │
│ sign_off │
│ │
└─────────────────────────────────────────────────────────────────────┘
Workflow: AI-Human Collaboration
Phase 1: AI Extraction (Automated)
Triggered by: I3 RAG building completion or manual PDF upload
Agent: C6-DataIntegrityGuard
# C6 extracts statistical values from PDFs
extraction_result = c6.extract_with_provenance(
pdf_folder="./pdfs",
methods=["rag", "ocr"],
reconciliation="hierarchy",
log_all_candidates=True
)
Actions:
- I3 builds RAG from PDFs
- C6 queries for statistical values (M, SD, n)
- Multiple extraction methods run in parallel
- Conflict resolution applied (hierarchy + tolerance)
- Provenance recorded for all extractions
- Hedges' g calculated where inputs complete
Output: All rows → verified_status = PENDING
Phase 2: Triage (Automated)
Agent: C7-ErrorPreventionEngine
# C7 categorizes by effective confidence
triage_result = c7.triage_extractions(
data=extraction_result,
thresholds=CONFIGURABLE_THRESHOLDS
)
Categories:
| Confidence | Status | Action |
|---|---|---|
| HIGH (≥90%) | PROVISIONAL | Awaits sign-off |
| MEDIUM (70-89%) | PENDING | Recommended review |
| LOW (<70%) | PENDING | Required review (priority) |
| CONFLICT | PENDING | Required review (top priority) |
Phase 3: Human Review (Mandatory)
Interface: Excel Review Queue or Web UI
Critical Rule: ALL rows require human verification
Priority Queue:
- Conflicts detected (highest)
- LOW confidence
- MEDIUM confidence
- HIGH confidence (spot check)
Human Actions:
- Verify AI extraction against PDF
- Correct errors, record reason
- Mark as VERIFIED or REJECTED
- Resolve conflicts
Phase 4: Final Sign-Off
Agent: C5-MetaAnalysisMaster
# C5 validates all gates pass
validation = c5.validate_final(
data=verified_data,
require_all_verified=True,
require_all_signed_off=True
)
Requirements:
- All rows:
verified_status = VERIFIED - All rows:
sign_off = True - All gates pass (C5 validation)
Result: 100% Human-Verified Dataset
Field Specifications
Layer 1: Identifiers + Metadata
| Field | Type | Description | Example |
|---|---|---|---|
study_id | str | Unique study identifier | "CHEN_2024" |
es_id | str | Effect size ID | "CHEN_2024_01" |
citation | str | Full APA citation | "Chen et al. (2024)..." |
doi | str | DOI | "10.1000/xyz" |
year | int | Publication year | 2024 |
design_type | str | RCT|QUASI|PRE_POST | "RCT" |
timepoint | str | Measurement timing | "post" |
arm_label_treat | str | Treatment label | "ChatGPT group" |
arm_label_control | str | Control label | "Traditional" |
unit_of_analysis | str | individual|cluster | "individual" |
Layer 2: Core Statistical Values
Primary Statistics
| Field | Type | Required |
|---|---|---|
outcome_name | str | Yes |
outcome_unit | str | No |
es_type | str | Yes |
analysis_type | str | No |
n_treatment | int | Yes |
n_control | int | Yes |
m_treatment | float | Conditional |
sd_treatment | float | Conditional |
m_control | float | Conditional |
sd_control | float | Conditional |
hedges_g | float | Derived |
se_g | float | Derived |
Change-Score Fields (when es_type = CHANGE)
| Field | Type | Description |
|---|---|---|
pre_mean_treat | float | Pre-test mean |
pre_sd_treat | float | Pre-test SD |
pre_post_corr | float | Pre-post correlation (default 0.5) |
Cluster Fields (when unit_of_analysis = cluster)
| Field | Type | Description |
|---|---|---|
cluster_size | float | Average cluster size |
icc | float | Intra-class correlation |
n_clusters | int | Number of clusters |
Layer 3: AI Extraction Provenance
Stored in ai_extraction_json:
{
"n_treatment": {
"ai_value": 43,
"source": "Table 2, p.8",
"method": "OCR",
"confidence": 85,
"derived_from": null
},
"sd_treatment": {
"ai_value": 12.5,
"source": "Text p.11, 95% CI",
"method": "CALCULATED",
"confidence": 92,
"derived_from": "CI_95: SE = (14.8-10.2)/3.92"
}
}
Layer 4: Human Verification
| Field | Type | Values |
|---|---|---|
verified_status | str | PENDING|PROVISIONAL|VERIFIED|REJECTED |
verified_by | str | Reviewer initials |
verified_date | date | Review date |
corrections_json | json | {field: {ai_value, final_value, reason}} |
disagreement_resolved | bool | Conflict resolved? |
final_values_json | json | Human-confirmed values |
verification_notes | str | Free text notes |
sign_off | bool | Final approval |
Confidence Thresholds (Configurable)
Per-Field Thresholds
| Field | HIGH | MEDIUM | LOW |
|---|---|---|---|
| n (sample size) | ≥95% | 80-94% | <80% |
| M (mean) | ≥90% | 70-89% | <70% |
| SD | ≥85% | 65-84% | <65% |
| hedges_g (derived) | ≥92% | 75-91% | <75% |
| se_g (derived) | ≥92% | 75-91% | <75% |
| pre_post_corr | ≥85% | 65-84% | <65% |
| icc | ≥80% | 60-79% | <60% |
Per-Source Modifiers
| Source | Modifier |
|---|---|
| Structured table | +10% |
| Semi-structured figure | +5% |
| Unstructured text | 0% |
| Abstract only | -15% |
| OCR with artifacts | -20% |
Formula: effective_confidence = base_confidence + source_modifier
Conflict Resolution
Extraction Hierarchy
| Priority | Source | Weight |
|---|---|---|
| 1 | Table cell | 1.0 |
| 2 | Figure data | 0.9 |
| 3 | In-text stats | 0.8 |
| 4 | Abstract | 0.5 |
Tolerance Thresholds
| Value Type | Relative | Absolute |
|---|---|---|
| n (sample size) | 5% | 2 |
| M (mean) | 10% | 0.5 |
| SD | 15% | 0.5 |
Rule: If disagreement exceeds EITHER threshold → Human review required
Derived Value Verification
For calculated values (hedges_g, se_g), human verification means:
- All source values (M, SD, n) verified
- Formula appropriate for es_type
- If change-score: pre_post_corr verified or documented default
- If cluster: ICC and cluster_size verified
- Final values recalculated from verified inputs
Integration Points
Systematic Review Pipeline Integration
# After Stage 5 (RAG building)
# RAG query integration
rag = RAGQuery(project_path)
values = c6.extract_from_rag(
rag=rag,
fields=["n_treatment", "n_control", "m_treatment", "sd_treatment",
"m_control", "sd_control"],
fallback_to_ocr=True
)
C5/C6/C7 Agent Roles
| Agent | Role in Codebook |
|---|---|
| C5-MetaAnalysisMaster | Final validation, gate enforcement |
| C6-DataIntegrityGuard | Extraction, Hedges' g calculation |
| C7-ErrorPreventionEngine | Triage, conflict detection, warnings |
Excel Template Structure
Sheet 1: Codebook
One-page reference with field definitions
Sheet 2: Data (Main)
41 columns (39 visible + 2 JSON)
Sheet 3: Review Queue
Priority-ordered list of rows needing human review
Sheet 4: Extraction Log
Audit trail of all AI extractions
Success Metrics
| Metric | Target |
|---|---|
| AI extraction rate | ≥85% |
| AI confidence accuracy | ≥90% |
| Conflict detection rate | ≥95% |
| Human review completeness | 100% |
| Final sign-off rate | 100% |
| Data completeness (Hedges' g) | ≥95% |
Commands
# Initialize codebook for a project
diverga codebook init --project genai-he
# Import AI extractions
diverga codebook import --source rag --project genai-he
# Generate review queue
diverga codebook queue --project genai-he
# Validate final dataset
diverga codebook validate --project genai-he
References
- Plan document:
docs/plans/META_ANALYSIS_CODEBOOK_PLAN_V2.md - C5 agent:
.claude/skills/C5-meta-analysis-master/SKILL.md - C6 agent:
.claude/skills/C6-data-integrity-guard/SKILL.md - C7 agent:
.claude/skills/C7-error-prevention-engine/SKILL.md - Borenstein et al. (2021). Introduction to Meta-Analysis
- Cochrane Handbook Chapter 6: Extracting Data
- PRISMA 2020 Statement
Created: 2026-01-26 Codex Review: APPROVE WITH MINOR CHANGES Author: Claude Code