From magi-researchers
Install: `npx claudepluginhub axect/magi-researchers --plugin magi-researchers`

This skill uses the workspace's default tool permissions.
Generates a structured markdown research report from all previous phase outputs. Actively integrates existing plots, generates missing visualizations, and cross-verifies claim-evidence integrity. Requires at least some prior phase results to exist.
/research-report [path/to/output/dir]
$ARGUMENTS — Optional path to the research output directory. If not provided, uses the most recent outputs/*/ directory.

Shared rules: Read ${CLAUDE_PLUGIN_ROOT}/shared/rules.md before starting. §MCP, §Claude-Only, §Visualization, §LaTeX apply to this skill. Inline fallback (if shared rules unavailable): Gemini models: gemini-3.1-pro-preview → gemini-2.5-pro → Claude. Codex: gpt-5.4. All math in LaTeX only (no Unicode). scienceplots ['science', 'nature'], 300 dpi PNG+PDF, Nature widths (3.5/7.2 in). Subagents use the Read tool.
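The fallback chain above can be sketched as follows. This is a minimal illustration, not part of the skill: `ask_with_fallback` and the `ask` callable are hypothetical stand-ins for whatever MCP call is actually made.

```python
from typing import Callable

# Fallback order from the shared rules; "claude" stands in for handling
# the request natively when both Gemini models are unavailable.
GEMINI_CHAIN = ["gemini-3.1-pro-preview", "gemini-2.5-pro", "claude"]

def ask_with_fallback(ask: Callable[[str], str], models=GEMINI_CHAIN) -> str:
    """Try each model in order; return the first successful response."""
    last_error = None
    for model in models:
        try:
            return ask(model)
        except RuntimeError as err:  # stand-in for an MCP/model failure
            last_error = err
    raise RuntimeError(f"all models failed: {last_error}")
```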
See §Claude-Only in shared rules.
See §MCP, §Visualization in shared rules. Additionally:
Inputs (read whichever exist):

- brainstorm/synthesis.md (and other brainstorm files)
- brainstorm/weights.json (scoring weights)
- brainstorm/personas.md (assigned expert personas)
- brainstorm/debate_round2_gemini.md (adversarial debate, if --depth high was used)
- brainstorm/debate_round2_codex.md (adversarial debate, if --depth high was used)
- plan/research_plan.md
- plan/murder_board.md (plan stress-testing results)
- plan/mitigations.md (murder board mitigations)
- plan/phase_gate.md (plan phase gate report)
- src/ contents
- src/phase_gate.md (implementation phase gate report)
- tests/ and test results
- tests/phase_gate.md (test phase gate report)
- plots/ visualizations
- plots/plot_manifest.json

If plot_manifest.json is missing but plots/ contains files, create the manifest by inventorying all .png/.pdf files in plots/ and generating metadata for each:
- description, section_hint, caption, markdown_snippet (existing fields)
- style: array of style sheets used (e.g., ["science", "nature"])
- dpi: output resolution (e.g., 300)
- source_script: path to the Python script that generated this plot
- source_function: function name within the script (if applicable)
- generation_date: ISO-8601 timestamp of plot generation

Use the report structure from ${CLAUDE_PLUGIN_ROOT}/templates/report_template.md. See ${CLAUDE_PLUGIN_ROOT}/templates/domains/ for tone/style guidance.

Before assembling content, validate that all existing plots comply with the required style:
Scan existing plots: For each plot in plots/ (or referenced in plot_manifest.json):
Locate the generating script (from source_script in the manifest, or search src/ and plots/ for Python files that produce each plot filename) and check that:

- it imports scienceplots and calls plt.style.use(['science', 'nature'])
- it has no plt.rcParams overrides that conflict with scienceplots (e.g., font family, linewidth, figure.facecolor)
- its figsize uses Nature column widths (single: 3.5 in, double: 7.2 in)

Flag non-compliant plots: If any plot fails validation: a. Write a regeneration script using the required style:
```python
import matplotlib.pyplot as plt
import scienceplots

plt.style.use(['science', 'nature'])
# ... (reuse data loading from original script)
```
b. Ensure all text in the script is ASCII or LaTeX-escaped (no Unicode π, ², etc.)
c. Execute with uv run python {script_path}
d. Verify the regenerated plots exist and are non-empty
e. Update plots/plot_manifest.json with style metadata
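The script-level checks in the validation step above can be sketched as a simple source scan. This is an illustrative sketch: the function name and the subset of conflicting rcParams keys are assumptions, not part of the skill.

```python
import re

# Illustrative subset of rcParams keys that conflict with scienceplots
CONFLICTING_RCPARAMS = ("font.family", "lines.linewidth", "figure.facecolor")

def check_plot_script(source: str) -> list[str]:
    """Return a list of style-compliance issues found in a plot script's source."""
    issues = []
    if "import scienceplots" not in source:
        issues.append("missing `import scienceplots`")
    if "plt.style.use" not in source or "'science'" not in source:
        issues.append("missing plt.style.use(['science', 'nature'])")
    for key in CONFLICTING_RCPARAMS:
        # Match e.g. plt.rcParams['font.family'] = ...
        if re.search(rf"rcParams\[\s*['\"]{re.escape(key)}['\"]", source):
            issues.append(f"conflicting rcParams override: {key}")
    return issues
```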
If no plots exist yet: Skip to Step 1 (plots will be generated in Step 3 if needed).
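The manifest inventory described above (creating plot_manifest.json from files already in plots/) can be sketched as below. The placeholder values and the pdf_available field are illustrative assumptions; real entries are filled in during review.

```python
import json
from datetime import datetime, timezone
from pathlib import Path

def build_plot_manifest(plots_dir: str) -> dict:
    """Draft one manifest entry per .png in plots/, with placeholder metadata."""
    entries = {}
    for path in sorted(Path(plots_dir).glob("*.png")):
        stem = path.stem
        entries[path.name] = {
            "description": f"TODO: describe {stem}",
            "section_hint": "results",  # default; adjust per plot
            "caption": f"TODO: caption for {stem}",
            "markdown_snippet": f"![{stem}]({path.name})",
            "style": ["science", "nature"],
            "dpi": 300,
            "source_script": None,  # fill in once the generating script is located
            "source_function": None,
            "generation_date": datetime.now(timezone.utc).isoformat(),
            "pdf_available": path.with_suffix(".pdf").exists(),
        }
    return entries
```

Usage would be along the lines of `json.dump(build_plot_manifest("plots"), open("plots/plot_manifest.json", "w"), indent=2)`.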
Read all available materials:
- brainstorm/synthesis.md — for Research Background and Brainstorming Summary sections
- brainstorm/weights.json — for Brainstorming Summary (scoring weights used)
- brainstorm/personas.md — for Brainstorming Summary (expert personas assigned)
- brainstorm/debate_round2_*.md — for Brainstorming Summary (debate resolution, if available)
- plan/research_plan.md — for Methodology section
- plan/murder_board.md — for Methodology section (Plan Stress Testing subsection)
- plan/mitigations.md — for Methodology section (mitigation strategies)
- plan/phase_gate.md — for Appendix F (Quality Assurance)
- src/ — for Implementation section
- src/phase_gate.md — for Appendix F
- tests/ — for Testing section
- tests/phase_gate.md — for Appendix F
- plots/plot_manifest.json — for Results & Visualization section

Plot-to-Section Mapping:
Using the section_hint field from the manifest, assign each plot to a report section:
- results → Section 5 (Results & Visualization)
- methodology → Section 3 (Methodology)
- validation → Section 5 or Section 6 (Testing)
- comparison → Section 5 (Results & Visualization)
- testing → Section 6 (Testing)

Using the template structure from ${CLAUDE_PLUGIN_ROOT}/templates/report_template.md, generate the report:
Section 1 — Research Background:
Section 2 — Brainstorming Summary:
- Expert personas assigned (from personas.md)
- Scoring weights used (from weights.json)
- Debate resolution (if debate_round2_*.md exists): Summarize key disagreements and their resolutions

Section 3 — Methodology:
- Plan stress testing (if murder_board.md exists): Summarize the murder board's critical findings and the mitigations applied (from mitigations.md). This demonstrates that the methodology was adversarially tested before implementation.
- Embed methodology-tagged plots here with their captions

Section 4 — Implementation:
Section 5 — Results & Visualization:
For each results/comparison-tagged plot from the manifest:

- Embed it using its markdown_snippet

Anti-pattern: Do NOT list figures in a table at the end of the report. Every figure must be embedded inline immediately before or after the paragraph that discusses it. Orphaned figure tables at the end of the report are a report quality failure.
Section 6 — Testing:
- Embed validation/testing-tagged plots with their captions

Section 7 — Conclusion:
Appendix:
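The section_hint mapping listed above can be sketched as a small lookup. The function name and the default-to-Section-5 choice are illustrative assumptions.

```python
# section_hint → report section, per the Plot-to-Section Mapping
SECTION_MAP = {
    "results": "Section 5 (Results & Visualization)",
    "methodology": "Section 3 (Methodology)",
    "validation": "Section 6 (Testing)",  # or Section 5, per reviewer judgment
    "comparison": "Section 5 (Results & Visualization)",
    "testing": "Section 6 (Testing)",
}

def assign_sections(manifest: dict) -> dict:
    """Group plot filenames by target report section using each entry's section_hint."""
    by_section = {}
    for filename, entry in manifest.items():
        section = SECTION_MAP.get(entry.get("section_hint", ""),
                                  "Section 5 (Results & Visualization)")
        by_section.setdefault(section, []).append(filename)
    return by_section
```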
After completing the initial draft, perform a gap analysis:
Identify missing visualizations: Review the draft for quantitative results or comparisons described in text that lack a supporting figure.
If gaps are found (and the iteration count is below the depth's maximum, per the loop constraints below): a. For each needed plot, write a self-contained Python script using:
```python
import matplotlib.pyplot as plt
import scienceplots

plt.style.use(['science', 'nature'])
```
b. Execute the script with uv run python {script_path}
c. Save plots to plots/ as both PNG (300 dpi) and PDF
d. Update plots/plot_manifest.json with the new plot entries
e. Re-draft the affected report sections to integrate the new plots
f. Increment the iteration counter
If no gaps are found (or iteration limit reached): Proceed to Step 4.
Loop constraints (scaled by depth):
| Depth | Max iterations | Max plots per iteration | Total plot budget |
|---|---|---|---|
| min | 1 | 2 | 2 |
| default | 2 | 3 | 6 |
| high | 3 | 4 | 12 |
| max | 3 | 5 | 15 |
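The depth-scaled budget check can be sketched as a lookup over the table above. The function and dictionary names are illustrative, not part of the skill.

```python
# Loop constraints from the depth table
DEPTH_BUDGETS = {
    "min":     {"max_iterations": 1, "max_plots_per_iteration": 2, "total_plot_budget": 2},
    "default": {"max_iterations": 2, "max_plots_per_iteration": 3, "total_plot_budget": 6},
    "high":    {"max_iterations": 3, "max_plots_per_iteration": 4, "total_plot_budget": 12},
    "max":     {"max_iterations": 3, "max_plots_per_iteration": 5, "total_plot_budget": 15},
}

def can_generate(depth: str, iteration: int, plots_so_far: int, requested: int) -> bool:
    """Check whether another plot-generation pass fits within the depth's budget."""
    budget = DEPTH_BUDGETS.get(depth, DEPTH_BUDGETS["default"])
    return (iteration < budget["max_iterations"]
            and requested <= budget["max_plots_per_iteration"]
            and plots_so_far + requested <= budget["total_plot_budget"])
```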
If no depth was specified, assume default. All plot data must come from src/ or test outputs — do NOT fabricate data.

Before the MAGI traceability review, run the automated validator to catch structural issues early:
```
uv run python ${CLAUDE_PLUGIN_ROOT}/utils/validate_draft.py {output_dir}/report.md --json
```

Inspect the returned status:

- "pass" → Proceed to Step 4.
- "fail" → Fix all errors before proceeding:
  - Add <!-- EVIDENCE BLOCK: ev-X --> annotations for unsupported claims
  - Put display math $$...$$ on separate lines
  - Re-run the validator until there is no "fail" status.

Execute these two review calls simultaneously (in the same message):
Gemini (BALTHASAR) — Scientific Rigor Review:
```
mcp__gemini-cli__ask-gemini(
  prompt: "You are a scientific reviewer. Analyze this research report for claim-evidence integrity. Identify:\n\n1. **Orphaned claims**: Text assertions that lack a supporting figure, table, or data reference\n2. **Orphaned plots**: Figures that are embedded but never discussed or interpreted in the text\n3. **Weak links**: Claims that reference a figure but the figure doesn't clearly support the claim\n4. **Caption quality**: Are figure captions precise, quantitative, and publication-ready?\n\nFor each issue found, specify the section, the problematic text or figure, and a concrete fix.\n\nReport draft:\n@{output_dir}/report.md\n\nPlot manifest:\n@{output_dir}/plots/plot_manifest.json",
  model: "gemini-3.1-pro-preview" // fallback: "gemini-2.5-pro" → Claude
)
```
Codex (CASPER) — Visualization Quality Review:
```
mcp__codex-cli__ask-codex(
  prompt: "You are a data visualization reviewer. Analyze this research report for visualization quality and completeness. Identify:\n\n1. **Missing visualizations**: Quantitative results or comparisons described in text that would benefit from a chart/plot but have none\n2. **Plot-narrative mismatch**: Figures whose captions or surrounding text don't accurately describe what the plot shows\n3. **Visualization improvements**: Existing plots that could use better chart types, scales, or encodings for clarity\n4. **Reproducibility gaps**: Plots that lack source context or data references needed to regenerate them\n5. **Style compliance**: Are all figures generated with the required scienceplots style? Check for: serif fonts, thin lines, Nature-compatible sizing (single: 3.5in, double: 7.2in), 300 dpi, PDF availability. Flag any plot that appears to use matplotlib defaults or custom rcParams overrides.\n\nFor each issue found, specify the section, the problematic text or figure, and a concrete fix.\n\nReport draft:\n@{output_dir}/report.md\n\nPlot manifest:\n@{output_dir}/plots/plot_manifest.json",
  model: "gpt-5.4"
)
```
Note: If Codex MCP is unavailable, fall back to mcp__gemini-cli__ask-gemini with the Gemini fallback chain and visualization-focused framing.
If --claude-only: Per §SubagentExec, spawn simultaneously:

- A (CD, BALTHASAR — Scientific Rigor): Read report.md + plots/plot_manifest.json. Review: 1. Orphaned claims (text without supporting figure/table/data), 2. Orphaned plots (embedded but never discussed), 3. Weak links (claim→figure mismatch), 4. Caption quality (precise, quantitative, publication-ready?). Per issue: section, problematic text/figure, concrete fix. Return structured text.
- B (AC, CASPER — Visualization Quality): Read report.md + plots/plot_manifest.json. Review: 1. Missing visualizations, 2. Plot-narrative mismatch, 3. Visualization improvements (chart type, scales, encodings), 4. Reproducibility gaps, 5. Style compliance (serif fonts, Nature sizing 3.5 in/7.2 in, 300 dpi, PDF available, no matplotlib defaults/custom rcParams). Per issue: section, text/figure, concrete fix. Return structured text.
Claude (MELCHIOR) — Synthesis & Revision:
After both reviews are received, synthesize the feedback:
- If report_versions.json exists, read current_version and increment it.
- Archive the previous report.md → report_v{N-1}.md.
- Write the revised report to report.md.
- Create report_versions.json if it does not exist:
```json
{
  "schema_version": "1.1.0",
  "current_version": 1,
  "versions": [{
    "version": 1,
    "file": "report.md",
    "created_at": "ISO-8601",
    "feedback_tier": null,
    "feedback_summary": "Initial report",
    "changes": []
  }]
}
```
The changes array tracks structured diffs for each version. Each entry has:
- type: one of "plot_restyle", "plot_new", "text_fix", "section_rewrite", "style_fix", "gap_fill"
- files (for plot changes): array of affected plot filenames
- section (for text changes): section number or name
- reason: brief explanation of why the change was made

Maximum 3 feedback iterations per entry into this step.
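The archive-and-bump bookkeeping described above can be sketched as follows. The function name and signature are illustrative assumptions; only the file layout and manifest schema come from this document.

```python
import json
import shutil
from pathlib import Path

def archive_and_bump(output_dir, feedback_tier, summary, changes):
    """Archive report.md as report_v{N}.md and append a version entry
    to report_versions.json; return the new version number."""
    out = Path(output_dir)
    manifest_path = out / "report_versions.json"
    if manifest_path.exists():
        manifest = json.loads(manifest_path.read_text())
    else:
        manifest = {"schema_version": "1.1.0", "current_version": 0, "versions": []}
    prev = manifest["current_version"]
    if prev and (out / "report.md").exists():
        # Archive the outgoing version before overwriting report.md
        shutil.copy(out / "report.md", out / f"report_v{prev}.md")
    manifest["current_version"] = prev + 1
    manifest["versions"].append({
        "version": prev + 1,
        "file": "report.md",
        "created_at": "ISO-8601",  # placeholder; use datetime.now(timezone.utc).isoformat()
        "feedback_tier": feedback_tier,
        "feedback_summary": summary,
        "changes": changes,
    })
    manifest_path.write_text(json.dumps(manifest, indent=2))
    return manifest["current_version"]
```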
Step 6a: Solicit Feedback
Present the report and ask:
"Report v{N} is ready at report.md. Please review and provide feedback, or say approve to finalize."
Step 6b: Classify Feedback
When user provides feedback (not "approve"):
Classify into one of three tiers using these keyword signals:
Tier 1 (Cosmetic) — wording, tone, structure, formatting, caption rewording
Tier 2 (Visualization) — new/modified plots, chart type changes, scale changes, plot-narrative linkage
Tier 3 (Substantive) — code changes, re-execution, different methodology, new experiments
If the feedback does not clearly match any tier's signals, ask the user to confirm before proceeding.
Present classification: "I classify this as Tier {N} ({name}). Planned action: {description}. Proceed?"
If user disagrees, re-classify.
Mixed-tier feedback: decompose and apply Tier 1/2 first, then escalate Tier 3.
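The tier classification above can be sketched as a keyword scan. The exact signal lists below are illustrative assumptions; a None result corresponds to asking the user to confirm, and returning the highest matching tier reflects the escalate-Tier-3 rule for mixed feedback.

```python
from typing import Optional

# Illustrative keyword signals per tier (assumed, not specified by the skill)
TIER_SIGNALS = {
    1: ("wording", "tone", "structure", "formatting", "caption"),
    2: ("plot", "chart", "scale", "axis", "figure", "visualization"),
    3: ("code", "re-run", "rerun", "methodology", "experiment"),
}

def classify_feedback(feedback: str) -> Optional[int]:
    """Return the highest matching tier (3 beats 2 beats 1), or None when
    no signal matches and the user should be asked to confirm."""
    text = feedback.lower()
    for tier in (3, 2, 1):
        if any(signal in text for signal in TIER_SIGNALS[tier]):
            return tier
    return None
```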
Step 6c: Apply — Tier 1 (Cosmetic)
a. Archive report.md → report_v{N}.md
b. Apply the requested edits to report.md
c. Update report_versions.json (increment version, tier=1, summary)

Step 6d: Apply — Tier 2 (Visualization)
a. Archive report.md → report_v{N}.md
b. Write or modify the plot scripts, then execute with uv run python {script_path}
c. Save to plots/ (PNG 300dpi + PDF)
d. Update plots/plot_manifest.json
e. Re-draft the affected sections of report.md
f. Update report_versions.json (increment version, tier=2, summary)

Step 6e: Apply — Tier 3 (Substantive / Escalation)
This skill cannot handle substantive changes alone.
- If invoked standalone (/research-report): Inform the user which phase needs re-running. Suggest /research --resume {output_dir}. Exit the loop.
- If invoked by the orchestrator (/research): Return control to the orchestrator for outer-loop handling (see orchestrator Step R2).

Step 6f: Finalize
When user approves or iteration limit reached:
See §LaTeX in shared rules.
```
report.md              # Final report (always latest version)
report_v{N}.md         # Archived report versions (created on feedback)
report_versions.json   # Version manifest with feedback history
plots/
├── *.png              # Plot images (300 dpi)
├── *.pdf              # Plot vector versions
└── plot_manifest.json # Plot registry
```