Creates tests for research code and generates publication-quality visualizations. Requires
implemented code in src/ of an active research output directory.
This skill does not execute the main research pipeline — that is Phase 3.5's responsibility
(/research-execute). If results/ is already populated, tests and visualizations use those
artifacts. If not, tests use mocks/fixtures and visualizations use inline-computed values.
Tool selection: Claude autonomously chooses testing and visualization tools to match the
languages and ecosystems already present in src/. No specific framework is mandated.
/research-test [path/to/output/dir]
$ARGUMENTS — Optional path to the research output directory. If not provided, uses the most recent outputs/*/ directory.

Shared rules: Read `${CLAUDE_PLUGIN_ROOT}/shared/rules.md` before starting. §MCP, §Claude-Only, §Visualization apply to this skill. Inline fallback (if shared rules unavailable): Gemini models: gemini-3.1-pro-preview → gemini-2.5-pro → Claude. Codex: gpt-5.4. scienceplots `['science', 'nature']`, 300 dpi PNG+PDF, Nature widths (3.5/7.2 in). Subagents use the `Read` tool.
See §Claude-Only in shared rules.
See §MCP in shared rules. Additionally:
Regardless of tool choices, all outputs must satisfy the following contracts. These are the interface between Phase 4 and Phase 5 (Report) — violating them breaks the pipeline.
| Restriction | Requirement | Rationale |
|---|---|---|
| Plot Manifest | All visualizations must be registered in plots/plot_manifest.json using the fixed schema below | Phase 5 (Report) reads this file to assemble the report |
| Dual Format | Every plot must be saved as both PNG (300 dpi) and PDF or SVG | PNG for preview, PDF/SVG for publication |
| Execution Evidence | At least one test or verification must confirm the code runs without error | Validates implementation correctness |
| Dependency Spec | Any new test/viz dependencies must be added to the appropriate manifest (pyproject.toml, Cargo.toml, DESCRIPTION, etc.) | Reproducibility |
Find the active research output directory (from $ARGUMENTS or most recent outputs/*/).
Verify src/ exists and contains implementation code.
Read plan/research_plan.md for research context, test strategy guidance, and the YAML
frontmatter's languages/ecosystem fields.
Read all source files in src/ to understand what needs testing.
Workspace Detection — scan for languages and ecosystems actually present:
1st signal — Package manager files (scan project root and src/):
| File | Ecosystem |
|---|---|
| Cargo.toml | Rust / cargo |
| pyproject.toml, requirements.txt, uv.lock | Python / uv or pip |
| Project.toml | Julia |
| DESCRIPTION | R |
| CMakeLists.txt, Makefile | C/C++ |
| package.json | Node.js |
2nd signal — File extension distribution in src/:
Glob for src/**/*.rs, src/**/*.py, src/**/*.r, src/**/*.jl, src/**/*.cpp, etc.
Note the count and dominant extension.
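A minimal sketch of this tally, assuming Python tooling (the snippet is illustrative, not part of the skill):

```python
# Count source-file extensions under src/ to find the dominant language.
from collections import Counter
from pathlib import Path

ext_counts = Counter(
    p.suffix for p in Path("src").rglob("*") if p.is_file() and p.suffix
)
dominant = ext_counts.most_common(1)  # e.g., [('.py', 17)]
```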
Priority rule: If research_plan.md frontmatter says languages: ["python"] but src/
contains Cargo.toml and .rs files, the actual files win. Announce the discrepancy.
Check execution results:
- Look for `results/pre_execution_status.json`. If present, parse the JSON and check the `state` field.
- If `pre_execution_status.json` does not exist but `pre_execution_status.md` does (legacy workspace), read the `.md` and infer the state from its content (look for SUCCESS/FAILED/PARTIAL/EXISTING).
- SUCCESS or EXISTING → integration tests and visualizations may use `results/` data.
- FAILED, PARTIAL, or file absent → integration tests must be skipped; use mocks/inline data.
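A minimal sketch of this check, assuming Python; the helper name is hypothetical:

```python
import json
from pathlib import Path

def read_execution_state(results_dir: Path = Path("results")) -> str:
    """Hypothetical helper illustrating the status check above."""
    status_json = results_dir / "pre_execution_status.json"
    status_md = results_dir / "pre_execution_status.md"  # legacy workspace
    if status_json.exists():
        return json.loads(status_json.read_text())["state"]
    if status_md.exists():
        text = status_md.read_text().upper()
        for state in ("SUCCESS", "FAILED", "PARTIAL", "EXISTING"):
            if state in text:
                return state
    return "ABSENT"
```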
Prepare a workspace summary:
- Detected languages/ecosystems
- Whether `results/` data is available

Consult Gemini for test suggestions, providing the workspace context:
mcp__gemini-cli__ask-gemini(
prompt: "Given the following research implementation, suggest a comprehensive test strategy.\n\nWorkspace context:\n- Detected languages/ecosystems: {detected}\n- results/ status: {SUCCESS|FAILED|ABSENT}\n\nResearch plan:\n@{output_dir}/plan/research_plan.md\n\nSource files:\n@{output_dir}/src/*\n\nPre-execution status (if available):\n@{output_dir}/results/pre_execution_status.json\n\nSuggest tests in two tiers:\n1. Unit tests (no results/ dependency — use mocks/fixtures; must run even without pre-execution)\n2. Integration/validation tests (may depend on results/ artifacts; mark as skippable if results/ absent)\n\nRecommend appropriate testing tools for the detected languages. Do not prescribe a single framework — choose what fits the codebase.",
model: "gemini-3.1-pro-preview"
)
If `--claude-only`: Per §SubagentExec — A (CD, test strategist): Read research plan, source files, pre_execution_status. Suggest 2-tier test strategy (unit + integration) with appropriate tools for detected languages. Return structured text.
Synthesize the test plan, keeping two tiers explicit:
- Tier 1 (unit): no `results/` dependency; runs with mocks/fixtures regardless of execution status.
- Tier 2 (integration/validation): may depend on `results/` artifacts; runnable only if `results/` is available, and marked skippable if `results/` is absent.

Present the test plan to the user for approval/modifications.
Choose test tooling based on detected workspace (not prescribed):
| Language | Test Framework |
|---|---|
| Python | pytest |
| Rust | cargo test |
| R | testthat |
| Julia | Test.jl |
| C/C++ | ctest / Catch2 |
Write Tier 1 unit tests (mocks/fixtures only; no dependence on `results/`):
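A minimal Tier 1 sketch, assuming pytest and NumPy; `normalize` is a hypothetical stand-in for a function that would be imported from src/:

```python
import numpy as np

def normalize(x):
    # Hypothetical stand-in; in practice, import the function under test from src/.
    return x / np.linalg.norm(x)

def test_normalize_unit_length():
    rng = np.random.default_rng(42)  # fixed seed keeps the test deterministic
    x = rng.normal(size=8)
    assert np.isclose(np.linalg.norm(normalize(x)), 1.0)
```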
Write Tier 2 integration tests, guarded by availability of results/:
import pathlib
import pytest

_status_json = pathlib.Path("results/pre_execution_status.json")
_status_md = pathlib.Path("results/pre_execution_status.md")  # legacy fallback
RESULTS_AVAILABLE = _status_json.exists() or _status_md.exists()

@pytest.mark.skipif(not RESULTS_AVAILABLE, reason="results/ not available — run /research-execute first")
def test_output_schema():
    ...
Run all tests and report results:
- Skipped Tier 2 tests (due to missing `results/`) are expected and acceptable.

Fix failing Tier 1 tests, or flag them for user attention if the fix requires code changes in src/.
Create plots/ directory if it doesn't exist.
Choose visualization tooling based on detected workspace:
- Plot data, when available: `results/*.csv` / `*.json`

| Language | Visualization |
|---|---|
| Python | matplotlib + scienceplots (['science', 'nature'] style) |
| R | ggplot2 |
| Julia | Plots.jl or Makie.jl |
IMPORTANT: For Python visualizations, `scienceplots` with the `['science', 'nature']` style is required, not optional. Do NOT use manual `plt.rcParams` overrides — they conflict with scienceplots defaults.

Technical note (`text.usetex`): The `['science', 'nature']` style enables `text.usetex=True`. All text in plots (axis labels, annotations, titles, legends) must be ASCII or LaTeX-escaped. Unicode characters like π, ², ⁴, ≈ will cause a `RuntimeError`. Use `r'$\pi$'`, `r'$^2$'`, `r'$^4$'`, `r'$\approx$'` instead.

Figure sizing: Use Nature column widths — single column: 3.5 in, double column: 7.2 in.
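A minimal sketch of the required setup, assuming scienceplots and a working LaTeX toolchain are installed (usetex rendering needs LaTeX):

```python
import matplotlib.pyplot as plt
import scienceplots  # noqa: F401 (importing registers the 'science' styles)

plt.style.use(['science', 'nature'])  # enables text.usetex=True

fig, ax = plt.subplots(figsize=(3.5, 2.5))  # single Nature column width
ax.plot([0, 1, 2], [0, 1, 4], label=r'$x^2$')  # LaTeX-escaped label, no Unicode
ax.set_xlabel(r'$x$')
ax.set_ylabel(r'$f(x)$')
ax.legend()
```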
Data source selection:
- `results/` status SUCCESS or EXISTING: load data from `results/` for plots.
- `results/` status FAILED or PARTIAL: use available partial data; note incomplete sections in captions.
- `results/` absent: compute all plot data inline within the visualization script (see the sketch after the save example below).

Generate visualizations appropriate to the research:
# Python example
fig.savefig('plots/{name}.png', dpi=300, bbox_inches='tight')
fig.savefig('plots/{name}.pdf', bbox_inches='tight')
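Tying the data-source rules to this save pattern, a minimal end-to-end sketch; the results file name and the inline fallback are illustrative:

```python
import json
from pathlib import Path

import matplotlib.pyplot as plt
import numpy as np

status_path = Path("results/pre_execution_status.json")
state = json.loads(status_path.read_text())["state"] if status_path.exists() else "ABSENT"

if state in ("SUCCESS", "EXISTING"):
    data = np.loadtxt("results/metrics.csv", delimiter=",")  # hypothetical artifact
else:
    # Inline fallback: compute plot data within the visualization script itself.
    x = np.linspace(0, 1, 100)
    data = np.column_stack([x, x ** 2])

fig, ax = plt.subplots(figsize=(3.5, 2.5))
ax.plot(data[:, 0], data[:, 1])
fig.savefig("plots/example.png", dpi=300, bbox_inches="tight")  # assumes plots/ exists
fig.savefig("plots/example.pdf", bbox_inches="tight")
```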
After all plots are generated, create or update plots/plot_manifest.json. This file is the
mandatory interface to Phase 5 (Report). Its schema is fixed regardless of what tool generated
the plots.
For each plot, collect:
{
"plot_id": "descriptive_snake_case_name",
"files": {
"png": "plots/{name}.png",
"pdf": "plots/{name}.pdf"
},
"description": "One-sentence description of what the plot shows",
"section_hint": "results | methodology | validation | comparison | testing",
"caption": "Publication-ready figure caption (2-3 sentences). Include key quantitative findings.",
"markdown_snippet": "",
"source_context": "What code/data generated this plot (include results/ file path if applicable)",
"style": ["science", "nature"],
"dpi": 300,
"source_script": "plots/plot_{name}.py",
"source_function": "plot_{name}",
"generation_date": "YYYY-MM-DDTHH:MM:SS+00:00"
}
The style, dpi, source_script, source_function, and generation_date fields are required for downstream report generation (Phase 5). source_script must be an actual file path to the Python script that generated the plot.
Note: The `source_context` field provides a human-readable description. The `source_script` and `source_function` fields provide machine-readable paths for automated style validation by the Report phase (Step 0.5). Both must be populated.
Write the complete manifest to plots/plot_manifest.json:
{
"generated_at": "YYYY-MM-DD HH:MM",
"total_plots": N,
"plots": [ ...entries... ]
}
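As a sketch, the manifest could be assembled and written like this in Python; all entry values are illustrative, not prescribed:

```python
import json
from datetime import datetime, timezone

now = datetime.now(timezone.utc)
entry = {
    "plot_id": "loss_curve",
    "files": {"png": "plots/loss_curve.png", "pdf": "plots/loss_curve.pdf"},
    "description": "Training loss over epochs.",
    "section_hint": "results",
    "caption": "Training loss over 100 epochs, computed from results/metrics.csv.",
    "markdown_snippet": "![Training loss](plots/loss_curve.png)",
    "source_context": "plots/plot_loss_curve.py reading results/metrics.csv",
    "style": ["science", "nature"],
    "dpi": 300,
    "source_script": "plots/plot_loss_curve.py",
    "source_function": "plot_loss_curve",
    "generation_date": now.isoformat(timespec="seconds"),
}
manifest = {
    "generated_at": now.strftime("%Y-%m-%d %H:%M"),
    "total_plots": 1,
    "plots": [entry],
}
with open("plots/plot_manifest.json", "w") as f:
    json.dump(manifest, f, indent=2)
```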
Section hint vocabulary (controlled — Phase 5 uses this for placement):
- results — Key findings, main experimental outcomes
- methodology — Algorithmic diagrams, data pipeline illustrations
- validation — Comparison with known/expected values, error analysis
- comparison — Baseline vs. proposed, ablation studies
- testing — Test coverage, pass/fail distributions, edge case behavior

Caption guidelines:
- Publication-ready, 2-3 sentences, including key quantitative findings (per the `caption` field spec above).
Before presenting to the user, execute a lightweight quality checkpoint:
| Checklist Item | Criteria |
|---|---|
| Tier 1 coverage | All major functions have unit tests; no significant gaps |
| Edge case handling | Boundary conditions, degenerate inputs, and error paths are tested |
| Tier 2 status | Integration tests present; skipped with clear reason if results/ absent |
| Common Restrictions | plot_manifest.json present with all required fields (including style, dpi, source_script, source_function, generation_date), all plots in PNG + PDF/SVG, dependency spec updated |
| Style compliance (Python) | All Python-generated plots use scienceplots ['science', 'nature'] style; no Unicode in labels; Nature column widths |
| Result reproducibility | Tests are deterministic or use fixed seeds |
Conditional MAGI mini-review (if confidence is Medium or Low):
mcp__codex-cli__ask-codex(
prompt: "Review these research tests and visualizations. Focus on: {low_scoring_items}\n\n@{output_dir}/plots/plot_manifest.json\n@{output_dir}/tests/ (or equivalent test directory)\n@{output_dir}/results/pre_execution_status.json",
model: "gpt-5.4"
)
If `--claude-only`: Per §SubagentExec — B (AC, test reviewer): Read plot manifest, test files, pre_execution_status. Review focusing on {low_scoring_items}. Return structured text.
Write gate report to tests/phase_gate.md.
If the gate returns No-Go, fix identified issues before presenting. Maximum 1 fix iteration.
Present to the user:
- Add any new dependency to the project manifest via the ecosystem's package manager (`uv add`, `cargo add`, etc.) before using it.
- Always write `plots/plot_manifest.json` regardless of which language generated the plots.
- For Python, install scienceplots via `uv add SciencePlots`.
- For non-Python plots, the `style` field in the manifest should describe the equivalent style used (e.g., `["ggplot2-publication"]`). Report-level scienceplots enforcement applies only to matplotlib-generated plots; non-Python plots are exempt from scienceplots but must still meet the Common Restrictions (dual format, 300 dpi, proper labeling).