From magi-researchers
Orchestrates full research pipeline from Brainstorming to Reporting via Planning, Implementation, Testing & Visualization phases with user checkpoints. Configurable for physics, AI/ML, statistics, math domains, depth, and agent personas.
npx claudepluginhub axect/magi-researchers --plugin magi-researchers
This skill uses the workspace's default tool permissions.
Runs the complete research pipeline: Brainstorming → Planning → Implementation → Testing & Visualization → Reporting. Orchestrates all phases with user checkpoints between each.
/research "research topic" [--domain physics|ai_ml|statistics|mathematics|paper] [--weights '{"novelty":0.4,...}'|adaptive] [--depth low|medium|high|max] [--personas N] [--claude-only] [--substitute "Gemini -> Opus"] [--resume <output_dir>]
$ARGUMENTS — The research topic (required) and optional flags:
- --domain — Research domain (physics, ai_ml, statistics, mathematics, paper). Auto-inferred if omitted.
- --weights — Scoring mode for direction ranking:
  - adaptive: Claude analyzes the prompt and recommends weights for user confirmation.
  - See /research-brainstorm for full details.
- --depth — Controls brainstorm review depth (default: medium):
  - low — Skip cross-review, go directly to synthesis (fastest, lowest cost)
  - medium — Standard one-shot cross-review (default)
  - high — Cross-review + adversarial debate (most thorough, highest cost)
  - max — Hierarchical MAGI-in-MAGI: N persona subagents run parallel mini-MAGI pipelines, then meta-review + adversarial debate across all perspectives (deepest, highest cost)
- --personas N|auto — Number of domain-specialist subagents for --depth max (default: auto, range: 2-4). When auto, Claude analyzes the topic to determine the optimal persona count. Ignored for other depth levels.
- --claude-only — Replace all Gemini/Codex MCP calls with Claude Agent subagents across all phases. Use when external model endpoints are unavailable. Forwarded to all sub-skills automatically.
- --substitute "Agent -> Opus" — Replace a specific MAGI agent with Claude (Opus). Accepted: "Gemini -> Opus", "Codex -> Opus". Can be specified multiple times. Forwarded to all sub-skills. If both agents are substituted, equivalent to --claude-only. Mutually exclusive with --claude-only (--claude-only takes precedence).
- --resume <output_dir> — Resume an interrupted pipeline from a previous output directory. See Resume Protocol below.

Shared rules: Read ${CLAUDE_PLUGIN_ROOT}/shared/rules.md before starting. §MCP, §Visualization, §LaTeX, §PhaseGate, §Substitute apply to this skill. Inline fallback (if shared rules unavailable): Gemini models: gemini-3.1-pro-preview → gemini-2.5-pro → Claude. Codex: gpt-5.4. All math in LaTeX only (no Unicode). scienceplots['science','nature'], 300dpi PNG+PDF, Nature widths (3.5/7.2in). Use @filepath for MCP file refs; subagents use the Read tool.
See §MCP, §Visualization, §LaTeX in shared rules. Additionally:
- mcp__codex-cli__brainstorm for ideation, mcp__codex-cli__ask-codex for analysis/review.

ALL artifacts MUST be written under {output_dir}/. Never write files directly to the project root or any path outside outputs/.
- {output_dir} is the absolute path stored in .workspace.json at the output directory root.
- If {output_dir} is uncertain (e.g., after context compression), recover it:
  1. Glob for outputs/*/.workspace.json, select the most recently modified match.
  2. Read the file and extract output_dir.
  3. Use that {output_dir} path.

When this skill is invoked, execute the full research pipeline below. Always pause for user confirmation between phases.
See §PhaseGate in shared rules. Phase-specific checklists:
| Phase | Checklist Items |
|---|---|
| Plan | Completeness (all objectives addressed), methodology soundness, resource feasibility, risk identification |
| Implement | Code correctness, alignment with plan, error handling, dependency management |
| Execute | Exit code 0, results/ populated (or EXISTING/PARTIAL with user acknowledgment), pre_execution_status.json written |
| Test | Tier 1 unit test coverage, edge case handling, Common Restrictions fulfilled (plot_manifest.json, dual format, dependency spec), result reproducibility |
When --resume <output_dir> is provided, the pipeline skips initialization and infers the current phase from the presence of key artifact files in the output directory. This avoids requiring the LLM to maintain a separate state file — the artifacts themselves serve as checkpoints.
Unified semantics: --resume always accepts the workspace root directory (e.g., outputs/topic_20260309_v1/). Whether resuming a research pipeline or a write sub-pipeline, the same root path is used. The system infers the correct sub-context internally.
Phase inference rules (evaluated top-down; first match wins):
| Condition | Inference | Action |
|---|---|---|
| report.md exists | Pipeline complete | Inform user; offer to re-run specific phases |
| plots/plot_manifest.json exists | Test phase complete | Resume from the Report phase |
| results/pre_execution_status.json exists (or pre_execution_status.md for legacy v0.8.x) | Execute phase complete | Resume from the Test phase |
| src/ contains at least one source file | Implement phase complete | Resume from the Execute phase |
| plan/research_plan.md exists | Plan phase complete | Resume from the Implement phase |
| brainstorm/synthesis.md exists | Brainstorm phase complete | Resume from the Plan phase |
| None of the above | No phase complete | Start from the Brainstorm phase |
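The top-down rule table above can be read as a short chain of existence checks. The following is a minimal sketch, not the plugin's own code: the function names are hypothetical, and "at least one source file" is assumed here to mean any regular file under src/.

```python
from pathlib import Path


def has_source_file(src: Path) -> bool:
    # Assumption: any regular file under src/ counts as a source file.
    return src.is_dir() and any(p.is_file() for p in src.rglob("*"))


def infer_resume_phase(output_dir) -> str:
    """Return the phase to resume from; rules are checked top-down, first match wins."""
    root = Path(output_dir)
    results = root / "results"
    if (root / "report.md").is_file():
        return "complete"
    if (root / "plots" / "plot_manifest.json").is_file():
        return "report"
    # The .md variant covers legacy v0.8.x workspaces.
    if (results / "pre_execution_status.json").is_file() or (
        results / "pre_execution_status.md"
    ).is_file():
        return "test"
    if has_source_file(root / "src"):
        return "execute"
    if (root / "plan" / "research_plan.md").is_file():
        return "implement"
    if (root / "brainstorm" / "synthesis.md").is_file():
        return "plan"
    return "brainstorm"
```

Because later-phase artifacts are checked first, a workspace that contains both a plan and source files resumes from Execute rather than Implement.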
Checkpoint hash validation (when checkpoint files exist):
After determining the resume phase from artifact presence, check if a checkpoint file exists for the detected completed phase (e.g., plan/plan_checkpoint.json). If found:
input_hashes field

Checkpoint validation is additive — if no checkpoint file exists (e.g., from a pre-v0.9.0 run), fall back to the existing artifact-presence inference.
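As an illustration of comparing current files against a checkpoint's input_hashes, here is a minimal sketch. It assumes the "sha256:{hash}" string format shown in the checkpoint examples in this document; the function names are hypothetical, directory entries such as "src/" are skipped because the combined-hash scheme is not specified here, and what the pipeline does with a mismatch is left to the caller.

```python
import hashlib
import json
from pathlib import Path


def sha256_of_file(path: Path) -> str:
    """Return the digest in the "sha256:{hash}" form used by the checkpoint files."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(65536), b""):
            digest.update(chunk)
    return "sha256:" + digest.hexdigest()


def changed_inputs(output_dir, checkpoint_rel):
    """List recorded inputs whose current hash differs from the checkpoint.

    Returns None when no checkpoint file exists, so the caller can fall back
    to artifact-presence inference.
    """
    root = Path(output_dir)
    checkpoint_path = root / checkpoint_rel
    if not checkpoint_path.is_file():
        return None
    checkpoint = json.loads(checkpoint_path.read_text(encoding="utf-8"))
    changed = []
    for rel_path, recorded in checkpoint.get("input_hashes", {}).items():
        target = root / rel_path
        if target.is_dir():
            continue  # assumption: directory entries (e.g. "src/") are not checked
        if not target.is_file() or sha256_of_file(target) != recorded:
            changed.append(rel_path)
    return changed
```

A call such as changed_inputs(output_dir, "plan/plan_checkpoint.json") returns an empty list when every recorded input is unchanged.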
Resume procedure:
1. Read {output_dir}/.workspace.json to restore the absolute output path and metadata.
2. Use the Glob tool to check for each artifact in the order above.
3. Check whether brainstorm/personas.md or brainstorm/weights.json exist (to restore context).

Important: On resume, do NOT re-create the output directory or overwrite existing artifacts. Append or create only the artifacts for the resumed phase and beyond.
Before starting each phase (2 through 5), verify that the required predecessor artifacts exist and are non-empty. Use the Glob and Read tools for deterministic, tool-based validation — do not rely on memory or assumptions.
Required artifacts per phase:
| Phase | Required Artifacts | Validation Method |
|---|---|---|
| Plan | brainstorm/synthesis.md | Glob + Read first 3 lines (non-empty) |
| Implement | plan/research_plan.md | Glob + Read first 3 lines (non-empty) |
| Execute | At least one source file in src/, plan/research_plan.md with execution_cmd in frontmatter | Glob src/**/* + Read frontmatter |
| Test | At least one source file in src/, plan/research_plan.md | Glob src/**/* + Glob for plan. Note: results/pre_execution_status.json is optional — its absence means Tier 2 integration tests will be skipped (fall back to .md for legacy) |
| Report | brainstorm/synthesis.md, plan/research_plan.md, at least one source file in src/, plots/plot_manifest.json | Glob for each path |
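The contract table above amounts to a per-phase list of paths plus a non-empty check. A minimal sketch follows, with hypothetical names. As a simplification it applies the "first 3 lines non-empty" check to every required file, treats any regular file under src/ as a source file, and leaves out the execution_cmd frontmatter check for the Execute phase.

```python
from pathlib import Path

# Required files per phase, taken from the table above.
REQUIRED_FILES = {
    "plan": ["brainstorm/synthesis.md"],
    "implement": ["plan/research_plan.md"],
    "execute": ["plan/research_plan.md"],
    "test": ["plan/research_plan.md"],
    "report": [
        "brainstorm/synthesis.md",
        "plan/research_plan.md",
        "plots/plot_manifest.json",
    ],
}
NEEDS_SOURCE = {"execute", "test", "report"}


def first_lines_non_empty(path: Path, n: int = 3) -> bool:
    """True if the file exists and one of its first n lines has content."""
    if not path.is_file():
        return False
    with path.open(encoding="utf-8", errors="replace") as fh:
        for _, line in zip(range(n), fh):
            if line.strip():
                return True
    return False


def missing_artifacts(output_dir, phase: str) -> list:
    """Return the required artifacts that are absent or empty for the given phase."""
    root = Path(output_dir)
    missing = [
        rel for rel in REQUIRED_FILES[phase] if not first_lines_non_empty(root / rel)
    ]
    if phase in NEEDS_SOURCE:
        src = root / "src"
        if not (src.is_dir() and any(p.is_file() for p in src.rglob("*"))):
            missing.append("src/**/*")
    return missing
```

An empty return value means the phase may start; anything else names the predecessor artifacts to report on validation failure.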
On validation failure:
Parse $ARGUMENTS:
- If --domain is specified, use it; otherwise infer from topic keywords
- --resume <output_dir>: If provided, skip steps 2-6 and execute the Resume Protocol above. The pipeline will jump directly to the inferred phase.

Create the output directory:
outputs/{sanitized_topic}_{YYYYMMDD}_v{N}/
├── .workspace.json # Workspace anchor (absolute path)
├── brainstorm/
├── plan/
├── src/
├── tests/
└── plots/
Glob for outputs/{sanitized_topic}_{YYYYMMDD}_v*/ and set N = max existing + 1 (start at v1).
Write .workspace.json at the output directory root:
{
"output_dir": "{absolute_path_to_output_dir}",
"topic": "{original_topic}",
"domain": "{domain}",
"created_at": "{ISO-8601 timestamp}"
}
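The version numbering and workspace anchor described here could be sketched as follows. This is an illustration under assumptions: create_workspace and sanitize_topic are hypothetical names, and the sanitization rule (lowercase, non-alphanumerics collapsed to underscores) is a guess, since the actual rule is not given in this document.

```python
import json
import re
from datetime import datetime, timezone
from pathlib import Path


def sanitize_topic(topic: str) -> str:
    # Hypothetical rule: lowercase, runs of non-alphanumerics become "_".
    return re.sub(r"[^a-z0-9]+", "_", topic.lower()).strip("_")


def create_workspace(base_dir, topic: str, domain: str, today=None) -> Path:
    """Create outputs/{sanitized_topic}_{YYYYMMDD}_v{N}/ and write .workspace.json."""
    today = today or datetime.now().strftime("%Y%m%d")
    outputs = Path(base_dir).resolve() / "outputs"
    outputs.mkdir(parents=True, exist_ok=True)
    prefix = f"{sanitize_topic(topic)}_{today}_v"

    # N = max existing version + 1, starting at v1.
    versions = []
    for candidate in outputs.glob(prefix + "*"):
        suffix = candidate.name[len(prefix):]
        if candidate.is_dir() and suffix.isdigit():
            versions.append(int(suffix))
    output_dir = outputs / f"{prefix}{max(versions, default=0) + 1}"

    for sub in ("brainstorm", "plan", "src", "tests", "plots"):
        (output_dir / sub).mkdir(parents=True)

    anchor = {
        "output_dir": str(output_dir),  # absolute path
        "topic": topic,
        "domain": domain,
        "created_at": datetime.now(timezone.utc).isoformat(),
    }
    (output_dir / ".workspace.json").write_text(
        json.dumps(anchor, indent=2), encoding="utf-8"
    )
    return output_dir
```

Running it twice on the same day for the same topic yields a _v1 directory and then a _v2 directory, each with its own anchor file.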
Use pwd or equivalent to resolve the absolute path. This file anchors all subsequent file writes across all phases.
- If a domain template exists in ${CLAUDE_PLUGIN_ROOT}/templates/domains/, read it as context.
- --weights: If provided, validate and store. If omitted, domain defaults will be used by the brainstorm sub-skill.
- --depth: Accept low, medium (default), high, or max.
- --personas N|auto: Accept integer 2-4 or the string auto (default: auto). Only used when --depth max; ignored otherwise.
  - auto: Defer persona count determination to the Brainstorm phase (sub-skill Step 0b), where Claude analyzes the topic to select the optimal N.
- --claude-only: Boolean flag (default: false). When present, all Gemini/Codex MCP calls across all phases are replaced with Claude Agent subagents. This flag is forwarded to every sub-skill invocation.
- --substitute: Accept zero or more --substitute "Agent -> Opus" flags. Valid agent names: Gemini, Codex. Valid target: Opus. Forwarded to every sub-skill invocation. Mutually exclusive with --claude-only (--claude-only takes precedence). If both agents are substituted, treat as --claude-only.

Flag forwarding: --claude-only and --substitute flags are forwarded to every sub-skill invocation (brainstorm, plan, implement, execute, test, report). Each sub-skill applies the replacement rules to its own MCP calls.
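The interaction between --claude-only and --substitute can be made concrete with a small resolver. This is a sketch only: resolve_agent_flags is a hypothetical helper, and the skill itself is interpreted by Claude rather than parsed by a program like this.

```python
import argparse

VALID_AGENTS = {"Gemini", "Codex"}


def resolve_agent_flags(argv):
    """Return (claude_only, substituted_agents) following the rules above."""
    parser = argparse.ArgumentParser(add_help=False, allow_abbrev=False)
    parser.add_argument("--claude-only", action="store_true")
    parser.add_argument("--substitute", action="append", default=[])
    # Other flags and the topic are left untouched in the unparsed remainder.
    args, _unparsed = parser.parse_known_args(argv)

    substituted = set()
    for spec in args.substitute:
        agent, sep, target = (part.strip() for part in spec.partition("->"))
        if sep != "->" or agent not in VALID_AGENTS or target != "Opus":
            raise ValueError(f"invalid --substitute value: {spec!r}")
        substituted.add(agent)

    # --claude-only takes precedence; substituting both agents is equivalent to it.
    if args.claude_only or substituted == VALID_AGENTS:
        return True, []
    return False, sorted(substituted)
```

The resolved pair is what would be forwarded to each sub-skill, so every phase applies the same replacement rules.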
max; show auto if no explicit --personas was given), claude-only mode (if active), and agent substitutions (if any).

Execute the /magi-researchers:research-brainstorm workflow, forwarding all flags: --domain, --weights, --depth, --personas (only when --depth max), --claude-only (if active), and --substitute (if any).
Step 0 — Setup:
- brainstorm/weights.json

Step 0a — Research Direction Document (Phase 1-4):
- brainstorm/research_direction.md

Step 0b — Persona Casting (informed by research_direction.md):
- brainstorm/personas.md

Step 0c — Adaptive Weights (if --weights adaptive):
- research_direction.md to recommend scoring weights for user confirmation
- brainstorm/weights.json

Step 0d — Pre-flight (persona-targeted searches):
- brainstorm/research_direction.md (Sections 5 and 5a) as baseline context
- brainstorm/preflight_context.md, per-persona briefing files

Step 1a — Parallel Brainstorming (with personas):
- brainstorm/gemini_ideas.md and brainstorm/codex_ideas.md

Step 1b — Cross-Check (--depth medium|high):
- brainstorm/gemini_review_of_codex.md
- brainstorm/codex_review_of_gemini.md
- Skipped when --depth low

Step 1b+ — Adversarial Debate (--depth high only):
- brainstorm/debate_round2_gemini.md, brainstorm/debate_round2_codex.md

Steps 1-max-a~d — Hierarchical MAGI-in-MAGI (--depth max only):
- brainstorm/persona_{i}/
- brainstorm/meta_review_*.md, brainstorm/meta_debate_*.md
- brainstorm/synthesis.md

Step 1c — Synthesis (with weighted scoring):
- weights.json to rank directions
- brainstorm/synthesis.md with weighted scores and debate resolution (if applicable)

Checkpoint: Emit brainstorm/brainstorm_checkpoint.json:
{
"schema_version": "1.0.0",
"phase": "brainstorm",
"completed_at": "{ISO-8601 timestamp}",
"input_hashes": {},
"output_artifacts": ["brainstorm/synthesis.md", "brainstorm/gemini_ideas.md", "brainstorm/codex_ideas.md"]
}
>>> USER CHECKPOINT: Confirm research direction <<<
Artifact Contract: Verify brainstorm/synthesis.md exists and is non-empty (Glob + Read first 3 lines). On failure, follow the Artifact Contract Protocol above.
Step 2a — Plan Drafting:
Write plan/research_plan.md, beginning with a YAML frontmatter block:
---
title: "{research topic}"
domain: "{physics|ai_ml|statistics|mathematics|paper}"
languages: ["{primary language(s) planned}"]
ecosystem: ["{package manager(s) planned}"]
execution_cmd: "" # filled in after the Implement phase
dry_run_cmd: "" # filled in after the Implement phase
expected_outputs: [] # filled in after the Implement phase
estimated_runtime: "" # filled in after the Implement phase
---
Leave execution fields empty for now — the Implement phase will populate them.

Step 2b — Murder Board:
Submit the research plan to Gemini as a hostile reviewer to stress-test for critical flaws:
mcp__gemini-cli__ask-gemini(
prompt: "You are a hostile but fair research reviewer. Your job is to find fatal flaws in this research plan — flaws that would cause the research to fail, produce invalid results, or waste significant effort.\n\nAttack the plan on these dimensions:\n1. **Methodological flaws**: Are there fundamental errors in the proposed approach?\n2. **Missing assumptions**: What unstated assumptions could invalidate results?\n3. **Scalability risks**: Will this approach break on realistic problem sizes?\n4. **Data/resource gaps**: Are required datasets, compute, or libraries actually available?\n5. **Novelty concerns**: Has this exact approach been tried and failed before?\n\nFor each flaw found, rate its severity (Critical/Major/Minor) and explain the likely failure mode.\n\nResearch Plan:\n@{output_dir}/plan/research_plan.md",
model: "gemini-3.1-pro-preview" // fallback chain applies
)
Save to plan/murder_board.md.
If --claude-only or Gemini is substituted: Per §SubagentExec — Adversarial-Critical reviewer: Read research_plan.md. Attack on methodological flaws, missing assumptions, scalability risks, data/resource gaps, novelty concerns. Save to plan/murder_board.md.
Step 2c — Mitigations:
Claude reviews each flaw from the murder board and documents a mitigation strategy:
- Confidence rating: High, Medium, Low
- For any mitigation with Low confidence, perform one revision pass: update the relevant section of research_plan.md and re-assess.
- Save to plan/mitigations.md.

Phase Gate: Plan — Execute the Phase Gate Protocol with Plan checklist.
Checkpoint: Emit plan/plan_checkpoint.json:
{
"schema_version": "1.0.0",
"phase": "plan",
"completed_at": "{ISO-8601 timestamp}",
"input_hashes": {
"brainstorm/synthesis.md": "sha256:{hash}"
},
"output_artifacts": ["plan/research_plan.md", "plan/murder_board.md", "plan/mitigations.md", "plan/phase_gate.md"]
}
>>> USER CHECKPOINT: Approve research plan <<< Present to user: plan summary, murder board highlights, mitigations, and gate result.
Artifact Contract: Verify plan/research_plan.md exists and is non-empty (Glob + Read first 3 lines). On failure, follow the Artifact Contract Protocol above.
Execute the /magi-researchers:research-implement workflow:
- Use research_plan.md to implement code in src/

Phase Gate: Implement — Execute the Phase Gate Protocol with Implement checklist.
Checkpoint: Emit src/implement_checkpoint.json:
{
"schema_version": "1.0.0",
"phase": "implement",
"completed_at": "{ISO-8601 timestamp}",
"input_hashes": {
"plan/research_plan.md": "sha256:{hash}"
},
"output_artifacts": ["src/phase_gate.md"]
}
>>> USER CHECKPOINT: Review implementation <<<
Artifact Contract: Verify at least one source file in src/ and plan/research_plan.md with
execution_cmd in frontmatter. On failure, follow the Artifact Contract Protocol above.
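The frontmatter part of this contract, a non-empty execution_cmd in plan/research_plan.md, could be checked with a small standard-library parser. The sketch below is an assumption-laden illustration: it only understands the double-quoted scalar fields shown in the plan template (lists such as languages are ignored), and a real implementation would more likely use a YAML library.

```python
import re


def read_frontmatter(text: str) -> dict:
    """Extract double-quoted scalar fields from a leading '---' ... '---' block."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}
    fields = {}
    for line in lines[1:]:
        if line.strip() == "---":
            return fields
        # Matches e.g.  execution_cmd: "python src/main.py"  # trailing comment
        match = re.match(r'^(\w+):\s*"([^"]*)"', line)
        if match:
            fields[match.group(1)] = match.group(2)
    return {}  # no closing delimiter found


def has_execution_cmd(plan_text: str) -> bool:
    """True once the Implement phase has filled in execution_cmd."""
    return bool(read_frontmatter(plan_text).get("execution_cmd", "").strip())
```

With the template as first written by the Plan phase, has_execution_cmd returns False because the field is still the empty string; the command used in any test of this sketch is a made-up example.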
Execute the /magi-researchers:research-execute workflow:
- Read execution_cmd and dry_run_cmd from plan/research_plan.md YAML frontmatter
- results/run_log.txt
- results/pre_execution_status.json (state: SUCCESS / FAILED / PARTIAL / EXISTING)

Phase Gate: Execute — Execute the Phase Gate Protocol with Execute checklist.
Checkpoint: Emit results/execute_checkpoint.json:
{
"schema_version": "1.0.0",
"phase": "execute",
"completed_at": "{ISO-8601 timestamp}",
"input_hashes": {
"plan/research_plan.md": "sha256:{hash}",
"src/": "sha256:{combined_hash}"
},
"output_artifacts": ["results/pre_execution_status.json", "results/run_log.txt"]
}
>>> USER CHECKPOINT: Review execution results <<< Present: execution status, generated artifacts, any errors encountered.
Artifact Contract: Verify at least one source file exists in src/ (Glob src/**/*) and
plan/research_plan.md exists. results/pre_execution_status.json is optional — its absence causes
Tier 2 integration tests to be skipped (not a pipeline failure). Fall back to .md for legacy workspaces.
Execute the /magi-researchers:research-test workflow:
Step 0 — Workspace Detection:
- Scan src/ for package managers and file extensions to detect languages and ecosystems
- Read results/pre_execution_status.json to determine data availability for Tier 2 tests

Step 1 — Test Strategy Discussion:

Step 2 — Test Implementation:
- results/ availability)

Step 3 — Visualization:
- results/ if available; compute inline otherwise
- plot_manifest.json

Step 4 — Plot Manifest:
- plots/plot_manifest.json using the fixed schema

Phase Gate: Test — Execute the Phase Gate Protocol with Test checklist.
Checkpoint: Emit tests/test_checkpoint.json:
{
"schema_version": "1.0.0",
"phase": "test",
"completed_at": "{ISO-8601 timestamp}",
"input_hashes": {
"src/": "sha256:{combined_hash}",
"results/pre_execution_status.json": "sha256:{hash} (omit key if file absent)"
},
"output_artifacts": ["plots/plot_manifest.json"]
}
If results/pre_execution_status.json does not exist, omit it from input_hashes rather than storing a placeholder.
>>> USER CHECKPOINT: Review test results and visualizations <<<
Artifact Contract: Verify all of the following exist (Glob for each): brainstorm/synthesis.md, plan/research_plan.md, at least one source file in src/, and plots/plot_manifest.json. On failure, follow the Artifact Contract Protocol above.
If --depth max was used: also check for brainstorm/all_conclusions.md, brainstorm/meta_review_gemini.md, brainstorm/meta_review_codex.md, brainstorm/meta_debate_gemini.md, brainstorm/meta_debate_codex.md (these replace debate_round2_*.md files)
Execute the /magi-researchers:research-report workflow:
Step 0 — Gather & Health Check:
- plots/plot_manifest.json (create if missing but plots exist)

Step 0.5 — Plot Style Validation:
- ['science', 'nature'] style)
- plots/plot_manifest.json with style metadata

Step 1 — Content Assembly & Plot Mapping:
- section_hint tags

Step 2 — Report Draft with Integrated Plots:
- report.md using ${CLAUDE_PLUGIN_ROOT}/templates/report_template.md structure

Step 3 — Gap Detection & Plot Generation Loop (max 2 iterations):
- plots/ → update manifest)

Step 4 — MAGI Traceability Review (parallel cross-verification):
- If brainstorm/personas.md exists, prepend the assigned personas to Gemini and Codex review prompts for continuity
- If --claude-only or --substitute is active, substituted agents use Claude Agent subagents with their respective cognitive styles (Creative-Divergent for Gemini, Analytical-Convergent for Codex). Non-substituted agents use their MCP tools normally.

Step 5 — Write Final Report:
- report.md with version tracking (report_versions.json)
- 1.1.0; each version entry includes a structured changes array tracking what was modified

Step 6 — Feedback Loop:
>>> USER CHECKPOINT: Review and finalize report <<<
Step R1: Checkpoint
Present report location, version, any Tier 1/2 revisions already applied. Options:
Step R2: Pipeline Re-entry (max 2 iterations)
Classify re-entry point from feedback:
| Trigger | Re-entry Phase | Phases to re-run |
|---|---|---|
| Methodology/approach change | Plan | Plan → Implement → Execute → Test → Report |
| Code bug, new analysis | Implement | Implement → Execute → Test → Report |
| Different parameters, rerun | Execute | Execute → Test → Report |
| New visualization strategy | Test | Test → Report |
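Every row of the table above re-runs the re-entry phase and everything downstream of it, so the third column follows from the second. A minimal sketch, with hypothetical trigger keys standing in for the feedback classification that Claude performs:

```python
PIPELINE = ["Brainstorm", "Plan", "Implement", "Execute", "Test", "Report"]

# Hypothetical labels for the four trigger rows of the table.
REENTRY_BY_TRIGGER = {
    "methodology_change": "Plan",
    "code_bug_or_new_analysis": "Implement",
    "parameter_change_or_rerun": "Execute",
    "new_visualization": "Test",
}


def phases_to_rerun(trigger: str) -> list:
    """Return the re-entry phase and all phases after it, in pipeline order."""
    start = PIPELINE.index(REENTRY_BY_TRIGGER[trigger])
    return PIPELINE[start:]
```

For example, phases_to_rerun("parameter_change_or_rerun") gives Execute, Test, Report, matching the third row.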
Present re-entry plan to user for confirmation.
On confirmation:
a. Archive report.md → report_v{N}.md, update report_versions.json
b. Delete report.md (reset resume protocol)
Rollback note: If the re-entry pipeline fails before a new report.md is produced, inform the user that the previous report is preserved as report_v{N}.md and can be restored by copying it back to report.md.
c. Execute pipeline from re-entry phase
d. Report skill generates new version → return to Step R1
Announce completion with:
- Use --resume <output_dir> to resume an interrupted pipeline from the last completed phase