From magi-researchers
Executes research code in src/ to generate artifacts in results/, reading commands from plan/research_plan.md YAML frontmatter or execution_manifest.json. Phase 3.5 of research pipeline with prerequisite checks.
```
npx claudepluginhub axect/magi-researchers --plugin magi-researchers
```

This skill uses the workspace's default tool permissions.
Executes the research code in src/ to generate result artifacts in results/. This is Phase 3.5
of the research pipeline, sitting between Implementation (Phase 3) and Testing & Visualization (Phase 4).
Reads execution commands deterministically from the YAML frontmatter of plan/research_plan.md — no
keyword heuristics, no entry-point guessing. The full run command is defined once during planning and
executed here.
```
/research-execute [path/to/output/dir]
```
`$ARGUMENTS` — Optional path to the research output directory. If not provided, uses the most recent `outputs/*/` directory.

When `--claude-only` is active, there are no Gemini/Codex calls in this skill. All steps are performed by Claude directly.
Resolve the output directory (`$ARGUMENTS` or the most recent `outputs/*/`). Locate `plan/research_plan.md` and parse the YAML frontmatter:
```yaml
---
languages: ["rust", "python"]
ecosystem: ["cargo", "uv"]
execution_cmd: "bash run_all.sh"
dry_run_cmd: "bash run_all.sh --dry-run"
expected_outputs:
  - "results/metrics.csv"
  - "results/checkpoint.pt"
estimated_runtime: "~30 minutes"
---
```
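As an illustration, extracting this frontmatter block before handing it to a YAML parser might look like the following sketch (the function name is hypothetical; parsing the extracted text is left to a YAML library such as `yaml.safe_load`):

```python
def extract_frontmatter(markdown_text):
    """Return the raw YAML frontmatter block from a markdown document,
    or None if the document has no frontmatter.

    The block must open with '---' on the first line and close at the
    next line that is exactly '---'.
    """
    lines = markdown_text.splitlines()
    if not lines or lines[0].strip() != "---":
        return None
    for i in range(1, len(lines)):
        if lines[i].strip() == "---":
            return "\n".join(lines[1:i])
    return None  # opening fence was never closed
```

The returned string can then be parsed to obtain `execution_cmd`, `dry_run_cmd`, and the other fields above.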
execution_manifest.json: Check if execution_manifest.json exists in the output directory root. If it does, read execution fields from this file instead of the YAML frontmatter:
```json
{
  "schema_version": "1.0.0",
  "languages": ["rust", "python"],
  "ecosystem": ["cargo", "uv"],
  "execution_cmd": "bash run_all.sh",
  "dry_run_cmd": "bash run_all.sh --dry-run",
  "expected_outputs": [
    {"path": "results/metrics.csv", "required": true},
    {"path": "results/checkpoint.pt", "required": false}
  ],
  "estimated_runtime": "~30 minutes"
}
```
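A minimal sketch of this lookup order, assuming a field-by-field merge (the skill may instead ignore the frontmatter entirely when a manifest is present; the function name is hypothetical):

```python
import json
import os

def load_execution_config(output_dir, frontmatter_fields):
    """Merge execution settings, letting execution_manifest.json take
    precedence over YAML frontmatter fields."""
    manifest_path = os.path.join(output_dir, "execution_manifest.json")
    if os.path.exists(manifest_path):
        with open(manifest_path) as f:
            manifest = json.load(f)
        # Manifest fields win; frontmatter fills in anything the manifest omits.
        return {**frontmatter_fields, **manifest}
    return dict(frontmatter_fields)  # backward compatibility
```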
If `execution_manifest.json` exists, it takes precedence over YAML frontmatter fields. If it does not exist, fall back to the YAML frontmatter (backward compatibility).

If neither source defines `execution_cmd`: announce the problem to the user and ask them to provide the execution command manually. Do not guess. Suggest adding the frontmatter to `research_plan.md` following the schema above.

Verify that `src/` exists and contains at least one file.

Check if `results/` already exists and contains at least one file that is not `run_log.txt`, `pre_execution_status.json`, or `pre_execution_status.md` (legacy):
Glob: results/**/*
Exclusion: Exclude results/.staging/ from the existence check. Files under .staging/ are incomplete and must not trigger the 'results already exist' early-exit path.
If populated:

- Hash the contents of `src/` and `plan/research_plan.md`. Compare against hashes stored in `results/.source_hashes.json` (if it exists).
- If hashes match: announce "results/ already contains artifacts and source code is unchanged. Skipping re-execution."
- If hashes differ or `.source_hashes.json` is missing: announce "results/ contains artifacts but source code has changed since they were generated. Re-execution recommended." Ask the user: "(a) Re-execute with current code, or (b) Keep existing results?"
- Write `results/pre_execution_status.json` (if not already present) with the canonical EXISTING schema:
```json
{
  "state": "EXISTING",
  "error_class": null,
  "severity": null,
  "retryable": false,
  "downstream_allowed": true,
  "traceback_ref": null,
  "next_action": "proceed"
}
```
If dry_run_cmd is specified in the frontmatter, run it first as a fast sanity check:
```
{dry_run_cmd} 2>&1 | tee results/dry_run_log.txt
```
Timeout: 60 seconds.
| Outcome | Action |
|---|---|
| Exit 0 | Continue to Step 3 |
| Non-zero exit | Read results/dry_run_log.txt, extract the traceback |
| Timeout | Kill process; report to user; ask whether to proceed to full run anyway |
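The dry-run step above could be sketched as follows (the function name and the timeout marker text are assumptions; the skill itself shells out and tees the log):

```python
import subprocess

def run_dry_run(dry_run_cmd, log_path, timeout_s=60):
    """Run the dry-run command under a timeout, write combined output to
    the log, and classify the outcome for the table above.
    Returns ("ok" | "failed" | "timeout", exit_code_or_None)."""
    try:
        proc = subprocess.run(
            dry_run_cmd,
            shell=True,
            timeout=timeout_s,
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,
            text=True,
        )
    except subprocess.TimeoutExpired:
        with open(log_path, "w") as f:
            f.write("DRY RUN TIMED OUT\n")  # marker text is an assumption
        return "timeout", None
    with open(log_path, "w") as f:
        f.write(proc.stdout or "")
    return ("ok" if proc.returncode == 0 else "failed"), proc.returncode
```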
On dry-run failure:

- If the cause is trivial (e.g., missing `results/` subdirectory, simple import error): attempt one auto-fix, then re-run the dry-run.
- Otherwise: write FAILED state to `results/pre_execution_status.json`. Do NOT attempt auto-fix. Report to user with full traceback and recommend investigating the root cause before retrying.

If `dry_run_cmd` is not specified, skip this step and proceed directly to Step 3.
Before executing the full run, announce:
```
Ready to execute:
Command: {execution_cmd}
Estimated runtime: {estimated_runtime or "unknown"}
Output will be captured to: results/run_log.txt
```
Pause for user confirmation before running.
If estimated_runtime suggests a long job (> 15 minutes), add:
⚠ This job may take a long time. If you prefer to run it manually:
1. Run externally: {execution_cmd}
2. Copy results to the `results/` directory, then call `/research-execute [output_dir]` — the skill will detect existing results and skip re-execution automatically (Step 1 Early Exit).
Wait for explicit user confirmation before executing.
Create results/ directory if it does not exist, then run:
Manifest overrides: If `execution_manifest.json` was loaded in Step 0:

- If `cwd` is specified, cd to that directory before executing the command.
- If `env` is specified (an object of key-value pairs), prepend each as an environment variable export to the command (e.g., `FOO=bar BAZ=qux {execution_cmd}`).
- If `timeout_override_ms` is specified, use it instead of the default 30-minute timeout below.

Command validation: Before executing, inspect `execution_cmd` for shell metacharacters (`;`, `&&`, `||`, `|`, `$(`, `` ` ``, `>`, `<`, `&`). If any are found beyond simple pipes to `tee`, warn the user and require explicit confirmation before proceeding. Validate that the `cwd` field (if present) is a relative subdirectory path with no `..` traversal and that the directory exists.
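A sketch of the validation just described (the exact tee-exemption pattern and both helper names are assumptions):

```python
import os
import re

SUSPICIOUS = (";", "&&", "||", "|", "$(", "`", ">", "<", "&")

def needs_confirmation(execution_cmd):
    """True if the command contains shell metacharacters beyond a simple
    trailing pipe into tee."""
    cmd = re.sub(r"\|\s*tee\s+[\w./-]+\s*$", "", execution_cmd)
    return any(tok in cmd for tok in SUSPICIOUS)

def cwd_is_valid(cwd, output_dir):
    """cwd must be a relative subdirectory with no '..' traversal, and
    must exist under the output directory."""
    if os.path.isabs(cwd) or ".." in cwd.replace("\\", "/").split("/"):
        return False
    return os.path.isdir(os.path.join(output_dir, cwd))
```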
Execute in an isolated process group to prevent orphaned child processes on timeout:
```
setsid bash -c '{execution_cmd} 2>&1 | tee results/run_log.txt'
```
Timeout: 30 minutes (adjust based on estimated_runtime if provided and > 30 min, or timeout_override_ms from manifest).
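In Python, the same isolated-process-group execution with a staggered TERM-then-KILL teardown might look like this (the grace-period length and function name are assumptions):

```python
import os
import signal
import subprocess
import time

def run_in_process_group(cmd, log_path, timeout_s, grace_s=5.0):
    """Execute cmd in a new session (the setsid equivalent) so a timeout
    can terminate the entire process tree. Returns the exit code, or
    None on timeout."""
    with open(log_path, "w") as log:
        proc = subprocess.Popen(
            ["bash", "-c", cmd],
            stdout=log,
            stderr=subprocess.STDOUT,
            start_new_session=True,  # child leads its own process group
        )
    try:
        return proc.wait(timeout=timeout_s)
    except subprocess.TimeoutExpired:
        try:
            pgid = os.getpgid(proc.pid)
            os.killpg(pgid, signal.SIGTERM)      # polite shutdown first
            time.sleep(grace_s)
            if proc.poll() is None:
                os.killpg(pgid, signal.SIGKILL)  # then forceful
        except ProcessLookupError:
            pass  # process tree already exited
        proc.wait()
        with open(log_path, "a") as f:
            f.write("\nEXECUTION TIMED OUT. Process group terminated.\n")
        return None
```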
On timeout — staggered teardown:

- Send SIGTERM to the process group: `kill -TERM -$PGID`
- After a grace period, force-kill: `kill -KILL -$PGID`
- Append "EXECUTION TIMED OUT. Process group terminated." to `results/run_log.txt`

Atomic results staging: To prevent half-written results from triggering false early-exit on subsequent runs:

- Create the staging directory: `mkdir -p results/.staging/`
- Set `RESULTS_DIR=results/.staging/` (if the execution script respects it) or configure output paths to write to `results/.staging/`
- On success, move artifacts from `results/.staging/` to `results/`: `mv results/.staging/* results/`, then `rmdir results/.staging/`
- On failure or timeout, leave partial artifacts in `results/.staging/` (they will not trigger early-exit in Step 1)
- Note in `pre_execution_status.json` that partial results are in `.staging/`

Note: If the execution script writes directly to paths that cannot be redirected, skip atomic staging and document this in the run log.
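The success-path promotion out of the staging directory could be sketched as (the helper name is hypothetical):

```python
import os
import shutil

def finalize_staging(results_dir):
    """On success, promote artifacts from results/.staging/ into results/
    and remove the now-empty staging directory. Returns moved names."""
    staging = os.path.join(results_dir, ".staging")
    if not os.path.isdir(staging):
        return []  # staging was never used (or the script wrote directly)
    moved = []
    for name in sorted(os.listdir(staging)):
        shutil.move(os.path.join(staging, name),
                    os.path.join(results_dir, name))
        moved.append(name)
    os.rmdir(staging)  # must be empty after the moves
    return moved
```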
| Outcome | Detection | Action |
|---|---|---|
| Success | Exit code 0 | Continue to Step 5 |
| Runtime error | Non-zero exit | Step 4-FAIL path |
| Timeout | Still running | Kill process → Step 4-TIMEOUT path |
Step 4-FAIL path (non-zero exit):

- Read `results/run_log.txt` and extract the final traceback.
- Write FAILED state to `results/pre_execution_status.json`. Do NOT attempt auto-fix. Report to user with full traceback and recommend investigating the root cause before retrying.

Step 4-TIMEOUT path:

- Append "EXECUTION TIMED OUT." to `results/run_log.txt`.
- Inventory what `results/` contains so far. Also check `results/.staging/` — if atomic staging was active, partial artifacts may be there instead of `results/`. Include both locations in the inventory presented to the user.
- Write PARTIAL state to `results/pre_execution_status.json` → proceed to Step 4-PARTIAL.

Step 4-PARTIAL (failure or timeout):
Write results/pre_execution_status.json:
```json
{
  "state": "FAILED | PARTIAL",
  "error_class": "dependency|compilation|runtime|timeout|resource|fatal|unknown",
  "severity": "recoverable|blocking|fatal",
  "retryable": true,
  "downstream_allowed": true | false,
  "traceback_ref": "results/run_log.txt",
  "next_action": "retry|abort|user_decision"
}
```
Choose the appropriate `state`, `error_class`, and `severity` based on the failure mode:

- Timeout with partial artifacts: `"state": "PARTIAL"`, `"error_class": "timeout"`
- Non-zero exit with usable partial artifacts: `"state": "PARTIAL"`
- Non-zero exit with nothing usable: `"state": "FAILED"`

Set `"downstream_allowed": true` if partial artifacts exist that downstream phases can use, `false` if nothing usable was produced. Set `"retryable": true` for transient failures (timeout, resource), `false` for deterministic failures (compilation, logic).

Announce clearly to the user. Do NOT block the pipeline — proceed to Step 6.
If execution succeeded:
Glob results/**/* and categorize by extension.
If expected_outputs is specified in frontmatter, verify each file exists. Report any missing ones.
Silent failure detection: If exit code was 0 but one or more required: true expected outputs are missing, treat this as state PARTIAL (not SUCCESS). Write pre_execution_status.json with "state": "PARTIAL", "error_class": "silent_failure", "severity": "recoverable", and "downstream_allowed": true. Announce the discrepancy to the user.
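The silent-failure rule can be expressed as a small classifier (the function name is hypothetical; injecting `exists` keeps it testable, and non-zero exits are collapsed to FAILED here for brevity even though the Step 4 paths distinguish PARTIAL):

```python
import os

def classify_success(exit_code, expected_outputs, exists=os.path.exists):
    """Exit code 0 with any required expected output missing is PARTIAL
    (a silent failure), not SUCCESS."""
    if exit_code != 0:
        return "FAILED"
    missing = [
        entry["path"]
        for entry in expected_outputs
        if entry.get("required", True) and not exists(entry["path"])
    ]
    return "PARTIAL" if missing else "SUCCESS"
```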
Write results/pre_execution_status.json:
```json
{
  "state": "SUCCESS",
  "error_class": null,
  "severity": null,
  "retryable": false,
  "downstream_allowed": true,
  "traceback_ref": "results/run_log.txt",
  "next_action": "proceed"
}
```
Note for downstream consumers: When reading `pre_execution_status.json`, always check the `state` field value — do not treat file existence alone as an indicator of success. See `research-test` Step 0 for the correct guard logic.
Legacy fallback: If `pre_execution_status.md` exists (legacy v0.8.x workspace), read it and treat any line containing SUCCESS/FAILED/PARTIAL/EXISTING as the state. New runs always write `.json`.
Save source fingerprints for future staleness detection:

- Hash all files under `src/` plus `plan/research_plan.md`
- Write `results/.source_hashes.json`:
```json
{
  "generated_at": "ISO-8601 timestamp",
  "execution_cmd": "{execution_cmd}",
  "hashes": {
    "src/main.py": "sha256:abc123...",
    "plan/research_plan.md": "sha256:def456..."
  }
}
```
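A sketch of writing this fingerprint file (paths follow the schema above; the helper name is hypothetical and the skill may hash files differently):

```python
import glob
import hashlib
import json
import os
from datetime import datetime, timezone

def write_source_hashes(execution_cmd, out_path):
    """Write sha256 fingerprints of all files under src/ plus
    plan/research_plan.md, for the Step 1 staleness check."""
    candidates = sorted(glob.glob("src/**/*", recursive=True))
    candidates.append("plan/research_plan.md")
    hashes = {}
    for path in candidates:
        if os.path.isfile(path):
            with open(path, "rb") as f:
                digest = hashlib.sha256(f.read()).hexdigest()
            hashes[path] = f"sha256:{digest}"
    payload = {
        "generated_at": datetime.now(timezone.utc).isoformat(),
        "execution_cmd": execution_cmd,
        "hashes": hashes,
    }
    with open(out_path, "w") as f:
        json.dump(payload, f, indent=2)
    return payload
```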
Announce success with the artifact summary.
Present to the user:

- The executed command: `{execution_cmd}`
- The next phase: Testing & Visualization (`/research-test`) when ready

Notes:

- For long jobs run externally, copy artifacts into `results/` before calling this skill. Step 1 (Early Exit) will detect the populated `results/` and skip re-execution automatically.
- Do not modify `src/` files during this phase. If errors require code changes, roll back to Phase 3 (Implement).