Skill

plan

From research

Scans codebase, proposes metric/guard/agent config, writes program.md run spec, and profiles bottlenecks via cProfile.

Python

ai-ml

performance

Popularity

Parent stars

Parent forks

Invocation

How this skill is triggered — by the user, by Claude, or both

Slash command

/research:plan <goal> | <file.py> [out.md] [--team]

User invocable

Model invocation disabled

Inline context

Effort: medium

Argument hint<goal> | <file.py> [out.md] [--team]

Tool Access

This skill is limited to the following tools:

ReadWriteEditBashGrepGlobAgentTaskCreateTaskUpdateAskUserQuestion

Context Preview

The summary Claude sees in its skill listing — used to decide when to auto-load this skill

SKILL.md

278 lines · ~4.4k tokens

Stats

LanguagePython

Parent stars23

Parent forks3

MaintenanceExcellent

Last CommitJul 15, 2026

Actions

View Source View Plugin View on GitHub View README

Agent Resolution

Environment precondition — CLAUDE_PLUGIN_ROOT set automatically when skill runs via plugin manager. Fallback plugins/research resolves only from project root. Neither resolves (bare .claude/ copy invoked from subdirectory) → bin/ scripts return empty strings silently.

bin/ scripts this skill depends on (deployed inside ${CLAUDE_PLUGIN_ROOT}/bin/): resolve_shared.py, make_run_dir.py. Each call below followed by explicit empty-result guard — silent failure surfaces as fail-fast error, never empty-string path.

_RESEARCH_SHARED=$(python "${CLAUDE_PLUGIN_ROOT:-plugins/research}/bin/resolve_shared.py" 2>/dev/null)  # timeout: 5000
[ -z "$_RESEARCH_SHARED" ] && { echo "! Plugin path resolution failed — ensure research plugin installed and CLAUDE_PLUGIN_ROOT set, or invoke /research:plan from project root."; exit 1; }

Read $_RESEARCH_SHARED/agent-resolution.md. Contains: foundry check + fallback table. Foundry not installed → substitute each foundry:X with general-purpose per table. Agents this skill uses: foundry:solution-architect, foundry:perf-optimizer.

Plan Mode (Steps P-P0–P-P4)

Triggered by plan <goal|file>. Wizard configures run.

Task tracking: create tasks for P-P0, P-P1, P-P2, P-P2b, P-P3 at start; add P-P4 only if --team detected in arguments.

Unsupported flag check: follow $_RESEARCH_SHARED/unsupported-flag-protocol.md. Supported flags for this skill: --team.

Step P-P0: Detect input type

Parse <input> from arguments. Determine: file path or goal string:

Extract first positional token (strip all --<flag> tokens from $ARGUMENTS, take first remaining token as FILE_ARG). Then:

Disambiguation guard — treat FILE_ARG as file path only if it exists on disk. Multi-token strings ($ARGUMENTS containing spaces beyond FILE_ARG) always goal text — never run test -f on first token of multi-token goal:

Quoting note: $ARGUMENTS raw string (not shell-tokenized). User-supplied quotes (e.g. plan "reduce training loss") appear as literal characters. Before token counting, strip surrounding matched quotes from $ARGUMENTS so quoted multi-word goals correctly recognised as multi-token:

_STRIPPED=$(echo "$ARGUMENTS" | sed -E 's/^"(.*)"$/\1/; s/^'\''(.*)'\''$/\1/')  # timeout: 5000
NONFLAG_TOKEN_COUNT=$(echo "$_STRIPPED" | tr ' ' '\n' | grep -v '^--' | grep -v '^$' | wc -l | tr -d ' ')  # timeout: 5000

NONFLAG_TOKEN_COUNT == 1 AND test -f "$FILE_ARG" succeeds → file path. FILE_ARG is script to profile. Enter profiling flow.
Otherwise (multi-token, or single token not on disk) → goal string. Use full $ARGUMENTS (minus flags) as <goal>. Skip to Step P-P1.

Profiling flow (file path detected):

Run baseline profiling using FILE_ARG only — never raw $ARGUMENTS in cProfile command.

Module-file guard: python -m cProfile requires executable script (has if __name__ == "__main__": guard or runs directly). Library/module files without entry point produce empty cProfile output. Pre-check:

grep -q '__main__' "$FILE_ARG" 2>/dev/null || { echo "⚠ File has no __main__ guard — cProfile will produce empty output. Falling back to goal-string path."; PROFILE_AVAILABLE=false; }

If PROFILE_AVAILABLE=false from above guard, skip the cProfile block below. Otherwise:

CPROFILE_OUT=$(mktemp -t research-plan-XXXX)  # timeout: 3000
python -m cProfile -s cumtime "$FILE_ARG" > "$CPROFILE_OUT" 2>&1  # timeout: 600000
PROFILE_EXIT=$?
if [ $PROFILE_EXIT -ne 0 ]; then
    echo "⚠ cProfile failed (exit $PROFILE_EXIT) — continuing without profile data. Wizard will prompt for goal string instead."
    PROFILE_AVAILABLE=false
else
    PROFILE_AVAILABLE=true
    head -40 "$CPROFILE_OUT"  # timeout: 5000
    time python "$FILE_ARG"   # timeout: 600000
fi

Fallback path — ONLY when PROFILE_AVAILABLE=false: skip bottleneck selection menu. Invoke AskUserQuestion with options: (a) Provide goal — enter optimization goal string directly (cProfile unavailable — note: cProfile requires self-contained runnable script with if __name__ == '__main__' guard; modules and test files not supported); (b) Abort — stop. Use user's response as <goal>. Proceed directly to P-P1. Skip profile-available path entirely — do not read or execute following block.

Profile-available path — ONLY when PROFILE_AVAILABLE=true (skip entirely if fallback path taken above), present top up to 5 bottleneck functions (skip rows with no data available):

Top bottleneck functions:
1. <function> — <cumtime>s (<percentage>%)
2. <function> — <cumtime>s (<percentage>%)
...

Invoke AskUserQuestion — "What would you like to optimize?", options: (a) Overall execution time · (b) Memory usage · (c) Specific function: <top function name> (currently <time>s) · (d) Custom goal (describe).

Construct goal string from selection:

(a) → "Reduce wall-clock execution time of <file>"
(b) → "Reduce peak memory usage of <file>"
(c) → "Optimize <function> in <file> (currently <time>s)"
(d) → user's text

Set as <goal>, proceed to P-P1.

Step P-P1: Parse and scan

Scope guard (first action): Before scanning, check <goal> is optimization goal. Input clearly not optimization goal (code question, regex/algo explanation, debug question, any prompt without measurable improvement target) → invoke AskUserQuestion:

question: "This input does not look like an optimization goal (/research:plan expects 'Reduce X' / 'Increase Y' / 'Improve Z metric'). How to proceed?"
(a) label: rephrase as optimization goal — description: provide revised goal with measurable improvement target
(b) label: abort — description: stop; use /research for explanatory questions

Stop if user selects (b). Never proceed to P-P2 or P-P3 without valid optimization goal.

Parse <goal>. Scan codebase to detect:

Language and framework (Python, PyTorch, pytest, etc.)
Available test runners or benchmark scripts
Candidate metric commands (pytest coverage, benchmark scripts, eval scripts)
Candidate guard commands (test suite, lint, type check)
Files relevant to goal (scope files)

Step P-P2: Present proposed config

Present config as code block for review. Include:

metric_cmd:      [command that prints a single numeric result]
metric_direction: higher | lower
guard_cmd:       [command that must pass (exit 0) on every kept commit]
max_iterations:  [default 20]
agent_strategy:  [auto | perf | code | ml | arch]
scope_files:     [files the ideation agent may modify]
compute:         local | colab | docker

Dry-run both commands before presenting (add # timeout: 60000 to timed bash calls — user commands may run minutes; ML pipeline data-loading steps may exceed 60s — increase timeout or use guard_cmd dry-run only when metric dry-run slow). Failure → flag error, propose corrections, then invoke AskUserQuestion — (a) I fixed the command — re-run dry-run · (b) Proceed anyway (I know this command is correct) · (c) Abort. Never proceed to P-P3 without user confirmation after failure.

Step P-P2b: Agent validation (pre-write)

After user confirms, run expert agent review before writing program.md. Dispatch conditional on goal type — run whichever apply in parallel.

Foundry availability check — before dispatching any foundry:* agent: run find ~/.claude/plugins/cache -path "*/foundry*" -name "solution-architect.md" 2>/dev/null | head -1. Result empty: skip architecture and perf reviews entirely; print ⚠ foundry plugin not installed — skipping foundry:solution-architect and foundry:perf-optimizer reviews. Continuing without architecture/perf advisory.; record gap in advisory block as architect: skipped (foundry absent). Proceed to P-P3 with available advisor output (scientist only if ML keywords matched).

Pre-spawn — create plan run dir (review files share single timestamped dir):

PLAN_RUN_DIR=$(python "${CLAUDE_PLUGIN_ROOT:-plugins/research}/bin/make_run_dir.py" "plan" ".experiments" 2>/dev/null)  # timeout: 5000
[ -z "$PLAN_RUN_DIR" ] && { echo "! make_run_dir.py returned empty — research plugin path resolution failed"; exit 1; }

Synchronous spawn note: P-P2 advisors (architect, scientist, perf) spawned synchronously (not run_in_background=true), so CLAUDE.md §6 per-agent sentinel polling unreachable mid-call. Timeout handled post-hoc — after Agent() calls return, check each advisor's review file under $PLAN_RUN_DIR (plan-review-architect.md, plan-review-scientist.md, plan-review-perf.md). Any file missing or empty = that advisor timed out: surface with ⏱ and continue to P-P3 with remaining advisor output.

Architect gate — spawn foundry:solution-architect only when scope_files contains >1 file OR agent_strategy = arch. Single-file optimization goals skip architect (no architectural surface to validate; saves ~5–10 min opus-tier compute). Record skip reason in advisory block as architect: skipped (single-file scope).

When gate fires, before constructing the Agent() call, substitute the actual computed value of $PLAN_RUN_DIR into the prompt string (e.g. .experiments/plan-2026-05-13T10-00-00Z):

Agent(subagent_type="foundry:solution-architect", prompt="Review a proposed research experiment scope.\n\nGoal: <goal>\nScope files (newline-separated paths in a markdown code block):\n```\n<scope_files — one path per line>\n```\nMetric command: <metric_cmd>\n\nCheck: (1) Do scope_files cover the components relevant to the goal? List architectural dependencies outside scope that the ideation agent would need to touch. (2) Are there shared abstractions (base classes, imports, shared state) outside scope required for changes within it?\n\nWrite your full review to `<PLAN_RUN_DIR>/plan-review-architect.md` using the Write tool.\nReturn ONLY: {\"ok\":true|false,\"gaps\":[\"...\"],\"suggestions\":[\"...\"],\"file\":\"<PLAN_RUN_DIR>/plan-review-architect.md\",\"confidence\":0.N}")

If agent_strategy = ml or goal contains ML keywords (accuracy, loss, model, training, inference, classification, regression) — also spawn research:scientist. Substitute computed $PLAN_RUN_DIR before spawning:

Agent(subagent_type="research:scientist", prompt="Review a proposed ML experiment configuration.\n\nGoal: <goal>\nMetric command: <metric_cmd>\nAgent strategy: <agent_strategy>\n\nCheck: (1) Is the goal a well-formed ML hypothesis — falsifiable, with a concrete success criterion? (2) Could metric_cmd improve while the real goal is not achieved (Goodhart's Law)? (3) Is agent_strategy appropriate for this goal type?\n\nWrite your full review to `<PLAN_RUN_DIR>/plan-review-scientist.md` using the Write tool.\nReturn ONLY: {\"ok\":true|false,\"issues\":[\"...\"],\"suggestions\":[\"...\"],\"file\":\"<PLAN_RUN_DIR>/plan-review-scientist.md\",\"confidence\":0.N}")

If agent_strategy = perf or goal contains performance keywords (latency, throughput, wall-clock, speed, memory, FPS) — also spawn perf. Substitute computed $PLAN_RUN_DIR before spawning:

Agent(subagent_type="foundry:perf-optimizer", prompt="Review a proposed performance experiment configuration.\n\nGoal: <goal>\nMetric command: <metric_cmd>\nGuard command: <guard_cmd>\n\nCheck: (1) Does metric_cmd measure the right performance characteristic for this goal? (2) Is guard_cmd comprehensive enough to catch regressions an ideation agent might introduce?\n\nWrite your full review to `<PLAN_RUN_DIR>/plan-review-perf.md` using the Write tool.\nReturn ONLY: {\"ok\":true|false,\"issues\":[\"...\"],\"suggestions\":[\"...\"],\"file\":\"<PLAN_RUN_DIR>/plan-review-perf.md\",\"confidence\":0.N}")

Print advisory block below config:

Advisory review:
  architect: <gaps or "scope looks complete">
  scientist:  <issues or "hypothesis is well-formed">   [only if dispatched]
  perf:       <issues or "metric/guard look valid">      [only if dispatched]

Pre-check output path before presenting advisor results: resolve output path (second argument after <goal> if provided, else program.md at project root). Check if file exists: test -f <output_path> && echo "EXISTS". Record result as OUTPUT_EXISTS.

Any agent returns ok: false → surface suggestions, then invoke AskUserQuestion combining advisor feedback and (if OUTPUT_EXISTS) overwrite decision in one call:

question: "Advisor flagged issues (listed above). How to proceed?"
(a) Revise config → re-present P-P2 config block (re-enter P-P2; max 3 re-entries before forcing proceed-or-abort)
(b) Proceed with current config — if OUTPUT_EXISTS: warn "will overwrite <output_path>"; if not: proceed silently
(c) Abort — stop

All advisors return ok: true AND OUTPUT_EXISTS: invoke AskUserQuestion — (a) Overwrite <output_path> — proceed; (b) Abort — stop. All advisors return ok: true AND NOT OUTPUT_EXISTS: proceed directly to writing — no AskUserQuestion needed.

Step P-P3: Write program.md

Output path: resolved above in P-P2b pre-check.

Write file using canonical template, pre-populated from wizard findings:

# Program: <title from goal>

## Goal
<one-paragraph description of what to improve and why>

## Metric
```yaml
command: <metric_cmd from wizard>
direction: higher | lower
target: <optional numeric goal — campaign stops when crossed>
```

## Guard
```yaml
command: <guard_cmd from wizard>
```

## Config
```yaml
max_iterations: 20
agent_strategy: auto | perf | code | ml | arch
scope_files:
  - <path or glob>
compute: local | colab | docker
colab_hw: # optional: H100 | L4 | T4 | A100 (used when compute: colab)
sandbox_network: none | bridge  # ⚠ not validated by judge.md C-checks — manually verify before running
```

## Notes
<optional free-form text — strategy hints, context, known constraints — ignored by the skill>

Print:

✓ Program saved to <OUTPUT_PATH>

Next steps:
  /research:judge <OUTPUT_PATH>   ← validate plan before running (recommended)
  /research:run <OUTPUT_PATH>     ← start iteration loop directly

Step P-P4: --team flag

--team detected in $ARGUMENTS:

Precondition — P-P3 must have completed successfully (file written, no overwrite-abort). P-P3 aborted (user chose Abort at overwrite check, or any prior P-P step aborted): mark P-P4 task deleted, do NOT execute steps 2–4 below, and do NOT append to any pre-existing file. P-P4 owns only the append; P-P3 owns file existence and base template.

Complete Steps P-P0–P-P3 as normal — produce program.md with full single-researcher structure.
Append ## Team Mode Notes section to the program.md just written by P-P3 (never to a pre-existing file untouched by P-P3):
- Number of distinct method families found (determines team size at run step)
- Whether SOTA consensus exists — if clear winner, note team mode may not add value
Tell user: "--team applies at run step, not plan step. Run: /research:run <program.md> --team to execute with parallel researchers."

Resolve run-modes dir, read team protocol — include one-line summary in Team Mode Notes:

_RESEARCH_RUN_MODES=$(ls -td ~/.claude/plugins/cache/borda-ai-rig/research/*/skills/run/modes 2>/dev/null | head -1)
[ -d "$_RESEARCH_RUN_MODES" ] || _RESEARCH_RUN_MODES="$(git rev-parse --show-toplevel 2>/dev/null)/plugins/research/skills/run/modes"
[ -f "$_RESEARCH_RUN_MODES/team.md" ] || { echo "⚠ team.md not found at $_RESEARCH_RUN_MODES"; }

Read $_RESEARCH_RUN_MODES/team.md.

Scope boundary: plan writes program.md only — methodology validation = /research:judge; execution = /research:run; full pipeline = /research:sweep.
--team note: --team applies at run step, not plan step. Plan produces standard program.md; pass flag when invoking /research:run <program.md> --team.
TTL exemption: plan run dirs (.experiments/plan-<timestamp>/) don't write result.jsonl — exempt from 30-day TTL cleanup per .claude/rules/artifact-lifecycle.md (installed via /foundry:setup — requires foundry plugin); remove manually when no longer needed.

plan

Popularity

Invocation

Tool Access

Context Preview

SKILL.md

plan

Popularity

Invocation

Tool Access

Context Preview

SKILL.md

Agent Resolution

Plan Mode (Steps P-P0–P-P4)

Step P-P0: Detect input type

Step P-P1: Parse and scan

Step P-P2: Present proposed config

Step P-P2b: Agent validation (pre-write)

Step P-P3: Write program.md

Step P-P4: --team flag

Similar Skills

Agent Resolution

Plan Mode (Steps P-P0–P-P4)

Step P-P0: Detect input type

Step P-P1: Parse and scan

Step P-P2: Present proposed config

Step P-P2b: Agent validation (pre-write)

Step P-P3: Write program.md

Step P-P4: --team flag

Similar Skills