Generates parameter sweep configurations, manages batch simulation campaigns, monitors job completion, and aggregates results. Use for systematic parameter studies or multi-run studies.
Provide tools to manage multi-simulation campaigns: generate parameter sweeps, track job execution status, and aggregate results from completed runs.
Before running orchestration scripts, collect from the user:
| Input | Description | Example |
|---|---|---|
| Base config | Template simulation configuration | base_config.json |
| Parameter ranges | Parameters to sweep with bounds | dt:[1e-4,1e-2],kappa:[0.1,1.0] |
| Sweep method | How to sample parameter space | grid, lhs, linspace |
| Output directory | Where to store campaign files | ./campaign_001 |
| Simulation command | Command to run each simulation | python sim.py --config {config} |
Need every combination (full factorial)?
├── YES → Use grid (warning: exponential growth with parameters)
└── NO → Is space-filling coverage needed?
    ├── YES → Use lhs (Latin Hypercube Sampling)
    └── NO → Use linspace for uniform sampling per parameter
| Method | Best For | Sample Count |
|---|---|---|
| grid | Low dimensions (1-3), need exact corners | n^d (exponential) |
| linspace | 1D sweeps, uniform spacing | n per parameter |
| lhs | High dimensions, space-filling | user-specified budget |
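
To make the space-filling property of lhs concrete, here is a minimal pure-Python sketch of Latin Hypercube Sampling. The function name `latin_hypercube`, its arguments, and the linear scaling of each range are assumptions made for illustration; the implementation inside sweep_generator.py is not shown in this document and may differ.

```python
import random


def latin_hypercube(bounds, n_samples, seed=0):
    """Return n_samples points as dicts (hypothetical helper).

    Each parameter's range is cut into n_samples equal strata and every
    stratum is used exactly once, which is what makes the design space-filling.
    """
    rng = random.Random(seed)  # fixed seed -> reproducible sample set
    columns = {}
    for name, (lo, hi) in bounds.items():
        # One uniformly random point inside each stratum [i/n, (i+1)/n).
        unit = [(i + rng.random()) / n_samples for i in range(n_samples)]
        # Shuffle so the stratum order is independent between parameters.
        rng.shuffle(unit)
        columns[name] = [lo + u * (hi - lo) for u in unit]
    return [{name: columns[name][i] for name in bounds} for i in range(n_samples)]


samples = latin_hypercube({"dt": (1e-4, 1e-2), "kappa": (0.1, 1.0)}, n_samples=5, seed=42)
for point in samples:
    print(point)
```

With a budget of 5 samples, each of the five kappa strata contains exactly one point, regardless of how many parameters are swept. That is why the run count stays at the chosen budget instead of growing as n^d.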
| Parameters | Grid Points Each | Total Runs | Recommendation |
|---|---|---|---|
| 1 | 10 | 10 | Grid is fine |
| 2 | 10 | 100 | Grid acceptable |
| 3 | 10 | 1,000 | Consider LHS |
| 4+ | 10 | 10,000+ | Use LHS or DOE |
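
The run counts in the table follow directly from multiplying the number of grid points per parameter. The sketch below reproduces that arithmetic; the helper names and the 100-run threshold are hypothetical choices for illustration and are not part of the skill's scripts.

```python
import math


def grid_run_count(points_per_param):
    """Total runs of a full-factorial grid: the product of the point counts."""
    return math.prod(points_per_param)


def recommend_method(points_per_param, max_grid_runs=100):
    """Suggest 'grid' while the full factorial fits the budget, else 'lhs'.

    max_grid_runs is an assumed threshold, chosen to match the table above.
    """
    total = grid_run_count(points_per_param)
    method = "grid" if total <= max_grid_runs else "lhs"
    return method, total


for n_params in range(1, 5):
    print(n_params, recommend_method([10] * n_params))
```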
| Script | Output Fields |
|---|---|
| scripts/sweep_generator.py | configs, parameter_space, sweep_method, total_runs |
| scripts/campaign_manager.py --action init | campaign_id, total_jobs, config_dir, command_template |
| scripts/campaign_manager.py --action status | campaign_id, status, jobs, progress, total_jobs, created_at |
| scripts/campaign_manager.py --action list | jobs (array of job records) |
| scripts/job_tracker.py | job_id, status, start_time, end_time, exit_code |
| scripts/result_aggregator.py | summary (incl. minimize), statistics, best_run, failed_runs |
Note on swept parameter names:
sweep_generator.py writes each swept value into the base config by key path. A bare name (e.g. kappa) overwrites a top-level key; a dot-notation name (e.g. parameters.kappa) targets a nested key. The swept key path must match where the solver reads the value: sweeping kappa against a config that nests parameters.kappa would add an unused top-level key and silently leave the base value in place. See references/sweep_strategies.md.
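
The difference between a bare name and a dotted path is easiest to see on a small config. The following is a sketch of key-path writing under the behavior described above; `set_by_path` is a hypothetical helper, not the function used by sweep_generator.py.

```python
import copy


def set_by_path(config, key_path, value):
    """Return a copy of config with value written at a dot-separated key path."""
    result = copy.deepcopy(config)  # leave the base config untouched
    node = result
    *parents, leaf = key_path.split(".")
    for key in parents:
        # Descend into nested dicts, creating them if they do not exist yet.
        node = node.setdefault(key, {})
    node[leaf] = value
    return result


base = {"parameters": {"kappa": 0.5}, "dt": 1e-3}

wrong = set_by_path(base, "kappa", 0.1)             # bare name: new top-level key
right = set_by_path(base, "parameters.kappa", 0.1)  # dotted path: nested key

print(wrong)  # nested kappa is still 0.5, plus an unused top-level kappa
print(right)  # nested kappa is now 0.1
```

In the `wrong` case a solver that reads parameters.kappa would run every configuration with the base value 0.5, so the sweep would not actually vary anything.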
Create configurations for all parameter combinations:
python3 scripts/sweep_generator.py \
--base-config base_config.json \
--params "dt:1e-4:1e-2:5,kappa:0.1:1.0:3" \
--method linspace \
--output-dir ./campaign_001 \
--json
Create campaign tracking structure:
python3 scripts/campaign_manager.py \
--action init \
--config-dir ./campaign_001 \
--command "python sim.py --config {config}" \
--json
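
The `{config}` placeholder in the command template is replaced with each generated config path. As a rough sketch of how such a template can be expanded without going through a shell, consider the following; `build_argv` is a hypothetical helper, and the rejected characters mirror the shell chaining operators this skill's security notes say campaign_manager.py refuses.

```python
import shlex

# Characters treated as shell chaining/expansion operators (per the security notes).
FORBIDDEN = set(";|&`$")


def build_argv(command_template, config_path):
    """Expand {config} and return an argument list suitable for subprocess.run."""
    if FORBIDDEN & set(command_template):
        raise ValueError("command template contains shell operators")
    if "{config}" not in command_template:
        raise ValueError("command template must contain {config}")
    # Quote the path so that spaces survive the split below.
    expanded = command_template.replace("{config}", shlex.quote(config_path))
    return shlex.split(expanded)


argv = build_argv("python sim.py --config {config}", "./campaign_001/config_0000.json")
print(argv)
```

Passing the resulting list to `subprocess.run(argv)` (without `shell=True`) keeps the config path a single argument even when it contains spaces.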
Monitor running jobs:
python3 scripts/job_tracker.py \
--campaign-dir ./campaign_001 \
--update \
--json
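
Because the tracker reports completion from the presence of a result file, it can help to check independently that a result contains only finite numbers. The sketch below is one way to do that; `non_finite_fields` is a hypothetical helper, and the field names in the sample result are illustrative rather than taken from a real solver.

```python
import json
import math


def non_finite_fields(result_text):
    """Return the paths of all NaN/Inf values found in a JSON result document."""
    bad = []

    def walk(node, path):
        if isinstance(node, dict):
            for key, value in node.items():
                walk(value, f"{path}.{key}" if path else key)
        elif isinstance(node, list):
            for index, value in enumerate(node):
                walk(value, f"{path}[{index}]")
        elif isinstance(node, float) and not math.isfinite(node):
            bad.append(path)

    # Python's json module accepts NaN and Infinity literals by default.
    walk(json.loads(result_text), "")
    return bad


sample_result = '{"final_energy": NaN, "history": [1.0, Infinity], "steps": 100}'
print(non_finite_fields(sample_result))
```

A non-empty list is a hint that the run diverged or crashed part-way, even if its job record says completed.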
Combine results from completed runs:
python3 scripts/result_aggregator.py \
--campaign-dir ./campaign_001 \
--metric final_energy \
--json
result_aggregator.py minimizes by default: best_run is the run with the
lowest metric value (and summary.minimize is true). If higher is better
(e.g. yield, accuracy, throughput), pass --maximize so best_run becomes the
highest value:
# Higher is better -> select the maximum
python3 scripts/result_aggregator.py \
--campaign-dir ./campaign_001 \
--metric yield \
--maximize \
--json
Decision guidance: If higher is better (yield, accuracy, throughput), pass
--maximize; otherwise the reported best_run is the minimum.
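
The effect of the direction flag can be illustrated with a few lines of Python. This is a sketch only: `select_best_run` and the job-to-value mapping are assumptions for illustration, not the aggregator's actual code, although the filtering of bool, NaN, and Inf values follows the type checks described in this skill's security notes.

```python
import math


def select_best_run(values_by_job, minimize=True):
    """Pick the best job from a {job_id: metric_value} mapping."""
    usable = {
        job: value
        for job, value in values_by_job.items()
        # Reject bools (a subclass of int) and non-finite numbers.
        if isinstance(value, (int, float))
        and not isinstance(value, bool)
        and math.isfinite(value)
    }
    if not usable:
        return None
    pick = min if minimize else max
    job = pick(usable, key=usable.get)
    return {"job_id": job, "value": usable[job]}


yields = {"job_0000": 0.62, "job_0001": 0.91, "job_0002": float("nan")}
print(select_best_run(yields))                  # default: lowest value
print(select_best_run(yields, minimize=False))  # like --maximize: highest value
```

For a metric such as yield, the default direction returns job_0000, the worst run; only the maximizing call returns job_0001.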
# Generate 5x3=15 runs varying dt (5 values) and kappa (3 values)
python3 scripts/sweep_generator.py \
--base-config sim.json \
--params "dt:1e-4:1e-2:5,kappa:0.1:1.0:3" \
--method linspace \
--output-dir ./sweep_001 \
--json
# Generate LHS samples for 4 parameters with budget of 20 runs
python3 scripts/sweep_generator.py \
--base-config sim.json \
--params "dt:1e-4:1e-2,kappa:0.1:1.0,M:1e-6:1e-4,W:0.5:2.0" \
--method lhs \
--samples 20 \
--output-dir ./lhs_001 \
--json
# Check campaign status
python3 scripts/campaign_manager.py \
--action status \
--config-dir ./sweep_001 \
--json
# List jobs (read-only), optionally filtered by status
python3 scripts/campaign_manager.py \
--action list \
--config-dir ./sweep_001 \
--status-filter failed \
--json
# Get summary statistics from completed runs (minimize: best = lowest)
python3 scripts/result_aggregator.py \
--campaign-dir ./sweep_001 \
--metric final_energy \
--json
# Maximization metric: best = highest value (yield, accuracy, throughput)
python3 scripts/result_aggregator.py \
--campaign-dir ./sweep_001 \
--metric yield \
--maximize \
--json
User: I want to run a parameter sweep on dt and kappa for my phase-field simulation. I want to try 5 values of dt between 1e-4 and 1e-2, and 4 values of kappa between 0.1 and 1.0.
Agent workflow:
# 1. Generate the 5 x 4 = 20 configurations
python3 scripts/sweep_generator.py \
--base-config simulation.json \
--params "dt:1e-4:1e-2:5,kappa:0.1:1.0:4" \
--method linspace \
--output-dir ./dt_kappa_sweep \
--json
# 2. Initialize campaign tracking
python3 scripts/campaign_manager.py \
--action init \
--config-dir ./dt_kappa_sweep \
--command "python phase_field.py --config {config}" \
--json
# 3. After the simulations finish, aggregate results (default: best = lowest interface_width)
python3 scripts/result_aggregator.py \
--campaign-dir ./dt_kappa_sweep \
--metric interface_width \
--json
| Error | Cause | Resolution |
|---|---|---|
| Base config not found | Invalid file path | Verify base config file exists |
| Invalid parameter format | Malformed param string | Use format name:min:max:count or name:min:max |
| Output directory exists | Would overwrite | Use --force or choose new directory |
| No completed jobs | No results to aggregate | Wait for jobs to complete or check for failures |
| Metric not found | Result files missing field | Verify metric name in result JSON |
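
The "Invalid parameter format" error is easier to avoid with the accepted shapes in view. Below is a minimal sketch of parsing and validating a --params string in the name:min:max:count or name:min:max form. `parse_params` is a hypothetical helper, and the name pattern (with an escaped dot between segments) is an assumed reading of the dot-notation rule, not the script's actual source.

```python
import math
import re

# Assumed pattern: identifier segments joined by literal dots.
NAME_RE = re.compile(r"[a-zA-Z_][a-zA-Z0-9_]*(\.[a-zA-Z_][a-zA-Z0-9_]*)*")


def parse_params(spec):
    """Parse 'name:min:max[:count],...' into a list of dicts."""
    params = []
    for item in spec.split(","):
        fields = item.strip().split(":")
        if len(fields) not in (3, 4):
            raise ValueError(f"expected name:min:max[:count], got {item!r}")
        name = fields[0]
        lo, hi = float(fields[1]), float(fields[2])  # non-numeric -> ValueError
        if not NAME_RE.fullmatch(name):
            raise ValueError(f"invalid parameter name {name!r}")
        if not (math.isfinite(lo) and math.isfinite(hi) and lo < hi):
            raise ValueError(f"bounds must be finite with min < max in {item!r}")
        # count is optional: it is omitted when the budget comes from --samples.
        count = int(fields[3]) if len(fields) == 4 else None
        if count is not None and count < 1:
            raise ValueError(f"count must be a positive integer in {item!r}")
        params.append({"name": name, "min": lo, "max": hi, "count": count})
    return params


params = parse_params("dt:1e-4:1e-2:5,kappa:0.1:1.0:3")
print(params)
```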
The simulation-orchestrator works with other simulation-workflow skills:
parameter-optimization          simulation-orchestrator
        │                               │
        │  DOE samples ────────────────>│  Generate configs
        │                               │
        │                               │  Run simulations
        │                               │
        │<──────────────────────────────│  Aggregate results
        │                               │
        │  Sensitivity analysis         │
        │  Optimizer selection          │
1. parameter-optimization/doe_generator.py to get sample points
2. simulation-orchestrator/sweep_generator.py to create configs
3. simulation-orchestrator/result_aggregator.py to collect results
4. parameter-optimization/sensitivity_summary.py to analyze

Before trusting a campaign's best_run or summary statistics, record concrete evidence for each item:
- Inspected a generated config_NNNN.json and verified the swept parameter (e.g. parameters.kappa) holds the expected value at the expected nesting level, not a duplicate unused top-level key (sweep_generator.py writes by key path).
- Ran result_aggregator.py --json: recorded summary.total_jobs, summary.completed, and summary.failed, and confirmed completed + failed == total_jobs. Any shortfall means runs were silently skipped (missing result file or extract_metric returned None) and must be investigated, not ignored.
- Confirmed completed > 0 and that the recorded summary.metric matches the field the solver actually writes. A typo'd or absent metric makes extract_metric return None, yielding zero completed runs with no error.
- Recorded summary.minimize and confirmed it matches the intended direction (default minimize; --maximize for yield/accuracy/throughput) before quoting best_run.
- Did not treat job_tracker.py "completed" as physical success: it flags a job completed purely from a result file's existence and stamps exit_code 0. Independently checked the run's real exit status / solver logs for non-zero codes or NaN/Inf output.
- Confirmed best_run.value is physically plausible, not a crashed run that emitted a spurious extremum (see references/aggregation_methods.md).
- Recorded the --seed used and saved manifest.json (parameter bounds, total_runs, parameter_space) so the sample set is reproducible.

| Tempting shortcut | Why it's wrong / what to do |
|---|---|
| "The job tracker says completed, so the run succeeded." | job_tracker.py marks "completed" whenever a result file exists and hard-codes exit_code 0; it never reads the actual exit code. A crashed run that wrote a partial result file looks identical to a clean one. Check the solver's real exit status and output validity. |
| "completed is high, so I have all my results." | Jobs with a missing result file or a metric that extract_metric can't read are silently skipped and counted as neither completed nor failed. Reconcile completed + failed against total_jobs; a gap means lost runs. |
| "Aggregation returned a best_run, so that's the optimum." | By default the aggregator minimizes. If higher is better you must pass --maximize, or best_run is the worst point. Always record summary.minimize and confirm the direction. |
| "I swept kappa, so the runs vary." | sweep_generator.py writes by key path. If the base config nests the value under parameters.kappa but you sweep the bare name kappa, every config keeps the original nested value and gains an unused top-level key; the sweep is scientifically meaningless. Sweep the exact dotted path the solver reads. |
| "The metric name is close enough." | A misspelled or absent metric makes extract_metric return None for every run, so completed is 0 and statistics are empty — with no error raised. Verify the metric matches the solver's output field exactly. |
| "Grid covers everything, so use it for all my parameters." | Grid is n^d, so it explodes exponentially (4 parameters at 10 points each give 10^4 = 10,000 runs). For 4+ dimensions use lhs with a deliberate budget; reserve grid for 1-3 parameters. |
| "LHS is random, so I don't need to record anything." | LHS is reproducible only with a fixed --seed. Without recording the seed (and manifest.json), the sample set cannot be regenerated or defended. |
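
Several of the checks above reduce to simple arithmetic on the aggregator's summary object. The sketch below shows one way to automate them. `reconcile` is a hypothetical helper; the field names (total_jobs, completed, failed, minimize) are those named in the checklist, and the example numbers are made up for illustration.

```python
def reconcile(summary, intended_minimize):
    """Return a list of problems found in an aggregator summary dict."""
    problems = []
    gap = summary["total_jobs"] - (summary["completed"] + summary["failed"])
    if gap != 0:
        # Runs that are neither completed nor failed were silently skipped.
        problems.append(f"{gap} run(s) neither completed nor failed")
    if summary["completed"] == 0:
        problems.append("zero completed runs; check the metric name")
    if summary["minimize"] != intended_minimize:
        problems.append("optimization direction does not match intent")
    return problems


# Illustrative summary: 15 jobs, 2 unaccounted for, minimizing a "higher is better" metric.
example = {"total_jobs": 15, "completed": 12, "failed": 1, "minimize": True}
for problem in reconcile(example, intended_minimize=False):
    print("WARNING:", problem)
```

An empty list does not prove the campaign is sound, but a non-empty one is a reason not to quote best_run yet.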
- Metric names (result_aggregator.py --metric) are validated against [a-zA-Z_][a-zA-Z0-9_.]* to prevent traversal or injection via crafted keys
- Parameter names (sweep_generator.py --params) are validated against [a-zA-Z_][a-zA-Z0-9_]*(.[a-zA-Z_][a-zA-Z0-9_]*)* (dot notation for nested keys); invalid names are rejected
- campaign_manager.py validates command templates to reject shell chaining operators (;, |, &, backticks, $)
- --params format strings are parsed and validated (name:min:max:count with finite numeric bounds, NaN/Inf rejected, min < max, and positive integer counts capped at 100,000); at most 32 parameters per sweep
- --method is validated against a fixed allowlist (grid, linspace, lhs)
- --samples is validated as a positive integer with an upper bound (max 1,000,000)
- --action is validated against a fixed allowlist (init, status, list); for the read-only list action, --status-filter is validated against pending, running, completed, failed
- sweep_generator.py reads a single base config file (JSON) specified by --base-config and writes generated configs to --output-dir
- result_aggregator.py enforces a 10 MB file-size limit per result file, maximum JSON nesting depth, and strict numeric type checking (rejects bool, NaN, Inf)
- shlex.quote()
- allowed-tools excludes Bash to prevent the agent from executing arbitrary commands when processing untrusted simulation outputs
- No eval(), exec(), or dynamic code generation
- shell=True)

- references/campaign_patterns.md - Common campaign structures
- references/sweep_strategies.md - Parameter sweep design guidance
- references/aggregation_methods.md - Result aggregation techniques

See CHANGELOG.md for the authoritative, dated history. Summary:
- sweep_generator.py, input-validation hardening (--params name/finite/count caps, --samples bounds), documented --maximize and the list action, corrected Script Outputs table and worked-example numbers