Deploy and run ML experiments on local or remote GPU servers. Use when user says "run experiment", "deploy to server", "跑实验", or needs to launch training jobs.
Deploy and run ML experiment: $ARGUMENTS
## 1. Determine the experiment environment

Read the project's CLAUDE.md for the experiment environment (server, conda setup, code directory). If no server info is found in CLAUDE.md, ask the user.
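If CLAUDE.md follows the layout shown at the end of this document, one quick way to surface the relevant block (the section name below is an assumption based on that sample):

```bash
# Print the "Remote Server" section (heading plus the next 5 lines) from CLAUDE.md
grep -A 5 '^## Remote Server' CLAUDE.md
```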
## 2. Check GPU availability on the target machine

Remote:

```bash
ssh <server> nvidia-smi --query-gpu=index,memory.used,memory.total --format=csv,noheader
```

Local:

```bash
nvidia-smi --query-gpu=index,memory.used,memory.total --format=csv,noheader
# or for Mac MPS:
python -c "import torch; print('MPS available:', torch.backends.mps.is_available())"
```

A GPU counts as free when `memory.used` is below 500 MiB.
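A minimal sketch for picking a free GPU automatically, using the 500 MiB heuristic above (assumes `nvidia-smi` is on PATH):

```bash
# Pick the first GPU whose used memory is below 500 MiB; prints "none" if all are busy
free_gpu=$(nvidia-smi --query-gpu=index,memory.used --format=csv,noheader,nounits \
  | awk -F', ' '$2 < 500 {print $1; exit}')
echo "Free GPU: ${free_gpu:-none}"
```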
## 3. Sync code to the server

Only sync necessary files (NOT data, checkpoints, or large files). Note that `--exclude='*'` alone also excludes directories, so rsync never recurses; add `--include='*/'` so the `*.py` filter applies throughout the tree:

```bash
rsync -avz --include='*/' --include='*.py' --exclude='*' --prune-empty-dirs <local_src>/ <server>:<remote_dst>/
```
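Before the real transfer, a dry run shows exactly what would be copied (paths here are illustrative):

```bash
# -n (--dry-run) lists the files rsync would transfer without copying anything
rsync -avzn --include='*/' --include='*.py' --exclude='*' --prune-empty-dirs \
  ~/projects/myexp/ my-gpu-server:/home/user/experiments/myexp/
```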
## 4. Launch the experiment

Remote: for each experiment, create a dedicated screen session with GPU binding:

```bash
ssh <server> "screen -dmS <exp_name> bash -c '\
eval \"\$(<conda_path>/conda shell.bash hook)\" && \
conda activate <env> && \
CUDA_VISIBLE_DEVICES=<gpu_id> python <script> <args> 2>&1 | tee <log_file>'"
```
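Once launched, output can be watched without disturbing the job (placeholders match the command above):

```bash
# Follow the training log as it is written; Ctrl-C stops tailing, not the job
ssh <server> "tail -f <log_file>"

# Or reattach to the screen session (detach again with Ctrl-A D)
ssh -t <server> "screen -r <exp_name>"
```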
Local:

```bash
# Linux with CUDA
CUDA_VISIBLE_DEVICES=<gpu_id> python <script> <args> 2>&1 | tee <log_file>

# Mac with MPS (no CUDA_VISIBLE_DEVICES needed; the script must select the mps device)
python <script> <args> 2>&1 | tee <log_file>
```
For local long-running jobs, use `run_in_background: true` to keep the conversation responsive.
## 5. Verify the job is running

Remote:

```bash
ssh <server> "screen -ls"
```

Local: check that the process is running and a GPU is allocated.
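A quick local check (the `<script>` placeholder matches the launch commands above):

```bash
# Show matching python processes with their full command lines
pgrep -af "python <script>"

# Show which PIDs currently hold GPU memory
nvidia-smi --query-compute-apps=pid,process_name,used_memory --format=csv
```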
## Tips

- `tee` to save logs for later inspection
- `run_in_background: true` to keep the conversation responsive

## Server configuration

Users should add their server info to their project's CLAUDE.md:
```markdown
## Remote Server
- SSH: `ssh my-gpu-server`
- GPU: 4x A100 (80GB each)
- Conda: `eval "$(/opt/conda/bin/conda shell.bash hook)" && conda activate research`
- Code dir: `/home/user/experiments/`

## Local Environment
- Mac MPS / Linux CUDA
- Conda env: `ml` (Python 3.10 + PyTorch)
```
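Putting it together with the sample configuration above (the experiment name, script, and arguments are illustrative):

```bash
# Launch a run named "baseline" on GPU 0 of my-gpu-server, logging next to the code
ssh my-gpu-server "screen -dmS baseline bash -c '\
eval \"\$(/opt/conda/bin/conda shell.bash hook)\" && \
conda activate research && \
CUDA_VISIBLE_DEVICES=0 python train.py --epochs 10 2>&1 | tee /home/user/experiments/baseline.log'"
```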