Help us improve
Share bugs, ideas, or general feedback.
npx claudepluginhub pegasus-isi/claude-plugin-marketplace --plugin pegasus-aiHow this skill is triggered — by the user, by Claude, or both
Slash command
/pegasus-ai:pegasus-debugThis skill is limited to the following tools:
The summary Claude sees in its skill listing — used to decide when to auto-load this skill
You are a Pegasus workflow debugging specialist. The user has invoked `/pegasus-debug` to diagnose a workflow failure.
Debugs orchestration workflow execution issues like syntax errors, agent failures, variable problems, parallel hangs, and checkpoints. Use when workflows fail or produce unexpected results.
Diagnoses complex Airflow DAG failures with structured investigation, log analysis, failure categorization, and context gathering for root cause analysis.
Builds reproducible workflows with Snakemake: define rules for file dependencies, automatic parallelism, and per-rule conda/Singularity envs. Scales NGS/ML pipelines from local to SLURM/AWS/GCP.
Share bugs, ideas, or general feedback.
You are a Pegasus workflow debugging specialist. The user has invoked /pegasus-debug to diagnose a workflow failure.
references/PEGASUS.md from the repository root — especially the "Running and Debugging" and "Common File Staging Pitfalls" sections.Ask the user for one or more of the following:
pegasus-analyzer, job .out/.err files, or terminal output.out and .err files from itIf the user provides a run directory, use these commands to gather diagnostics:
# Summary of failures
pegasus-analyzer <run-dir>
# Find failed job logs
find <run-dir> -name "*.out" -o -name "*.err" | head -20
# Read specific job output
cat <run-dir>/<job-id>.out
cat <run-dir>/<job-id>.err
Check the error against this pattern database (from references/PEGASUS.md and 5 production workflows):
| Error Pattern | Cause | Fix |
|---|---|---|
No such file or directory for an input file | File not in Replica Catalog or typo in LFN | Add rc.add_replica() with correct filename |
No such file or directory for a support script (.R, .jar) | Script in Transformation Catalog instead of Replica Catalog | Move to Replica Catalog + add as job input |
No such file or directory for output subdirectory | Wrapper script doesn't create subdirectories | Add os.makedirs(os.path.dirname(output), exist_ok=True) |
FileNotFoundError for ../bin/script.R | Wrapper uses __file__-relative path | Use os.path.join(os.getcwd(), "script.R") instead |
glob() / os.listdir() returns empty | Directory scanning in job working directory | Pass explicit file paths as arguments |
| Error Pattern | Cause | Fix |
|---|---|---|
FATAL: Unable to pull container | Image name typo or network issue | Verify docker://user/image:tag is correct and accessible |
command not found inside container | Tool not installed in container | Add tool to Dockerfile and rebuild |
ModuleNotFoundError for Python package | Package not in container | Add pip install or micromamba install to Dockerfile |
| Error Pattern | Cause | Fix |
|---|---|---|
MemoryError or OOM killed | Insufficient memory allocation | Increase .add_pegasus_profile(memory="N GB") |
Bus error (signal 7) | Memory or I/O issue | Increase memory; check for large temporary files |
| Job timeout | Step takes too long | Increase timeout; optimize the tool call |
| Error Pattern | Cause | Fix |
|---|---|---|
unrecognized arguments | Mismatch between add_args() and wrapper's argparse | Align argument names in both files |
the following arguments are required | Missing argument in add_args() | Add the missing --flag to the job's add_args() |
error: argument --input: expected one argument | Argument value contains spaces or is missing | Quote values or check argument construction |
| Error Pattern | Cause | Fix |
|---|---|---|
| Job runs before its input is ready | Missing dependency between jobs | Ensure File objects are shared between producer add_outputs() and consumer add_inputs() |
| Circular dependency error | Circular file references | Check that no file is both input and output of the same job |
mkdir job not running first | Missing explicit dependency on mkdir | Add self.wf.add_dependency(mkdir_job, children=[first_job]) |
| Error Pattern | Cause | Fix |
|---|---|---|
| Exit code 1 but no stderr | Wrapper doesn't capture/print stderr | Add print(result.stderr, file=sys.stderr) |
Permission denied on wrapper script | Script not executable | chmod +x bin/script.py or add shebang line |
| Output file not created | Tool succeeded but output path doesn't match | Verify output filename in wrapper matches File() LFN |
Based on the identified failure pattern, read:
os.makedirs, subprocess callsadd_args(), add_inputs(), add_outputs()Provide a specific, actionable fix:
python3 bin/wrapper.py --helpdocker run --rm image:tag which toolpython3 workflow_generator.py --helpAfter fixing the immediate issue, suggest:
/pegasus-review to catch other potential issuesrun_manual.sh to test each step locally before Pegasus submission