Skill

safe-debug

From lllllllama-ai-paper-reproduction-skill

Diagnoses deep learning training failures (tracebacks, CUDA OOM, checkpoint load errors, shape mismatches, NaN loss) using conservative root-cause analysis before any patching.

Python

ai-ml

npx claudepluginhub lllllllama/ai-research-workflow-skills

Tool Access

This skill uses the workspace's default tool permissions.

Preview

- The user provides a traceback, terminal error, or concrete training or inference failure symptom.

Supporting Assets

agents/openai.yaml
references/debug-policy.md
scripts/safe_debug.py

SKILL.md

Similar Skills

ml-debug

155

Diagnoses ML/AI failures like OOM, NaN, divergence, crashes, bad throughput, wrong outputs, and dependency conflicts using grounded framework docs and citations.

superml

debugger

Provides systematic debugging framework for root cause analysis after 2+ failed fixes, complex failures, intermittent bugs, and circular debugging.

oh-my-claude

debug

Performs root cause analysis for bugs by tracing errors through code, analyzing stack traces, forming and testing hypotheses, then hands off to fix. Auto-triggers on stack traces.

1 file

rune

Stats

Stars: 10

Forks: 0

Last Commit: Apr 1, 2026


SKILL.md

safe-debug

When to apply

The user provides a traceback, terminal error, or concrete training or inference failure symptom.
The user wants diagnosis, root-cause narrowing, and minimal patch suggestions before code is changed.
The user wants a safe debug flow with explicit human approval before mutation.
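
As an illustration of the first trigger, a pasted traceback or log can be sorted into the failure families named in the skill description before any deeper analysis. This is only a sketch: the `classify_failure` helper, the labels, and the regular expressions are assumptions for illustration and are not taken from `scripts/safe_debug.py`. The patterns are modeled on common PyTorch error messages.

```python
import re

# Hypothetical symptom patterns (assumption: not the skill's actual rules).
# Order matters: checkpoint errors are checked before generic shape errors,
# because state_dict loading also reports "size mismatch".
PATTERNS = [
    ("cuda_oom", re.compile(r"CUDA out of memory|OutOfMemoryError", re.I)),
    ("checkpoint_load", re.compile(
        r"Missing key\(s\) in state_dict"
        r"|Unexpected key\(s\) in state_dict"
        r"|size mismatch for", re.I)),
    ("shape_mismatch", re.compile(
        r"shapes cannot be multiplied|must match the size of tensor", re.I)),
    ("nan_loss", re.compile(r"\bloss\b[^\n]*\b(nan|inf)\b", re.I)),
]


def classify_failure(log_text: str) -> str:
    """Return the first matching failure label, or "unknown"."""
    for label, pattern in PATTERNS:
        if pattern.search(log_text):
            return label
    return "unknown"


if __name__ == "__main__":
    print(classify_failure("RuntimeError: CUDA out of memory. Tried to allocate 2.00 GiB"))
```

A label from a sketch like this would only narrow where to look first. It does not replace the root-cause analysis the skill asks for.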

When not to apply

When the user wants a broad repository walkthrough without an active failure.
When the task is speculative experimentation or code adaptation.
When the user is asking for a large refactor or readability rewrite.

Clear boundaries

Diagnose first.
Do not modify repository code by default.
If a patch is needed, propose the smallest fix and require explicit approval first.
Escalate to creating a savepoint or branch before medium-risk or high-risk changes.
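
The boundaries above can be read as a small gate that every proposed patch must pass. The sketch below is one possible encoding, assuming a three-level risk scale and a hypothetical `PatchProposal` record. Neither is defined in the source, and the real policy lives in references/debug-policy.md.

```python
from dataclasses import dataclass
from typing import Tuple

# Assumed risk scale; the source only names medium-risk and high-risk changes.
RISK_LEVELS = ("low", "medium", "high")


@dataclass
class PatchProposal:
    """Hypothetical record describing a proposed minimal fix."""
    summary: str
    risk: str                      # one of RISK_LEVELS
    approved: bool = False         # explicit human approval
    savepoint_created: bool = False


def may_apply(proposal: PatchProposal) -> Tuple[bool, str]:
    """Decide whether a patch may be applied, with a reason for the decision."""
    if proposal.risk not in RISK_LEVELS:
        return False, f"unknown risk level: {proposal.risk!r}"
    if not proposal.approved:
        return False, "explicit approval required before any code change"
    if proposal.risk in ("medium", "high") and not proposal.savepoint_created:
        return False, "create a savepoint or branch before this change"
    return True, "ok to apply"


if __name__ == "__main__":
    fix = PatchProposal("cast labels to long before the loss", risk="low")
    print(may_apply(fix))  # blocked until approved
```

Defaulting `approved` to `False` mirrors the "do not modify repository code by default" rule: a proposal is inert until a human changes that field.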

Output expectations

debug_outputs/DIAGNOSIS.md
debug_outputs/PATCH_PLAN.md
debug_outputs/status.json
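
A minimal sketch of how these three files might be written is shown below. Only the file names come from the source. The `write_debug_outputs` helper and the fields inside `status.json` are assumptions made for illustration.

```python
import json
from pathlib import Path


def write_debug_outputs(root, diagnosis: str, patch_plan: str, status: dict) -> Path:
    """Write the three expected artifacts under <root>/debug_outputs/."""
    out_dir = Path(root) / "debug_outputs"
    out_dir.mkdir(parents=True, exist_ok=True)
    (out_dir / "DIAGNOSIS.md").write_text(diagnosis, encoding="utf-8")
    (out_dir / "PATCH_PLAN.md").write_text(patch_plan, encoding="utf-8")
    (out_dir / "status.json").write_text(
        json.dumps(status, indent=2) + "\n", encoding="utf-8"
    )
    return out_dir


if __name__ == "__main__":
    # The status fields below are hypothetical, not a documented schema.
    write_debug_outputs(
        ".",
        diagnosis="# Diagnosis\n\nSuspected cause: ...\n",
        patch_plan="# Patch plan\n\nNo patch proposed yet.\n",
        status={"state": "diagnosed", "patch_applied": False},
    )
```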

Notes

Use references/debug-policy.md and the shared references/research-pitfall-checklist.md.
