Training manager for local GPU training - validate CUDA, manage GPU selection, monitor progress, handle checkpoints
/plugin marketplace add chrisvoncsefalvay/funsloth
/plugin install funsloth@funsloth

This skill inherits all available tools. When active, it can use any tool Claude has access to.
- notebooks/sft_template.ipynb
- references/HARDWARE_GUIDE.md
- references/TROUBLESHOOTING.md
- scripts/train_sft.py

Run Unsloth training on your local GPU.
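import torch

cuda_ok = torch.cuda.is_available()
print(f"CUDA available: {cuda_ok}")
if cuda_ok:
    # Device 0 is the first visible GPU; querying it without CUDA raises an error
    print(f"GPU: {torch.cuda.get_device_name(0)}")
    print(f"VRAM: {torch.cuda.get_device_properties(0).total_memory / 1e9:.1f} GB")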
If CUDA is not available:
- nvidia-smi
- nvcc --version
- pip install torch --index-url https://download.pytorch.org/whl/cu121

See references/HARDWARE_GUIDE.md for requirements:
| VRAM | Recommended Setup |
|---|---|
| 8GB | 7B, 4-bit, batch=1, LoRA r=8 |
| 12GB | 7B, 4-bit, batch=2, LoRA r=16 |
| 16GB | 7-13B, 4-bit, batch=2, LoRA r=16-32 |
| 24GB | 7-14B, 4-bit, batch=4, LoRA r=32 |
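The table above can be turned into a small lookup, for example to pick starting settings from the detected VRAM. This is only a sketch: the name `recommend_setup` and the dictionary keys are hypothetical, and where the table gives a range (LoRA r=16-32) it takes the lower bound as a conservative starting point, which is an assumption rather than something the guide states.

```python
# Hypothetical helper: maps detected VRAM (in GB) to the table's recommended setup.
# Each tier is (minimum VRAM in GB, settings), copied from the table above.
# Assumption: for the 16GB row (LoRA r=16-32) the lower bound r=16 is used.
TIERS = [
    (24, {"model_size": "7-14B", "load_in_4bit": True, "batch_size": 4, "lora_r": 32}),
    (16, {"model_size": "7-13B", "load_in_4bit": True, "batch_size": 2, "lora_r": 16}),
    (12, {"model_size": "7B", "load_in_4bit": True, "batch_size": 2, "lora_r": 16}),
    (8, {"model_size": "7B", "load_in_4bit": True, "batch_size": 1, "lora_r": 8}),
]


def recommend_setup(vram_gb):
    """Return the settings of the largest tier that fits, or None below 8 GB."""
    for min_gb, settings in TIERS:
        if vram_gb >= min_gb:
            return dict(settings)  # Copy so callers can modify the result safely
    return None  # The table does not cover cards smaller than 8 GB
```

It could be fed with the value printed by the CUDA check, e.g. `recommend_setup(torch.cuda.get_device_properties(0).total_memory / 1e9)`.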
pip install unsloth torch transformers trl peft datasets accelerate bitsandbytes
Use the official Unsloth Docker image for a pre-configured environment (supports all GPUs including Blackwell/50-series):
docker run -d \
-e JUPYTER_PASSWORD="unsloth" \
-p 8888:8888 \
-v $(pwd)/work:/workspace/work \
--gpus all \
unsloth/unsloth
Access Jupyter at http://localhost:8888. Example notebooks are in /workspace/unsloth-notebooks/.
Environment variables:
- JUPYTER_PASSWORD - Jupyter auth (default: unsloth)
- JUPYTER_PORT - Port (default: 8888)
- USER_PASSWORD - User/sudo password (default: unsloth)

jupyter notebook notebooks/sft_template.ipynb
# Edit configuration in script, then run
python scripts/train_sft.py
import os
os.environ["CUDA_VISIBLE_DEVICES"] = "0" # Use first GPU
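On a machine with several GPUs, the index written to `CUDA_VISIBLE_DEVICES` could be chosen automatically instead of hard-coded. The sketch below picks the GPU with the most free memory; it assumes `nvidia-smi` is on the PATH and uses its `--query-gpu` CSV output, and the function names are hypothetical. The variable needs to be set before CUDA is initialised, so the safest place to call `select_gpu()` is before importing `torch`.

```python
import os
import subprocess


def parse_free_memory(csv_text):
    """Parse 'index, free_mib' lines into a list of (index, free_mib) tuples."""
    gpus = []
    for line in csv_text.strip().splitlines():
        index, free_mib = (field.strip() for field in line.split(","))
        gpus.append((int(index), int(free_mib)))
    return gpus


def pick_gpu(csv_text):
    """Return the index of the GPU with the most free memory, or None if none are listed."""
    gpus = parse_free_memory(csv_text)
    if not gpus:
        return None
    return max(gpus, key=lambda gpu: gpu[1])[0]


def select_gpu():
    """Query nvidia-smi and expose only the chosen GPU via CUDA_VISIBLE_DEVICES."""
    csv_text = subprocess.run(
        ["nvidia-smi", "--query-gpu=index,memory.free", "--format=csv,noheader,nounits"],
        capture_output=True,
        text=True,
        check=True,
    ).stdout
    index = pick_gpu(csv_text)
    if index is not None:
        os.environ["CUDA_VISIBLE_DEVICES"] = str(index)
    return index
```

Keeping the parsing separate from the `nvidia-smi` call means the selection logic can be checked on a machine without a GPU.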
# Watch GPU usage
watch -n 1 nvidia-smi
# Or use nvitop (more detailed)
pip install nvitop && nvitop
export WANDB_API_KEY="your-key"
# Add report_to="wandb" in TrainingArguments
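One way to keep the same script usable with and without Weights & Biases is to derive `report_to` from the environment. This is a minimal sketch; the helper name `choose_report_to` is hypothetical, and it relies on `TrainingArguments` accepting `"wandb"` and `"none"` as `report_to` values.

```python
import os


def choose_report_to(env=None):
    """Return "wandb" when WANDB_API_KEY is set and non-empty, otherwise "none"."""
    env = os.environ if env is None else env
    return "wandb" if env.get("WANDB_API_KEY") else "none"
```

The result can then be passed along as `report_to=choose_report_to()` when building `TrainingArguments`.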
Try in order:
1. torch.cuda.empty_cache()
2. packing=True for short sequences

See references/TROUBLESHOOTING.md for more solutions.
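# Trainer itself does not act on TrainingArguments(resume_from_checkpoint=...);
# pass the value to train() instead. `trainer` is the trainer built by your training script.
trainer.train(
    resume_from_checkpoint=True,  # Auto-find latest checkpoint in output_dir
    # Or: resume_from_checkpoint="outputs/checkpoint-500"
)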
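To see which checkpoint `resume_from_checkpoint=True` would be expected to pick up, the lookup can be sketched in plain Python. This assumes the `checkpoint-<step>` directory naming shown in the example path above; the function name `find_latest_checkpoint` is hypothetical, and `transformers` ships its own equivalent, `get_last_checkpoint` in `transformers.trainer_utils`, which is preferable in real scripts.

```python
import re
from pathlib import Path

# Assumed naming scheme: checkpoint-<global step>, e.g. checkpoint-500
CHECKPOINT_RE = re.compile(r"^checkpoint-(\d+)$")


def find_latest_checkpoint(output_dir):
    """Return the path of the checkpoint directory with the highest step, or None."""
    root = Path(output_dir)
    if not root.is_dir():
        return None
    best_step, best_path = -1, None
    for child in root.iterdir():
        match = CHECKPOINT_RE.match(child.name)
        # Compare steps numerically so checkpoint-1000 beats checkpoint-500
        if child.is_dir() and match and int(match.group(1)) > best_step:
            best_step, best_path = int(match.group(1)), child
    return str(best_path) if best_path is not None else None
```

Comparing the step numbers as integers matters: a plain string sort would rank `checkpoint-500` above `checkpoint-1000`.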
Training script automatically saves:
- outputs/lora_adapter/ - LoRA weights
- outputs/merged_16bit/ - Merged model (optional)

from unsloth import FastLanguageModel
model, tokenizer = FastLanguageModel.from_pretrained("outputs/lora_adapter")
FastLanguageModel.for_inference(model)  # Switch the model to inference mode
messages = [{"role": "user", "content": "Hello!"}]
inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,  # End the prompt with the assistant turn so the model replies
    return_tensors="pt",
).to("cuda")
outputs = model.generate(inputs, max_new_tokens=100)
print(tokenizer.decode(outputs[0]))
Offer funsloth-upload for Hub upload with model card.
save_steps