From chrisvoncsefalvay-funsloth
Generate comprehensive model cards and upload fine-tuned models to Hugging Face Hub with professional documentation
npx claudepluginhub joshuarweaver/cascade-ai-ml-engineering --plugin chrisvoncsefalvay-funsloth

This skill uses the workspace's default tool permissions.
Create model cards and upload fine-tuned models to Hugging Face Hub.
If coming from the training manager, you should have:

- model_path, base_model, dataset, technique
- training_config (LoRA rank, LR, epochs)
- final_loss, training_time, hardware

If missing, ask for the essential information.
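The training-manager handoff can be represented as a plain dict. This is a sketch: the key names mirror the fields listed above, and all values are illustrative, not a fixed schema.

```python
# Sketch of the training-manager handoff; key names follow the fields above
# and every value is an illustrative placeholder.
handoff = {
    "model_path": "./outputs/lora_adapter",
    "base_model": "unsloth/llama-3-8b-bnb-4bit",  # illustrative
    "dataset": "example/dataset",                  # illustrative
    "technique": "LoRA",
    "training_config": {"lora_rank": 16, "learning_rate": 2e-4, "epochs": 3},
    "final_loss": 0.87,          # illustrative
    "training_time": "42 min",   # illustrative
    "hardware": "1x A100 40GB",  # illustrative
}

# Fields that must be present before proceeding with the upload
required = ["model_path", "base_model", "dataset", "technique"]
missing = [k for k in required if k not in handoff]
```

If `missing` is non-empty, ask the user for those fields before continuing.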
Ask for:

- `username/model-name`

Options:
If GGUF selected, ask which levels. See references/GGUF_GUIDE.md.
| Method | Size | Quality |
|---|---|---|
| Q4_K_M | ~4GB | Good (Recommended) |
| Q5_K_M | ~5GB | Better |
| Q8_0 | ~8GB | Best |
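The sizes in the table follow roughly from parameter count times bits per weight. A back-of-the-envelope helper (the bits-per-weight figures below are approximations, not exact GGUF numbers):

```python
def approx_gguf_gb(params_billions: float, bits_per_weight: float) -> float:
    """Rough GGUF file size in GB: parameters x bits per weight / 8."""
    return params_billions * bits_per_weight / 8

# Approximate bits per weight for common quantizations (rough assumed values)
BPW = {"q4_k_m": 4.85, "q5_k_m": 5.7, "q8_0": 8.5}

# For a ~7B model this lands near the table above (~4 / ~5 / ~8 GB)
sizes = {method: round(approx_gguf_gb(7, bpw), 1) for method, bpw in BPW.items()}
```

Useful for warning the user about disk space before a multi-quantization export.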
Create README.md with:
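A minimal sketch of generating the card programmatically; the YAML front-matter fields and their values are illustrative, not a required schema:

```python
# Build a minimal model card with YAML front matter; all values illustrative.
card = """---
license: apache-2.0
base_model: unsloth/llama-3-8b-bnb-4bit
tags:
  - lora
  - unsloth
---

# My Fine-Tuned Model

LoRA fine-tune of the base model, trained with Unsloth.
"""

with open("README.md", "w") as f:
    f.write(card)
```

The front matter is what the Hub parses for the model page metadata, so fill in the real base model, license, and tags from the training handoff.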
```python
from huggingface_hub import create_repo, HfApi

# Create the repository (idempotent thanks to exist_ok)
create_repo("username/model-name", private=False, exist_ok=True)

api = HfApi()

# Upload the LoRA adapter
api.upload_folder(folder_path="./outputs/lora_adapter", repo_id="username/model-name")

# Upload the model card
api.upload_file(path_or_fileobj="README.md", path_in_repo="README.md", repo_id="username/model-name")
```
```python
from unsloth import FastLanguageModel

# Load the fine-tuned adapter, then export a GGUF quantization
model, tokenizer = FastLanguageModel.from_pretrained("./outputs/lora_adapter")
model.save_pretrained_gguf("./gguf", tokenizer, quantization_method="q4_k_m")
```
Use scripts/convert_gguf.py for multiple quantizations.
```python
from huggingface_hub import list_repo_files

print(list_repo_files("username/model-name"))
```
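To check the upload programmatically, compare the returned file list against what you expect. The adapter filename below is an assumption; exact filenames vary by PEFT version:

```python
def missing_files(repo_files, required=("README.md", "adapter_config.json")):
    """Return expected files that are absent from the repo listing.

    Default `required` names are illustrative; adjust to the actual
    artifacts produced by your training run.
    """
    return [name for name in required if name not in repo_files]

# Feed this the output of list_repo_files("username/model-name")
```

An empty return value means every expected file made it to the Hub.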
Upload Complete!
Model: https://huggingface.co/{repo_name}
Uploaded:
- LoRA adapter
- Model card
- GGUF files (if selected)
Next steps:
- Verify model page
- Add example outputs
- Run benchmarks
- Share on social media
| Error | Resolution |
|---|---|
| Repo exists | Pass `exist_ok=True` to `create_repo` |
| Permission denied | Check that the HF token has write access |
| Upload timeout | Retry; for large files, enable `hf_transfer` (`HF_HUB_ENABLE_HF_TRANSFER=1`) |
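For flaky uploads, a simple retry wrapper around the `HfApi` call is often enough. This is a sketch, not part of `huggingface_hub`:

```python
import time

def with_retry(fn, attempts=3, delay=2.0):
    """Call fn(), retrying on any exception with a fixed delay between tries."""
    for i in range(attempts):
        try:
            return fn()
        except Exception:
            if i == attempts - 1:
                raise
            time.sleep(delay)

# Usage (sketch):
# with_retry(lambda: api.upload_folder(folder_path="./outputs/lora_adapter",
#                                      repo_id="username/model-name"))
```

A fixed delay keeps the sketch simple; exponential backoff is a reasonable refinement for repeated timeouts.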