This skill should be used when users want to run any workload on Hugging Face Jobs infrastructure. Covers UV scripts, Docker-based jobs, hardware selection, cost estimation, authentication with tokens, secrets management, timeout configuration, and result persistence. Designed for general-purpose compute workloads including data processing, inference, experiments, batch jobs, and any Python-based tasks. Should be invoked for tasks involving cloud compute, GPU workloads, or when users mention running jobs on Hugging Face infrastructure without local setup.
/plugin marketplace add huggingface/skills
/plugin install huggingface-huggingface-skills@huggingface/skills

This skill inherits all available tools. When active, it can use any tool Claude has access to.
Run any workload on fully managed Hugging Face infrastructure. No local setup required—jobs run on cloud CPUs, GPUs, or TPUs and can persist results to the Hugging Face Hub.
Common use cases: data processing, batch inference, experiments, synthetic data generation, and other Python-based tasks.

For model training specifically: see the model-trainer skill for TRL-based training workflows.
Use this skill when users want to run workloads on cloud CPUs, GPUs, or TPUs, need compute without any local setup, or mention running jobs on Hugging Face infrastructure.
When assisting with jobs:
ALWAYS use hf_jobs() MCP tool - Submit jobs using hf_jobs("uv", {...}) or hf_jobs("run", {...}). The script parameter accepts Python code directly. Do NOT save to local files unless the user explicitly requests it. Pass the script content as a string to hf_jobs().
Always handle authentication - Jobs that interact with the Hub require HF_TOKEN via secrets. See Token Usage section below.
Provide job details after submission - After submitting, provide job ID, monitoring URL, estimated time, and note that the user can request status checks later.
Set appropriate timeouts - Default 30min may be insufficient for long-running tasks.
Before starting any job, verify:
- The user is authenticated (check with hf_whoami())
- Hardware flavor and timeout match the workload
- Results will be persisted (the environment is ephemeral)

When tokens are required: any job that pushes to the Hub or reads private/gated repositories.
How to provide tokens:
{
"secrets": {"HF_TOKEN": "$HF_TOKEN"} # Recommended: automatic token
}
⚠️ CRITICAL: The $HF_TOKEN placeholder is automatically replaced with your logged-in token. Never hardcode tokens in scripts.
What are HF Tokens?

Hugging Face access tokens authenticate your account to the Hub. Create one at huggingface.co/settings/tokens, or log in locally with hf auth login so a token is available automatically.

Token Types: read, write, and fine-grained. Use a write token for jobs that push results to the Hub.

Always Required: jobs that push models, datasets, or files to the Hub, or that read private/gated repositories.

Not Required: jobs that only read public repositories or never touch the Hub.
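For the "not required" case, a job can run with no token at all — a minimal sketch that only prints to the job logs:

```python
hf_jobs("uv", {
    "script": """
# /// script
# dependencies = []
# ///
# Pure compute: no Hub access, so no token needed; read the result from the job logs
print(sum(i * i for i in range(1000)))
""",
    "flavor": "cpu-basic",
    "timeout": "10m"
})
```

When a job does need Hub access, provide the token via secrets (recommended):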
hf_jobs("uv", {
"script": "your_script.py",
"secrets": {"HF_TOKEN": "$HF_TOKEN"} # ✅ Automatic replacement
})
How it works:
- $HF_TOKEN is a placeholder that gets replaced with your actual token (from hf auth login)

Benefits: no tokens hardcoded in scripts or configs, and the job runs with whatever account is currently logged in.

Alternatively, a token can be hardcoded (not recommended):
hf_jobs("uv", {
"script": "your_script.py",
"secrets": {"HF_TOKEN": "hf_abc123..."} # ⚠️ Hardcoded token
})
When to use: rarely—only when the job must run as a different account than the one logged in locally.

Security concerns: the token sits in plain text in the job config and can leak if the config is shared or logged; prefer the $HF_TOKEN placeholder.

Tokens can also be passed as plain environment variables:
hf_jobs("uv", {
"script": "your_script.py",
"env": {"HF_TOKEN": "hf_abc123..."} # ⚠️ Less secure than secrets
})
Difference from secrets:
- env variables are visible in job logs
- secrets are encrypted server-side
- Prefer secrets for tokens

In your Python script, tokens are available as environment variables:
# /// script
# dependencies = ["huggingface-hub"]
# ///
import os
from huggingface_hub import HfApi
# Token is automatically available if passed via secrets
token = os.environ.get("HF_TOKEN")
# Use with Hub API
api = HfApi(token=token)
# Or let huggingface_hub auto-detect
api = HfApi() # Automatically uses HF_TOKEN env var
Best practices:
- Use os.environ.get("HF_TOKEN") to access the token
- Let huggingface_hub auto-detect the token when possible

Check if you're logged in:
from huggingface_hub import whoami
user_info = whoami() # Returns your username if authenticated
Verify token in job:
import os
assert "HF_TOKEN" in os.environ, "HF_TOKEN not found!"
token = os.environ["HF_TOKEN"]
print(f"Token starts with: {token[:7]}...") # Should start with "hf_"
Error: 401 Unauthorized

- Add secrets={"HF_TOKEN": "$HF_TOKEN"} to the job config
- Confirm hf_whoami() works locally

Error: 403 Forbidden

- The token lacks the needed permission—use a write token for pushes and check repository access

Error: Token not found in environment

- secrets not passed or wrong key name
- Pass secrets={"HF_TOKEN": "$HF_TOKEN"} (not env)
- Read it with os.environ.get("HF_TOKEN")

Error: Repository access denied

- Verify the token has access to the repository and is supplied via the $HF_TOKEN placeholder or environment variables

Example: Push results to Hub
hf_jobs("uv", {
"script": """
# /// script
# dependencies = ["huggingface-hub", "datasets"]
# ///
import os
from huggingface_hub import HfApi
from datasets import Dataset
# Verify token is available
assert "HF_TOKEN" in os.environ, "HF_TOKEN required!"
# Use token for Hub operations
api = HfApi(token=os.environ["HF_TOKEN"])
# Create and push dataset
data = {"text": ["Hello", "World"]}
dataset = Dataset.from_dict(data)
dataset.push_to_hub("username/my-dataset", token=os.environ["HF_TOKEN"])
print("✅ Dataset pushed successfully!")
""",
"flavor": "cpu-basic",
"timeout": "30m",
"secrets": {"HF_TOKEN": "$HF_TOKEN"} # ✅ Token provided securely
})
UV scripts use PEP 723 inline dependencies for clean, self-contained workloads.
MCP Tool:
hf_jobs("uv", {
"script": """
# /// script
# dependencies = ["transformers", "torch"]
# ///
from transformers import pipeline
import torch
# Your workload here
classifier = pipeline("sentiment-analysis")
result = classifier("I love Hugging Face!")
print(result)
""",
"flavor": "cpu-basic",
"timeout": "30m"
})
CLI Equivalent:
hf jobs uv run my_script.py --flavor cpu-basic --timeout 30m
Python API:
from huggingface_hub import run_uv_job
run_uv_job("my_script.py", flavor="cpu-basic", timeout="30m")
Benefits: Direct MCP tool usage, clean code, dependencies declared inline, no file saving required
When to use: Default choice for all workloads, custom logic, any scenario requiring hf_jobs()
By default, UV scripts use ghcr.io/astral-sh/uv:python3.12-bookworm-slim. For ML workloads with complex dependencies, use pre-built images:
hf_jobs("uv", {
"script": "inference.py",
"image": "vllm/vllm-openai:latest", # Pre-built image with vLLM
"flavor": "a10g-large"
})
CLI:
hf jobs uv run --image vllm/vllm-openai:latest --flavor a10g-large inference.py
Benefits: Faster startup, pre-installed dependencies, optimized for specific frameworks
By default, UV scripts use Python 3.12. Specify a different version:
hf_jobs("uv", {
"script": "my_script.py",
"python": "3.11", # Use Python 3.11
"flavor": "cpu-basic"
})
Python API:
from huggingface_hub import run_uv_job
run_uv_job("my_script.py", python="3.11")
⚠️ Important: There are two "script path" stories depending on how you run Jobs:
- hf_jobs() MCP tool (recommended in this repo): the script value must be inline code (a string) or a URL. A local filesystem path (like "./scripts/foo.py") won't exist inside the remote container.
- hf jobs uv run CLI: local file paths do work (the CLI uploads your script).

Common mistake with hf_jobs() MCP tool:
# ❌ Will fail (remote container can't see your local path)
hf_jobs("uv", {"script": "./scripts/foo.py"})
Correct patterns with hf_jobs() MCP tool:
# ✅ Inline: read the local script file and pass its *contents*
from pathlib import Path
script = Path("hf-jobs/scripts/foo.py").read_text()
hf_jobs("uv", {"script": script})
# ✅ URL: host the script somewhere reachable
hf_jobs("uv", {"script": "https://huggingface.co/datasets/uv-scripts/.../raw/main/foo.py"})
# ✅ URL from GitHub
hf_jobs("uv", {"script": "https://raw.githubusercontent.com/huggingface/trl/main/trl/scripts/sft.py"})
CLI equivalent (local paths supported):
hf jobs uv run ./scripts/foo.py -- --your --args
Add extra dependencies beyond what's in the PEP 723 header:
hf_jobs("uv", {
"script": "inference.py",
"dependencies": ["transformers", "torch>=2.0"], # Extra deps
"flavor": "a10g-small"
})
Python API:
from huggingface_hub import run_uv_job
run_uv_job("inference.py", dependencies=["transformers", "torch>=2.0"])
Run jobs with custom Docker images and commands.
MCP Tool:
hf_jobs("run", {
"image": "python:3.12",
"command": ["python", "-c", "print('Hello from HF Jobs!')"],
"flavor": "cpu-basic",
"timeout": "30m"
})
CLI Equivalent:
hf jobs run python:3.12 python -c "print('Hello from HF Jobs!')"
Python API:
from huggingface_hub import run_job
run_job(image="python:3.12", command=["python", "-c", "print('Hello!')"], flavor="cpu-basic")
Benefits: Full Docker control, use pre-built images, run any command.
When to use: Need specific Docker images, non-Python workloads, complex environments.
Example with GPU:
hf_jobs("run", {
"image": "pytorch/pytorch:2.6.0-cuda12.4-cudnn9-devel",
"command": ["python", "-c", "import torch; print(torch.cuda.get_device_name())"],
"flavor": "a10g-small",
"timeout": "1h"
})
Using Hugging Face Spaces as Images:
You can use Docker images from HF Spaces:
hf_jobs("run", {
"image": "hf.co/spaces/lhoestq/duckdb", # Space as Docker image
"command": ["duckdb", "-c", "SELECT 'Hello from DuckDB!'"],
"flavor": "cpu-basic"
})
CLI:
hf jobs run hf.co/spaces/lhoestq/duckdb duckdb -c "SELECT 'Hello!'"
The uv-scripts organization provides ready-to-use UV scripts stored as datasets on Hugging Face Hub:
# Discover available UV script collections
dataset_search({"author": "uv-scripts", "sort": "downloads", "limit": 20})
# Explore a specific collection
hub_repo_details(["uv-scripts/classification"], repo_type="dataset", include_readme=True)
Popular collections: OCR, classification, synthetic-data, vLLM, dataset-creation
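Discovered scripts can be submitted directly by URL — the path below is illustrative, not a real file, and script_args follows the same pattern used for the bundled scripts later in this guide:

```python
hf_jobs("uv", {
    # Hypothetical raw-file URL inside the uv-scripts org; substitute a real script path
    "script": "https://huggingface.co/datasets/uv-scripts/classification/raw/main/classify.py",
    "script_args": ["--input-dataset", "username/my-dataset"],
    "flavor": "a10g-small",
    "timeout": "2h",
    "secrets": {"HF_TOKEN": "$HF_TOKEN"}
})
```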
Reference: HF Jobs Hardware Docs (updated 07/2025)
| Workload Type | Recommended Hardware | Use Case |
|---|---|---|
| Data processing, testing | cpu-basic, cpu-upgrade | Lightweight tasks |
| Small models, demos | t4-small | <1B models, quick tests |
| Medium models | t4-medium, l4x1 | 1-7B models |
| Large models, production | a10g-small, a10g-large | 7-13B models |
| Very large models | a100-large | 13B+ models |
| Batch inference | a10g-large, a100-large | High-throughput |
| Multi-GPU workloads | l4x4, a10g-largex2, a10g-largex4 | Parallel/large models |
| TPU workloads | v5e-1x1, v5e-2x2, v5e-2x4 | JAX/Flax, TPU-optimized |
All Available Flavors:
- CPU: cpu-basic, cpu-upgrade
- GPU: t4-small, t4-medium, l4x1, l4x4, a10g-small, a10g-large, a10g-largex2, a10g-largex4, a100-large
- TPU: v5e-1x1, v5e-2x2, v5e-2x4

Guidelines:

- See references/hardware_guide.md for detailed specifications

⚠️ EPHEMERAL ENVIRONMENT—MUST PERSIST RESULTS
The Jobs environment is temporary. All files are deleted when the job ends. If results aren't persisted, ALL WORK IS LOST.
1. Push to Hugging Face Hub (Recommended)
# Push models
model.push_to_hub("username/model-name", token=os.environ["HF_TOKEN"])
# Push datasets
dataset.push_to_hub("username/dataset-name", token=os.environ["HF_TOKEN"])
# Push artifacts
api.upload_file(
path_or_fileobj="results.json",
path_in_repo="results.json",
repo_id="username/results",
token=os.environ["HF_TOKEN"]
)
2. Use External Storage
# Upload to S3, GCS, etc.
import boto3
s3 = boto3.client('s3')
s3.upload_file('results.json', 'my-bucket', 'results.json')
3. Send Results via API
# POST results to your API
import requests
requests.post("https://your-api.com/results", json=results)
In job submission:
{
"secrets": {"HF_TOKEN": "$HF_TOKEN"} # Enables authentication
}
In script:
import os
from huggingface_hub import HfApi
# Token automatically available from secrets
api = HfApi(token=os.environ.get("HF_TOKEN"))
# Push your results
api.upload_file(...)
Before submitting:
secrets={"HF_TOKEN": "$HF_TOKEN"} if using HubSee: references/hub_saving.md for detailed Hub persistence guide
⚠️ DEFAULT: 30 MINUTES
Jobs automatically stop after the timeout. For long-running tasks like training, always set a custom timeout.
MCP Tool:
{
"timeout": "2h" # 2 hours
}
Supported formats:
- Integer seconds (300 = 5 minutes)
- Duration strings: "5m" (minutes), "2h" (hours), "1d" (days)
- Examples: "90m", "2h", "1.5h", 300, "1d"

Python API:
from huggingface_hub import run_job, run_uv_job
run_job(image="python:3.12", command=[...], timeout="2h")
run_uv_job("script.py", timeout=7200) # 2 hours in seconds
| Scenario | Recommended | Notes |
|---|---|---|
| Quick test | 10-30 min | Verify setup |
| Data processing | 1-2 hours | Depends on data size |
| Batch inference | 2-4 hours | Large batches |
| Experiments | 4-8 hours | Multiple runs |
| Long-running | 8-24 hours | Production workloads |
Always add 20-30% buffer for setup, network delays, and cleanup.
On timeout: Job killed immediately, all unsaved progress lost
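A quick sizing sketch (the 2-hour estimate is an assumption for illustration) that applies the ~25% buffer before submitting:

```python
# Estimated runtime of 2 hours plus ~25% buffer for setup, network delays, and cleanup
estimated_minutes = 120
timeout = f"{int(estimated_minutes * 1.25)}m"  # "150m"

hf_jobs("uv", {
    "script": "my_script.py",
    "flavor": "a10g-small",
    "timeout": timeout
})
```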
General guidelines:
Total Cost = (Hours of runtime) × (Cost per hour)
Example calculations: multiply the expected runtime by the flavor's hourly rate for scenarios such as a quick test, a data processing run, or a batch inference job.
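A worked sketch of the formula — the hourly rates below are hypothetical placeholders, not actual Jobs pricing:

```python
# Hypothetical per-hour rates for illustration only; check current Jobs pricing for real numbers
rates = {"cpu-basic": 0.05, "a10g-large": 3.50}

runtime_hours = 2.0
flavor = "a10g-large"
print(f"Estimated cost: ${runtime_hours * rates[flavor]:.2f}")  # runtime × hourly rate
```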
Cost optimization tips:

- Start with the smallest flavor that fits, then scale up
- Test on a small data subset before the full run
- Set realistic timeouts so failed or hung jobs don't keep running
MCP Tool:
# List all jobs
hf_jobs("ps")
# Inspect specific job
hf_jobs("inspect", {"job_id": "your-job-id"})
# View logs
hf_jobs("logs", {"job_id": "your-job-id"})
# Cancel a job
hf_jobs("cancel", {"job_id": "your-job-id"})
Python API:
from huggingface_hub import list_jobs, inspect_job, fetch_job_logs, cancel_job
# List your jobs
jobs = list_jobs()
# List running jobs only
running = [j for j in list_jobs() if j.status.stage == "RUNNING"]
# Inspect specific job
job_info = inspect_job(job_id="your-job-id")
# View logs
for log in fetch_job_logs(job_id="your-job-id"):
print(log)
# Cancel a job
cancel_job(job_id="your-job-id")
CLI:
hf jobs ps # List jobs
hf jobs logs <job-id> # View logs
hf jobs cancel <job-id> # Cancel job
Remember: Wait for user to request status checks. Avoid polling repeatedly.
After submission, jobs have monitoring URLs:
https://huggingface.co/jobs/username/job-id
View logs, status, and details in the browser.
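A small sketch for reporting these details right after submission (assumes the returned job object exposes an id attribute, and builds the URL from the pattern above):

```python
from huggingface_hub import run_uv_job, whoami

job = run_uv_job("my_script.py", flavor="cpu-basic", timeout="1h")
username = whoami()["name"]

# Hand the essentials back to the user: job ID and monitoring URL
print(f"Job ID: {job.id}")
print(f"Monitor at: https://huggingface.co/jobs/{username}/{job.id}")
```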
import time
from huggingface_hub import inspect_job, run_job
# Run multiple jobs
jobs = [run_job(image=img, command=cmd) for img, cmd in workloads]
# Wait for all to complete
for job in jobs:
while inspect_job(job_id=job.id).status.stage not in ("COMPLETED", "ERROR"):
time.sleep(10)
Run jobs on a schedule using CRON expressions or predefined schedules.
MCP Tool:
# Schedule a UV script that runs every hour
hf_jobs("scheduled uv", {
"script": "your_script.py",
"schedule": "@hourly",
"flavor": "cpu-basic"
})
# Schedule with CRON syntax
hf_jobs("scheduled uv", {
"script": "your_script.py",
"schedule": "0 9 * * 1", # 9 AM every Monday
"flavor": "cpu-basic"
})
# Schedule a Docker-based job
hf_jobs("scheduled run", {
"image": "python:3.12",
"command": ["python", "-c", "print('Scheduled!')"],
"schedule": "@daily",
"flavor": "cpu-basic"
})
Python API:
from huggingface_hub import create_scheduled_job, create_scheduled_uv_job
# Schedule a Docker job
create_scheduled_job(
image="python:3.12",
command=["python", "-c", "print('Running on schedule!')"],
schedule="@hourly"
)
# Schedule a UV script
create_scheduled_uv_job("my_script.py", schedule="@daily", flavor="cpu-basic")
# Schedule with GPU
create_scheduled_uv_job(
"ml_inference.py",
schedule="0 */6 * * *", # Every 6 hours
flavor="a10g-small"
)
Available schedules:
- @annually, @yearly - Once per year
- @monthly - Once per month
- @weekly - Once per week
- @daily - Once per day
- @hourly - Once per hour
- CRON expressions ("*/5 * * * *" for every 5 minutes)

Manage scheduled jobs:
# MCP Tool
hf_jobs("scheduled ps") # List scheduled jobs
hf_jobs("scheduled inspect", {"job_id": "..."}) # Inspect details
hf_jobs("scheduled suspend", {"job_id": "..."}) # Pause
hf_jobs("scheduled resume", {"job_id": "..."}) # Resume
hf_jobs("scheduled delete", {"job_id": "..."}) # Delete
Python API for management:
from huggingface_hub import (
list_scheduled_jobs,
inspect_scheduled_job,
suspend_scheduled_job,
resume_scheduled_job,
delete_scheduled_job
)
# List all scheduled jobs
scheduled = list_scheduled_jobs()
# Inspect a scheduled job
info = inspect_scheduled_job(scheduled_job_id)
# Suspend (pause) a scheduled job
suspend_scheduled_job(scheduled_job_id)
# Resume a scheduled job
resume_scheduled_job(scheduled_job_id)
# Delete a scheduled job
delete_scheduled_job(scheduled_job_id)
Trigger jobs automatically when changes happen in Hugging Face repositories.
Python API:
from huggingface_hub import create_webhook
# Create webhook that triggers a job when a repo changes
webhook = create_webhook(
job_id=job.id,
watched=[
{"type": "user", "name": "your-username"},
{"type": "org", "name": "your-org-name"}
],
domains=["repo", "discussion"],
secret="your-secret"
)
How it works:
- When a watched repository changes, the job runs with the event delivered in the WEBHOOK_PAYLOAD environment variable

Use cases: automatically processing new dataset commits, re-running evaluations when a model updates, or reacting to discussion activity.
Access webhook payload in script:
import os
import json
payload = json.loads(os.environ.get("WEBHOOK_PAYLOAD", "{}"))
print(f"Event type: {payload.get('event', {}).get('action')}")
See Webhooks Documentation for more details.
This repository ships ready-to-run UV scripts in hf-jobs/scripts/. Prefer using them instead of inventing new templates.
scripts/generate-responses.py

What it does: loads a Hub dataset (chat messages or a prompt column), applies a model chat template, generates responses with vLLM, and pushes the output dataset + dataset card back to the Hub.
Requires: GPU + write token (it pushes a dataset).
from pathlib import Path
script = Path("hf-jobs/scripts/generate-responses.py").read_text()
hf_jobs("uv", {
"script": script,
"script_args": [
"username/input-dataset",
"username/output-dataset",
"--messages-column", "messages",
"--model-id", "Qwen/Qwen3-30B-A3B-Instruct-2507",
"--temperature", "0.7",
"--top-p", "0.8",
"--max-tokens", "2048",
],
"flavor": "a10g-large",
"timeout": "4h",
"secrets": {"HF_TOKEN": "$HF_TOKEN"},
})
scripts/cot-self-instruct.py

What it does: generates synthetic prompts/answers via CoT Self-Instruct, optionally filters outputs (answer-consistency / RIP), then pushes the generated dataset + dataset card to the Hub.
Requires: GPU + write token (it pushes a dataset).
from pathlib import Path
script = Path("hf-jobs/scripts/cot-self-instruct.py").read_text()
hf_jobs("uv", {
"script": script,
"script_args": [
"--seed-dataset", "davanstrien/s1k-reasoning",
"--output-dataset", "username/synthetic-math",
"--task-type", "reasoning",
"--num-samples", "5000",
"--filter-method", "answer-consistency",
],
"flavor": "l4x4",
"timeout": "8h",
"secrets": {"HF_TOKEN": "$HF_TOKEN"},
})
scripts/finepdfs-stats.py

What it does: scans parquet directly from the Hub (no 300GB download), computes temporal stats, and optionally uploads results to a Hub dataset repo.
Requires: CPU is often enough; token needed only if you pass --output-repo (upload).
from pathlib import Path
script = Path("hf-jobs/scripts/finepdfs-stats.py").read_text()
hf_jobs("uv", {
"script": script,
"script_args": [
"--limit", "10000",
"--show-plan",
"--output-repo", "username/finepdfs-temporal-stats",
],
"flavor": "cpu-upgrade",
"timeout": "2h",
"env": {"HF_XET_HIGH_PERFORMANCE": "1"},
"secrets": {"HF_TOKEN": "$HF_TOKEN"},
})
Error: Job stopped at the 30-minute default timeout

Fix: set a longer timeout, e.g. "timeout": "3h"

Error: HF_TOKEN not found in the job

Fix: pass secrets={"HF_TOKEN": "$HF_TOKEN"} and verify with assert "HF_TOKEN" in os.environ

Error: Missing dependencies (ModuleNotFoundError)

Fix: Add to PEP 723 header:
# /// script
# dependencies = ["package1", "package2>=1.0.0"]
# ///
Error: Authentication failed when pushing

Fix:

- Confirm hf_whoami() works locally
- Include secrets={"HF_TOKEN": "$HF_TOKEN"} in the job config
- Re-run hf auth login if needed

Common issues:

- Out-of-memory → choose a larger flavor
- Timeout → raise the timeout value
- Lost results → push to the Hub before the job ends
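When the cause isn't obvious, a short diagnostic job (a sketch; run it on the same flavor as the failing job) that prints the runtime environment usually narrows things down:

```python
hf_jobs("uv", {
    "script": """
# /// script
# dependencies = ["torch"]
# ///
import os, sys
print("Python:", sys.version)
print("HF_TOKEN present:", "HF_TOKEN" in os.environ)

import torch
print("CUDA available:", torch.cuda.is_available())
if torch.cuda.is_available():
    print("GPU:", torch.cuda.get_device_name())
""",
    "flavor": "cpu-basic",  # swap for the flavor the failing job used
    "timeout": "15m",
    "secrets": {"HF_TOKEN": "$HF_TOKEN"}
})
```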
See: references/troubleshooting.md for complete troubleshooting guide
- references/token_usage.md - Complete token usage guide
- references/hardware_guide.md - Hardware specs and selection
- references/hub_saving.md - Hub persistence guide
- references/troubleshooting.md - Common issues and solutions
- scripts/generate-responses.py - vLLM batch generation: dataset → responses → push to Hub
- scripts/cot-self-instruct.py - CoT Self-Instruct synthetic data generation + filtering → push to Hub
- scripts/finepdfs-stats.py - Polars streaming stats over finepdfs-edu parquet on Hub (optional push)

Official Documentation:
Related Tools: the model-trainer skill for TRL-based training workflows.

Key reminders:

- The script parameter accepts Python code directly; no file saving required unless the user requests it
- Use secrets={"HF_TOKEN": "$HF_TOKEN"} for Hub operations
- Prefer hf_jobs("uv", {...}) with inline scripts for Python workloads

| Operation | MCP Tool | CLI | Python API |
|---|---|---|---|
| Run UV script | hf_jobs("uv", {...}) | hf jobs uv run script.py | run_uv_job("script.py") |
| Run Docker job | hf_jobs("run", {...}) | hf jobs run image cmd | run_job(image, command) |
| List jobs | hf_jobs("ps") | hf jobs ps | list_jobs() |
| View logs | hf_jobs("logs", {...}) | hf jobs logs <id> | fetch_job_logs(job_id) |
| Cancel job | hf_jobs("cancel", {...}) | hf jobs cancel <id> | cancel_job(job_id) |
| Schedule UV | hf_jobs("scheduled uv", {...}) | - | create_scheduled_uv_job() |
| Schedule Docker | hf_jobs("scheduled run", {...}) | - | create_scheduled_job() |