SkyPilot

Run AI on Any Infrastructure

🌟 SkyPilot Demo 🌟: Click to see a 1-minute tour

SkyPilot is a system to run, manage, and scale AI workloads on any AI infrastructure.

SkyPilot gives AI teams a simple interface to run jobs on any infra. Infra teams get a unified control plane to manage any AI compute — with advanced scheduling, scaling, and orchestration.

:fire: News :fire:

[Mar 2026] Scaling Karpathy's Autoresearch: Autoresearch runs 1 experiment at a time. We gave it 16 GPUs and let it run in parallel: blog, HackerNews
[Mar 2026] SkyPilot Agent Skills: GPU access and job management for AI agents: docs
[Jan 2026] Shopify case study: Shopify runs all AI training workloads on SkyPilot: case study
[Dec 2025] SkyPilot v0.11 released: Multi-Cloud Pools, Fast Managed Jobs, Enterprise-Readiness at Large Scale, Programmability. Release notes
[Dec 2025] Train an agent to use Google Search as a tool with RL on your Kubernetes or clouds: blog, example
[Oct 2025] Run RL training for LLMs with SkyRL on your Kubernetes or clouds: example

Overview

SkyPilot is easy to use for AI teams:

Quickly spin up compute on your own infra
Environment and job as code — simple and portable
Easy job management: queue, run, and auto-recover many jobs

SkyPilot makes Kubernetes easy for AI & Infra teams:

Slurm-like ease of use, cloud-native robustness
Local dev experience on K8s: SSH into pods, sync code, or connect IDE
Turbocharge your clusters: gang scheduling, multi-cluster, and scaling

SkyPilot unifies multiple clusters, clouds, and hardware:

One interface to use reserved GPUs, Kubernetes clusters, Slurm clusters, or 20+ clouds
Flexible provisioning of GPUs, TPUs, CPUs, with auto-retry
Team deployment and resource sharing

SkyPilot cuts your cloud costs & maximizes GPU availability:

Autostop: automatic cleanup of idle resources
Spot instance support: 3-6x cost savings, with preemption auto-recovery
Intelligent scheduling: automatically run on the cheapest & most available infra

SkyPilot supports your existing GPU, TPU, and CPU workloads, with no code changes.

Install with pip:

# Choose your clouds:
pip install -U "skypilot[kubernetes,aws,gcp,azure,oci,nebius,lambda,runpod,fluidstack,paperspace,cudo,ibm,scp,seeweb,shadeform,verda]"

To get the latest features and fixes, use the nightly build or install from source:

# Choose your clouds:
pip install "skypilot-nightly[kubernetes,aws,gcp,azure,oci,nebius,lambda,runpod,fluidstack,paperspace,cudo,ibm,scp,seeweb,shadeform,verda]"

To use SkyPilot directly with your agent (Claude Code, Codex, etc.), install the SkyPilot Skill. Tell your agent:

Fetch and follow https://github.com/skypilot-org/skypilot/blob/HEAD/agent/INSTALL.md to install the skypilot skill

skypilot

Popularity

Health & Quality

What's Inside

README

Run AI on Any Infrastructure

🌟 SkyPilot Demo 🌟: Click to see a 1-minute tour

Overview

Confidence

Similar Plugins

vastai-pack

ml-training

caveman

frontend-design

ui-design

claude-mem