npx claudepluginhub jeremylongshore/claude-code-plugins-plus-skills --plugin vastai-pack
Complete checklist for running production GPU workloads on Vast.ai, covering account setup, instance selection, data safety, monitoring, and cost controls.
- `reliability` >= 0.98 for production jobs
- `inet_down` >= 200 for data transfer
- `dph_total` set in search queries

#!/bin/bash
set -euo pipefail
echo "Vast.ai Production Readiness Check"
# 1. Auth
vastai show user --raw | python3 -c "
import sys, json; u=json.load(sys.stdin)
balance = u.get('balance', 0)
print(f' Auth: OK | Balance: \${balance:.2f}')
assert balance >= 10, f'Balance too low: \${balance:.2f}'
" && echo " Balance: PASS" || echo " Balance: FAIL"
# 2. Offer availability
COUNT=$(vastai search offers 'reliability>0.98 num_gpus=1 rentable=true' --raw --limit 1 | python3 -c "import sys,json; print(len(json.load(sys.stdin)))")
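# Report PASS only when the search actually returned an offer
[ "$COUNT" -ge 1 ] && echo " Offers available: $COUNT+ | PASS" || echo " Offers available: 0 | FAIL"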
# 3. Docker image pullable
docker pull pytorch/pytorch:2.2.0-cuda12.1-cudnn8-runtime > /dev/null 2>&1 && echo " Docker image: PASS" || echo " Docker image: FAIL"
echo "Pre-flight checks complete."
| Error | Cause | Solution |
|---|---|---|
| Insufficient balance | Credits depleted mid-job | Set up auto-top-up or balance alerts |
| Instance preempted during final epoch | Interruptible (spot) instance reclaimed | Use on-demand for final training stage |
| Checkpoint corrupted | Interrupted mid-save | Implement atomic checkpoint writes (save to temp, rename) |
| GPU utilization drops to 0% | Data pipeline bottleneck | Profile data loading; increase disk I/O |
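
The "save to temp, rename" fix for corrupted checkpoints can be sketched as follows. This is a minimal illustration, not part of the skill itself: the helper name `atomic_write_bytes` is hypothetical, and it assumes the checkpoint has already been serialized to bytes.

```python
import os
import tempfile


def atomic_write_bytes(path: str, data: bytes) -> None:
    """Write data to path so readers never observe a half-written file."""
    directory = os.path.dirname(os.path.abspath(path))
    # Create the temp file in the destination directory so the final
    # rename stays on one filesystem (the rename is atomic only then).
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(data)
            f.flush()
            os.fsync(f.fileno())  # push bytes to disk before renaming
        os.replace(tmp_path, path)  # swap in the new checkpoint in one step
    except BaseException:
        # Remove the partial temp file; the previous checkpoint is untouched.
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise
```

If the instance is interrupted before `os.replace` runs, the previous checkpoint at `path` is still intact and only a stray `.tmp` file may remain. A training loop would serialize its state to bytes first, for example with `torch.save` into an `io.BytesIO` buffer.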
For version upgrades, see vastai-upgrade-migration.
Pre-launch audit: Run the verification script, check all boxes, confirm Docker image pulls successfully, and verify at least 3 matching offers are available before starting a production training run.
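
One way to automate the "at least 3 matching offers" part of this audit is sketched below. It assumes the offers have already been loaded as a list of dicts (for instance from `vastai search offers ... --raw`) and that each dict uses the keys `reliability`, `inet_down`, and `dph_total`; the real raw JSON may name these fields differently. The function names and the sample values are hypothetical.

```python
def matching_offers(offers, min_reliability=0.98, min_inet_down=200.0, max_dph=2.00):
    """Return the offers that satisfy every checklist threshold."""
    return [
        o for o in offers
        if o.get("reliability", 0.0) >= min_reliability
        and o.get("inet_down", 0.0) >= min_inet_down
        # A missing price is treated as too expensive, not as free.
        and o.get("dph_total", float("inf")) <= max_dph
    ]


def audit_passes(offers, required=3, **limits):
    """True when at least `required` offers meet the thresholds."""
    return len(matching_offers(offers, **limits)) >= required


# Made-up sample data, for illustration only.
sample = [
    {"reliability": 0.995, "inet_down": 850.0, "dph_total": 0.45},
    {"reliability": 0.990, "inet_down": 400.0, "dph_total": 1.10},
    {"reliability": 0.970, "inet_down": 900.0, "dph_total": 0.30},  # fails reliability
    {"reliability": 0.985, "inet_down": 250.0, "dph_total": 1.95},
]
ready = audit_passes(sample)
```

Missing keys fall back to values that fail the check, so an offer with incomplete data is never counted as a match.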
Budget-safe launch: Set max_dph=2.00, auto-destroy timeout of 12 hours, and daily spend alert at $50 to prevent cost overruns.
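
The three limits in this example can be expressed as a small policy check. This is a sketch under stated assumptions: `BudgetPolicy`, `budget_actions`, and the action names are hypothetical, and the caller is assumed to supply the current price, runtime, and daily spend from its own monitoring. Actually destroying the instance or sending the alert is left to the caller.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BudgetPolicy:
    max_dph: float = 2.00            # price cap in dollars per hour
    max_runtime_hours: float = 12.0  # auto-destroy timeout
    daily_alert_usd: float = 50.0    # daily spend alert threshold


def budget_actions(dph, runtime_hours, spend_today_usd, policy=BudgetPolicy()):
    """Return the list of actions the policy calls for right now."""
    actions = []
    if dph > policy.max_dph:
        actions.append("reject_offer")
    if runtime_hours >= policy.max_runtime_hours:
        actions.append("destroy_instance")
    if spend_today_usd >= policy.daily_alert_usd:
        actions.append("send_alert")
    return actions


def worst_case_cost(policy=BudgetPolicy()):
    """Upper bound on one instance's cost before the timeout fires."""
    return policy.max_dph * policy.max_runtime_hours
```

With these defaults, a single instance at the $2.00/hour cap costs at most $24 before the 12-hour timeout fires, so the $50 daily alert acts as a second line of defense when several instances run at once.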