AI-assisted inference on NVIDIA DGX Spark - run, manage, and stop LLM workloads
npx claudepluginhub spark-arena/sparkrun
Launch, manage, and stop LLM inference workloads on one or more NVIDIA DGX Spark systems — no Slurm, no Kubernetes, no fuss.
uvx sparkrun setup
One command — installs sparkrun, then launches the guided setup wizard to create a cluster, configure SSH mesh, detect ConnectX-7 NICs, set up sudoers, and enable earlyoom.
# Run an inference workload
sparkrun run qwen3-1.7b-vllm
# Multi-node tensor parallelism (TP maps to node count on DGX Spark)
sparkrun run qwen3-1.7b-vllm --tp 2
# Re-attach to logs, stop a workload, check status
sparkrun logs qwen3-1.7b-vllm
sparkrun stop qwen3-1.7b-vllm
sparkrun status
Ctrl+C detaches from logs — it never kills your inference job. Your model keeps serving.
See the full CLI reference for all commands and options.
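The launch commands above can be wrapped in a small helper that builds the `sparkrun run` invocation from a recipe name and a node count. This is a minimal sketch, not part of sparkrun: `sparkrun_launch` and the `DRY_RUN` variable are hypothetical names introduced for illustration, and omitting `--tp` for a single node is an assumption based on the single-node example above. Only the `run` subcommand and the `--tp` flag shown in this document are used.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not part of sparkrun): builds a `sparkrun run` command
# from a recipe name and an optional node count.
# DRY_RUN is an illustrative variable; it defaults to 1, so nothing is launched
# unless you explicitly set DRY_RUN=0.

sparkrun_launch() {
  local recipe="$1"
  local tp="${2:-1}"                 # node count; defaults to a single node
  local -a cmd=(sparkrun run "$recipe")

  # Assumption: a single-node run needs no --tp flag, as in the basic example.
  if [ "$tp" -gt 1 ]; then
    cmd+=(--tp "$tp")
  fi

  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "${cmd[*]}"                 # print the command instead of running it
  else
    "${cmd[@]}"                      # actually launch the workload
  fi
}

# Dry run: prints "sparkrun run qwen3-1.7b-vllm --tp 2"
sparkrun_launch qwen3-1.7b-vllm 2
```

Setting `DRY_RUN=0` before calling the function runs the command for real. Building the command as an array keeps recipe names intact even if they contain characters the shell would otherwise split on.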
--tp 2 = 2 hosts, automatic InfiniBand/RDMA detection
sparkrun show <recipe>)
Spark Arena is the community hub for DGX Spark recipe benchmarks: browse benchmark results, then run them directly with sparkrun.
Official Recipes are maintained by the Spark Arena team and hosted on GitHub. They are tested and optimized for NVIDIA DGX Spark systems.
Community Recipes are contributed by the community and hosted on GitHub.
Apache License 2.0 — see LICENSE for details.