From togetherai-skills
Deploys custom Dockerized inference workers on Together AI GPUs using Sprocket SDK and Jig CLI. Submits async queue jobs and polls results for container-level control beyond standard endpoints.
`npx claudepluginhub togethercomputer/skills`
Use Dedicated Container Inference when the user needs a custom runtime, not just managed model hosting.
Prefer a different skill when one fits better:

- `together-dedicated-endpoints` for standard model hosting without custom containers
- `together-gpu-clusters` for full cluster ownership and orchestration control
- `together-chat-completions`, `together-images`, or `together-video` when a serverless product already covers the task

Core building blocks:

- `pyproject.toml` for image, runtime, autoscaling, and mounts
- The Together Python SDK (`together>=2.0.0`). If the user is on an older version, they must upgrade first: `uv pip install --upgrade "together>=2.0.0"`
- `pyproject.toml` as the source of truth for deployment behavior
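Since `pyproject.toml` is the source of truth for image, runtime, autoscaling, and mounts, a deployment section might look like the fragment below. The table and key names are assumptions for illustration only; the document does not show the actual schema.

```toml
# Hypothetical layout — table and key names are assumptions,
# not the documented schema.
[tool.sprocket]
image = "ghcr.io/example/inference-worker:latest"  # container image to deploy
runtime = "python3.11"                             # runtime inside the container

[tool.sprocket.autoscaling]
min_replicas = 0   # scale to zero when idle
max_replicas = 4   # cap GPU spend

[[tool.sprocket.mounts]]
source = "models/"       # local path to sync
target = "/srv/models"   # mount point in the container
```

Keeping this in `pyproject.toml` means the deployment config is versioned alongside the worker code itself.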