Plugin

togetherai-skills

Name: togetherai-skills
Author: togethercomputer

Build AI agents using Together AI skills to run streaming inference with tool calling, fine-tune models, generate embeddings/images/videos/audio, transcribe speech, evaluate LLMs, execute remote Python, deploy GPU endpoints/clusters, and manage batch jobs/infrastructure.

npx claudepluginhub togethercomputer/skills

Component Overview

Skills

Component Details

Skills (12)

together-audio

/together-audio

Text-to-speech and speech-to-text via Together AI, including REST, streaming, and realtime WebSocket TTS, plus transcription, translation, diarization, timestamps, and live STT. Reach for it whenever the user needs audio in or audio out on Together AI rather than chat generation, image or video creation, or model training.

together-batch-inference

/together-batch-inference

High-volume, asynchronous offline inference at up to 50% lower cost via Together AI's Batch API. Prepare JSONL inputs, upload files, create jobs, poll status, and download outputs. Reach for it whenever the user needs non-interactive bulk inference rather than real-time chat or evaluation jobs.

together-chat-completions

/together-chat-completions

Real-time and streaming text generation via Together AI's OpenAI-compatible chat/completions API, including multi-turn conversations, tool and function calling, structured JSON outputs, and reasoning models. Reach for it whenever the user wants to build or debug text generation on Together AI, unless they specifically need batch jobs, embeddings, fine-tuning, dedicated endpoints, dedicated containers, or GPU clusters.

together-dedicated-containers

/together-dedicated-containers

Custom Dockerized inference workers on Together AI's managed GPU infrastructure. Build with Sprocket SDK, configure with Jig CLI, submit async queue jobs, and poll results. Reach for it whenever the user needs container-level control rather than a standard model endpoint or raw cluster.

together-dedicated-endpoints

/together-dedicated-endpoints

Single-tenant GPU endpoints on Together AI with autoscaling and no rate limits. Deploy fine-tuned or uploaded models, size hardware, and manage endpoint lifecycle. Reach for it whenever the user needs predictable always-on hosting rather than serverless inference, custom containers, or raw clusters.

together-embeddings

/together-embeddings

Dense vector embeddings, semantic search, RAG pipelines, and reranking via Together AI. Generate embeddings with open-source models and rerank results behind dedicated endpoints. Reach for it whenever the user needs vector representations or retrieval quality improvements rather than direct text generation.

together-evaluations

/together-evaluations

LLM-as-a-judge evaluation framework on Together AI. Classify, score, and compare model outputs, select judge models, use external-provider judges or targets, poll results and download reports. Reach for it whenever the user wants to benchmark outputs, grade responses, compare A/B variants, or operationalize automated evaluations.

together-fine-tuning

/together-fine-tuning

LoRA, full fine-tuning, DPO preference tuning, VLM training, function-calling tuning, reasoning tuning, and BYOM uploads on Together AI. Reach for it whenever the user wants to adapt a model on custom data rather than only run inference, evaluate outputs, or host an existing model.

together-gpu-clusters

/together-gpu-clusters

On-demand and reserved GPU clusters (H100, H200, B200) on Together AI with Kubernetes or Slurm orchestration, shared storage, credential management, and cluster scaling for ML and HPC jobs. Reach for it when the user needs multi-node compute or infrastructure control rather than a managed model endpoint.

together-images

/together-images

Text-to-image generation and image editing via Together AI, including FLUX and Kontext models, LoRA-based styling, reference-image guidance, and local image downloads. Reach for it whenever the user wants to generate or edit images on Together AI rather than create videos or build text-only chat applications.

together-sandboxes

/together-sandboxes

Remote Python execution in managed sandboxes on Together AI with stateful sessions, file uploads, data analysis, chart generation, and notebook-like runs via the Sandboxes API. Reach for it whenever the user wants managed remote Python execution instead of local execution, raw clusters, or full model hosting.

together-video

/together-video

Text-to-video and image-to-video generation via Together AI, including keyframe control, model and dimension selection, asynchronous job polling, and video downloads. Reach for it whenever the user wants motion generation on Together AI rather than still-image generation or text-only inference.

README

Together AI Skills for Coding Agents

A collection of 12 agent skills that provide comprehensive knowledge of the Together AI platform — inference, training, embeddings, audio, video, images, function calling, and infrastructure.

Each skill teaches AI coding agents how to use a specific Together AI product, including API patterns, SDK usage (Python and TypeScript), CLI commands, direct API usage, model selection, and best practices. Skills include runnable Python scripts (using the Together Python v2 SDK), TypeScript examples, and CLI/API workflow guidance.

Compatible with Claude Code, Cursor, Codex, and Gemini CLI.

What Are Skills?

Skills are markdown instruction files that give AI coding agents domain-specific knowledge. When an agent detects that a skill is relevant to your task, it loads the skill's instructions and uses them to write better code.

Each skill contains:

SKILL.md — Lean routing guidance for the agent: when to use the skill, when to hand off, and where to look next
references/ — Detailed reference docs (model lists, API parameters, CLI commands)
scripts/ — Runnable Python scripts demonstrating complete workflows
agents/openai.yaml — Optional UI metadata for OpenAI/Codex surfaces

Skills Overview

Skill	Description	Scripts
together-chat-completions	Real-time and streaming text generation via Together AI's OpenAI-compatible chat/completions API, including multi-tur...	`async_parallel.py`, `chat_basic.py`, `debug_headers.py`, `reasoning_models.py`, `structured_outputs.py`, `tool_call_loop.py`
together-images	Text-to-image generation and image editing via Together AI, including FLUX and Kontext models, LoRA-based styling, re...	`generate_image.py`, `kontext_editing.py`, `lora_generation.py`
together-video	Text-to-video and image-to-video generation via Together AI, including keyframe control, model and dimension selectio...	`generate_video.py`, `image_to_video.py`
together-audio	Text-to-speech and speech-to-text via Together AI, including REST, streaming, and realtime WebSocket TTS, plus transc...	`stt_realtime.py`, `stt_transcribe.py`, `tts_generate.py`, `tts_websocket.py`
together-embeddings	Dense vector embeddings, semantic search, RAG pipelines, and reranking via Together AI.	`embed_and_rerank.py`, `rag_pipeline.py`, `semantic_search.py`
together-fine-tuning	LoRA, full fine-tuning, DPO preference tuning, VLM training, function-calling tuning, reasoning tuning, and BYOM uplo...	`dpo_workflow.py`, `finetune_workflow.py`, `function_calling_finetune.py`, `reasoning_finetune.py`, `vlm_finetune.py`
together-batch-inference	High-volume, asynchronous offline inference at up to 50% lower cost via Together AI's Batch API.	`batch_workflow.py`
together-evaluations	LLM-as-a-judge evaluation framework on Together AI.	`run_evaluation.py`
together-sandboxes	Remote Python execution in managed sandboxes on Together AI with stateful sessions, file uploads, data analysis, char...	`execute_with_session.py`
together-dedicated-endpoints	Single-tenant GPU endpoints on Together AI with autoscaling and no rate limits.	`deploy_finetuned.py`, `manage_endpoint.py`, `upload_custom_model.py`
together-dedicated-containers	Custom Dockerized inference workers on Together AI's managed GPU infrastructure.	`queue_client.py`, `sprocket_hello_world.py`
together-gpu-clusters	On-demand and reserved GPU clusters (H100, H200, B200) on Together AI with Kubernetes or Slurm orchestration, shared ...	`manage_cluster.py`, `manage_storage.py`

Installation

Quick Install (Any Agent)

Install all skills at once using skills.sh:

npx skills add togethercomputer/skills

This works with Claude Code, Cursor, Codex, and other agents that support the Agent Skills specification.

Claude Code

cp -r skills/together-* your-project/.claude/skills/
# Global availability
cp -r skills/together-* ~/.claude/skills/

Marketplace plugin coming soon.

Cursor

cp -r skills/together-* your-project/.cursor/skills/

Cursor plugin marketplace listing coming soon.

Codex

cp -r skills/together-* your-project/.agents/skills/

Gemini CLI

gemini extensions install https://github.com/togethercomputer/skills.git --consent

Verify installation

# Claude Code
ls your-project/.claude/skills/together-*/SKILL.md
# Codex
ls your-project/.agents/skills/together-*/SKILL.md

You should see one SKILL.md per installed skill.

Usage

Once installed, skills activate automatically when the agent detects a relevant task. No explicit invocation is needed.

View full README on GitHub

Similar Plugins

together-pack

1.9k

Claude Code skill pack for Together AI (18 skills)

1mo

v1.0.0

Stats

Version1.0.0

Stars22

Forks4

MaintenanceExcellent

LicenseMIT

AddedApr 1, 2026

Actions

View on GitHub View README Plugin Marketplace JSON

Help us improve

Share bugs, ideas, or general feedback.

Back to Plugins

Together AI Skills for Coding Agents

A collection of 12 agent skills that provide comprehensive knowledge of the Together AI platform — inference, training, embeddings, audio, video, images, function calling, and infrastructure.

Compatible with Claude Code, Cursor, Codex, and Gemini CLI.

What Are Skills?

Each skill contains:

SKILL.md — Lean routing guidance for the agent: when to use the skill, when to hand off, and where to look next
references/ — Detailed reference docs (model lists, API parameters, CLI commands)
scripts/ — Runnable Python scripts demonstrating complete workflows
agents/openai.yaml — Optional UI metadata for OpenAI/Codex surfaces

Skills Overview

Skill	Description	Scripts
together-chat-completions	Real-time and streaming text generation via Together AI's OpenAI-compatible chat/completions API, including multi-tur...	`async_parallel.py`, `chat_basic.py`, `debug_headers.py`, `reasoning_models.py`, `structured_outputs.py`, `tool_call_loop.py`
together-images	Text-to-image generation and image editing via Together AI, including FLUX and Kontext models, LoRA-based styling, re...	`generate_image.py`, `kontext_editing.py`, `lora_generation.py`
together-video	Text-to-video and image-to-video generation via Together AI, including keyframe control, model and dimension selectio...	`generate_video.py`, `image_to_video.py`
together-audio	Text-to-speech and speech-to-text via Together AI, including REST, streaming, and realtime WebSocket TTS, plus transc...	`stt_realtime.py`, `stt_transcribe.py`, `tts_generate.py`, `tts_websocket.py`
together-embeddings	Dense vector embeddings, semantic search, RAG pipelines, and reranking via Together AI.	`embed_and_rerank.py`, `rag_pipeline.py`, `semantic_search.py`
together-fine-tuning	LoRA, full fine-tuning, DPO preference tuning, VLM training, function-calling tuning, reasoning tuning, and BYOM uplo...	`dpo_workflow.py`, `finetune_workflow.py`, `function_calling_finetune.py`, `reasoning_finetune.py`, `vlm_finetune.py`
together-batch-inference	High-volume, asynchronous offline inference at up to 50% lower cost via Together AI's Batch API.	`batch_workflow.py`
together-evaluations	LLM-as-a-judge evaluation framework on Together AI.	`run_evaluation.py`
together-sandboxes	Remote Python execution in managed sandboxes on Together AI with stateful sessions, file uploads, data analysis, char...	`execute_with_session.py`
together-dedicated-endpoints	Single-tenant GPU endpoints on Together AI with autoscaling and no rate limits.	`deploy_finetuned.py`, `manage_endpoint.py`, `upload_custom_model.py`
together-dedicated-containers	Custom Dockerized inference workers on Together AI's managed GPU infrastructure.	`queue_client.py`, `sprocket_hello_world.py`
together-gpu-clusters	On-demand and reserved GPU clusters (H100, H200, B200) on Together AI with Kubernetes or Slurm orchestration, shared ...	`manage_cluster.py`, `manage_storage.py`

Installation

Quick Install (Any Agent)

Install all skills at once using skills.sh:

npx skills add togethercomputer/skills

This works with Claude Code, Cursor, Codex, and other agents that support the Agent Skills specification.

Claude Code

cp -r skills/together-* your-project/.claude/skills/
# Global availability
cp -r skills/together-* ~/.claude/skills/

Marketplace plugin coming soon.

Cursor

cp -r skills/together-* your-project/.cursor/skills/

Cursor plugin marketplace listing coming soon.

Codex

cp -r skills/together-* your-project/.agents/skills/

Gemini CLI

gemini extensions install https://github.com/togethercomputer/skills.git --consent

Verify installation

# Claude Code
ls your-project/.claude/skills/together-*/SKILL.md
# Codex
ls your-project/.agents/skills/together-*/SKILL.md

You should see one SKILL.md per installed skill.

Usage

Once installed, skills activate automatically when the agent detects a relevant task. No explicit invocation is needed.

togetherai-skills

Component Overview

Component Details

Skills (12)

README

Together AI Skills for Coding Agents

What Are Skills?

Skills Overview

Installation

Quick Install (Any Agent)

Claude Code

Cursor

Codex

Gemini CLI

Verify installation

Usage

Similar Plugins

together-pack

Help us improve

Help us improve

togetherai-skills

Component Overview

Component Details

Skills (12)

README

Together AI Skills for Coding Agents

What Are Skills?

Skills Overview

Installation

Quick Install (Any Agent)

Claude Code

Cursor

Codex

Gemini CLI

Verify installation

Usage

Similar Plugins

together-pack

Help us improve

replicate

itsmostafa-llm-engineering-skills

caveman

ui-design

frontend-design