From togetherai-skills
Provides text-to-speech via REST, streaming, and realtime WebSocket, plus speech-to-text transcription, translation, diarization, timestamps, and live STT using Together AI APIs.
Install with `npx claudepluginhub togethercomputer/skills`. This skill uses the workspace's default tool permissions.
Use Together AI audio APIs for:
- Text-to-speech via REST, chunked streaming, or the realtime WebSocket
- Speech-to-text transcription and translation, with diarization and timestamps
- Live (streaming) speech-to-text
Do not use this skill for non-audio tasks; prefer:
- `together-chat-completions` for text-only generation
- `together-video` or `together-images` for visual generation workflows
- `together-dedicated-endpoints` only when the audio model itself must be hosted on dedicated infrastructure

Requirements: the Together Python SDK (`together>=2.0.0`). If the user is on an older version, they must upgrade first: `uv pip install --upgrade "together>=2.0.0"`.

API notes:
- Use `client.audio.speech.create()` for TTS. The non-streaming call returns a `BinaryAPIResponse`; call `response.write_to_file(path)` to save it. Do NOT use `stream_to_file` (it does not exist on this object).
- Streaming TTS (`stream=True`) returns a `Stream` of `AudioSpeechStreamChunk` objects. Iterate the chunks, check `chunk.type`, and decode `base64.b64decode(chunk.delta)` to get audio bytes. There is no file-writing helper on the stream object.
- Use `client.audio.transcriptions.create()` for transcription and `client.audio.translations.create()` for translation.
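A minimal sketch of the calls above, assuming the `together>=2.0.0` Python SDK. The model IDs (`cartesia/sonic`, `openai/whisper-large-v3`) and the `"audio"` chunk-type label are illustrative placeholders, not values confirmed by this skill; check the Together model catalog before use.

```python
# Sketch of the Together AI audio calls described above (assumes together>=2.0.0).
# Model IDs and the "audio" chunk-type label are illustrative assumptions.
import base64

try:
    from together import Together  # pip install "together>=2.0.0"
except ImportError:  # lets the pure helper below work without the SDK installed
    Together = None


def decode_chunk_audio(delta: str) -> bytes:
    """Streaming chunks carry base64-encoded audio in `chunk.delta`."""
    return base64.b64decode(delta)


def tts_to_file(client, text: str, path: str) -> None:
    # Non-streaming TTS: the BinaryAPIResponse exposes write_to_file(),
    # NOT stream_to_file().
    response = client.audio.speech.create(
        model="cartesia/sonic",  # assumed model id -- check the catalog
        input=text,
    )
    response.write_to_file(path)


def tts_stream_to_file(client, text: str, path: str) -> None:
    # stream=True yields AudioSpeechStreamChunk objects; there is no
    # file-writing helper, so decode and append each chunk manually.
    stream = client.audio.speech.create(
        model="cartesia/sonic",  # assumed model id
        input=text,
        stream=True,
    )
    with open(path, "wb") as f:
        for chunk in stream:
            if chunk.type == "audio":  # assumed chunk-type label
                f.write(decode_chunk_audio(chunk.delta))


def transcribe(client, audio_path: str) -> str:
    # Speech-to-text; client.audio.translations.create() is used the same
    # way for translation.
    with open(audio_path, "rb") as f:
        result = client.audio.transcriptions.create(
            model="openai/whisper-large-v3",  # assumed model id
            file=f,
        )
    return result.text
```

With `TOGETHER_API_KEY` set in the environment, usage would look like `tts_to_file(Together(), "Hello.", "hello.mp3")`.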