From togetherai-skills
Provides text-to-speech via REST, streaming, and realtime WebSocket, plus speech-to-text transcription, translation, diarization, timestamps, and live STT using Together AI APIs.
Install with `npx claudepluginhub togethercomputer/skills`. This skill uses the workspace's default tool permissions.
Use Together AI audio APIs for:
- Text-to-speech via REST, chunked streaming, or the realtime WebSocket
- Speech-to-text transcription and translation, with diarization and timestamps
- Live (streaming) speech-to-text
Do not use this skill for non-audio tasks; prefer:
- `together-chat-completions` for text-only generation
- `together-video` or `together-images` for visual generation workflows
- `together-dedicated-endpoints` only when the audio model itself must be hosted on dedicated infrastructure

Requirements: the Together Python SDK (`together>=2.0.0`). If the user is on an older version, they must upgrade first: `uv pip install --upgrade "together>=2.0.0"`.

API notes:
- Use `client.audio.speech.create()` for TTS. The non-streaming call returns a `BinaryAPIResponse`; call `response.write_to_file(path)` to save it. Do NOT use `stream_to_file` (it does not exist on this object).
- Streaming TTS (`stream=True`) returns a `Stream` of `AudioSpeechStreamChunk` objects. Iterate the chunks, check `chunk.type`, and decode `base64.b64decode(chunk.delta)` to get audio bytes. There is no file-writing helper on the stream object.
- Use `client.audio.transcriptions.create()` for transcription and `client.audio.translations.create()` for translation.
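A minimal sketch of the calls above, assuming the `together>=2.0.0` Python SDK. The model IDs (`cartesia/sonic`, `openai/whisper-large-v3`) and the `"audio"` chunk-type label are illustrative placeholders, not values confirmed by this skill; check the Together model catalog before use.

```python
# Sketch of the Together AI audio calls described above (assumes together>=2.0.0).
# Model IDs and the "audio" chunk-type label are illustrative assumptions.
import base64

try:
    from together import Together  # pip install "together>=2.0.0"
except ImportError:  # lets the pure helper below work without the SDK installed
    Together = None


def decode_chunk_audio(delta: str) -> bytes:
    """Streaming chunks carry base64-encoded audio in `chunk.delta`."""
    return base64.b64decode(delta)


def tts_to_file(client, text: str, path: str) -> None:
    # Non-streaming TTS: the BinaryAPIResponse exposes write_to_file(),
    # NOT stream_to_file().
    response = client.audio.speech.create(
        model="cartesia/sonic",  # assumed model id -- check the catalog
        input=text,
    )
    response.write_to_file(path)


def tts_stream_to_file(client, text: str, path: str) -> None:
    # stream=True yields AudioSpeechStreamChunk objects; there is no
    # file-writing helper, so decode and append each chunk manually.
    stream = client.audio.speech.create(
        model="cartesia/sonic",  # assumed model id
        input=text,
        stream=True,
    )
    with open(path, "wb") as f:
        for chunk in stream:
            if chunk.type == "audio":  # assumed chunk-type label
                f.write(decode_chunk_audio(chunk.delta))


def transcribe(client, audio_path: str) -> str:
    # Speech-to-text; client.audio.translations.create() is used the same
    # way for translation.
    with open(audio_path, "rb") as f:
        result = client.audio.transcriptions.create(
            model="openai/whisper-large-v3",  # assumed model id
            file=f,
        )
    return result.text
```

With `TOGETHER_API_KEY` set in the environment, usage would look like `tts_to_file(Together(), "Hello.", "hello.mp3")`.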