From Audio & Voice
Generates text-to-speech audio, clones voices from samples, and creates sound effects using the ElevenLabs API. Useful for voice-enabling content.
Slash command: /audio-voice:elevenlabs
Expert skill for text-to-speech, voice cloning, sound effects, and audio AI using ElevenLabs - the most advanced voice AI platform.
# API keys: ~/.claude/.credentials.master.env
# Variable: ELEVENLABS_API_KEY
import os
ELEVENLABS_API_KEY = os.getenv('ELEVENLABS_API_KEY')
Use a prebuilt ElevenLabs voice, or set ELEVENLABS_VOICE_ID to your own (cloned) voice ID. List available voices via the /v2/voices endpoint.
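The voice selection described above could be sketched as follows: read ELEVENLABS_VOICE_ID from the environment and fall back to a prebuilt voice when it is unset. This is a minimal sketch, not part of the skill itself. `resolve_voice_id` and `DEFAULT_VOICE_ID` are hypothetical names, and the fallback value is simply the voice ID this document uses as its text-to-speech default.

```python
import os

# Fallback voice: the prebuilt voice ID used as the default elsewhere in this
# document. Treat it as an assumption and swap in any voice you have access to.
DEFAULT_VOICE_ID = "JBFqnCBsd6RMkjVDRZzb"


def resolve_voice_id(env=None) -> str:
    """Return ELEVENLABS_VOICE_ID if it is set and non-empty, else the fallback.

    `env` is any mapping of environment variables; it defaults to os.environ,
    which makes the helper easy to exercise without touching the real environment.
    """
    env = os.environ if env is None else env
    voice_id = (env.get("ELEVENLABS_VOICE_ID") or "").strip()
    return voice_id or DEFAULT_VOICE_ID
```

The result can then be passed as `voice_id` to any of the calls in this document.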
Best for:
Advantages:
pip install elevenlabs
| Model | ID | Latency | Best For |
|---|---|---|---|
| Eleven v3 | eleven_v3 | Higher | Highest quality, emotions |
| Multilingual v2 | eleven_multilingual_v2 | Medium | 29 languages, expressive |
| Flash v2.5 | eleven_flash_v2_5 | ~75ms | Real-time apps |
| Turbo v2.5 | eleven_turbo_v2_5 | ~250ms | Balance quality/speed |
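To make the trade-off in the table above concrete, here is a small sketch that picks a model ID from a latency budget. `pick_model` is a hypothetical helper, not part of the ElevenLabs SDK. The latency numbers are the approximate figures from the table, and the rule "prefer the slowest model that still fits the budget" rests on the assumption that higher latency buys higher quality.

```python
# Approximate latencies copied from the table above. "Higher" and "Medium"
# have no number there, so those models are recorded as None.
MODELS = [
    {"id": "eleven_flash_v2_5", "latency_ms": 75},
    {"id": "eleven_turbo_v2_5", "latency_ms": 250},
    {"id": "eleven_multilingual_v2", "latency_ms": None},
    {"id": "eleven_v3", "latency_ms": None},
]


def pick_model(max_latency_ms=None) -> str:
    """Pick a model ID for a latency budget (hypothetical helper).

    With no budget, return the model the table lists as highest quality.
    With a budget, consider only models that have a numeric latency.
    """
    if max_latency_ms is None:
        return "eleven_v3"
    candidates = [
        m for m in MODELS
        if m["latency_ms"] is not None and m["latency_ms"] <= max_latency_ms
    ]
    if not candidates:
        raise ValueError(f"no model fits a {max_latency_ms} ms budget")
    # Assumption: among models that fit, the slower one gives better quality.
    return max(candidates, key=lambda m: m["latency_ms"])["id"]
```

For example, `pick_model(100)` selects Flash v2.5, while `pick_model(300)` selects Turbo v2.5.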
from elevenlabs import ElevenLabs
import os
client = ElevenLabs(api_key=os.getenv('ELEVENLABS_API_KEY'))
def text_to_speech(text: str, voice_id: str = "JBFqnCBsd6RMkjVDRZzb",
                   model: str = "eleven_multilingual_v2"):
    """Generate speech from text and save it to output.mp3."""
    audio = client.text_to_speech.convert(
        text=text,
        voice_id=voice_id,
        model_id=model,
        output_format="mp3_44100_128",
        voice_settings={
            "stability": 0.5,
            "similarity_boost": 0.75,
            "style": 0,
            "use_speaker_boost": True
        }
    )
    # The audio arrives as an iterator of byte chunks; write them to a file
    with open("output.mp3", "wb") as f:
        for chunk in audio:
            f.write(chunk)
    return "output.mp3"
def stream_speech(text: str, voice_id: str):
    """Stream speech for real-time playback."""
    audio_stream = client.text_to_speech.stream(
        text=text,
        voice_id=voice_id,
        model_id="eleven_flash_v2_5",  # Best for streaming
        output_format="mp3_44100_128"
    )
    # Play directly
    from elevenlabs.play import play
    play(audio_stream)
from io import BytesIO
def clone_voice(name: str, audio_files: list):
    """
    Clone voice from audio samples.
    Requirements:
    - Minimum 1 minute of clean audio
    - Best results with 2-3 minutes
    """
    files = []
    for path in audio_files:
        # Read each sample fully so its file handle is closed right away
        with open(path, "rb") as f:
            files.append(BytesIO(f.read()))
    voice = client.voices.ivc.create(
        name=name,
        files=files,
        remove_background_noise=True
    )
    return voice.voice_id
def generate_sound_effect(description: str, duration: int = 10):
    """
    Generate sound effect from text description.
    Max duration: 22 seconds
    """
    audio = client.text_to_sound_effects.convert(
        text=description,
        duration_seconds=duration,
        prompt_influence=0.5
    )
    with open("effect.mp3", "wb") as f:
        for chunk in audio:
            f.write(chunk)
    return "effect.mp3"
# Examples:
# "Cinematic braam for movie trailer"
# "Forest ambience with birds chirping"
# "Spaceship engine humming"
def change_voice(audio_path: str, target_voice_id: str):
    """Convert voice in audio to target voice."""
    with open(audio_path, "rb") as audio_file:
        result = client.speech_to_speech.convert(
            voice_id=target_voice_id,
            audio=audio_file,
            model_id="eleven_multilingual_sts_v2",
            remove_background_noise=True
        )
        # Write the result while the input file is still open: the SDK may
        # return a lazy iterator that reads the upload only when consumed
        with open("voice_changed.mp3", "wb") as f:
            for chunk in result:
                f.write(chunk)
    return "voice_changed.mp3"
def get_voices():
    """Get all available voices."""
    voices = client.voices.search(page_size=100)
    for voice in voices.voices:
        print(f"{voice.name}: {voice.voice_id}")
    return voices.voices
| Parameter | Range | Description |
|---|---|---|
| stability | 0-1 | Voice consistency (0.5 recommended) |
| similarity_boost | 0-1 | Voice matching (0.75 recommended) |
| style | 0-1 | Style exaggeration (0 recommended) |
| use_speaker_boost | bool | Enhance voice similarity |
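The ranges in the table above can be enforced before a request is sent. The sketch below builds a `voice_settings` dictionary with the recommended defaults and rejects out-of-range values. `make_voice_settings` is a hypothetical helper, not an SDK function, and raising an error instead of clamping is a design choice made for this example.

```python
def make_voice_settings(stability: float = 0.5,
                        similarity_boost: float = 0.75,
                        style: float = 0.0,
                        use_speaker_boost: bool = True) -> dict:
    """Build a voice_settings dict, validating the 0-1 ranges from the table."""
    numeric = {
        "stability": stability,
        "similarity_boost": similarity_boost,
        "style": style,
    }
    for name, value in numeric.items():
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be between 0 and 1, got {value}")
    # Keys match the ones used in this document's text-to-speech example
    return {**numeric, "use_speaker_boost": bool(use_speaker_boost)}
```

The returned dictionary has the same keys as the `voice_settings` argument shown in the text-to-speech example, so it can be passed in its place.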
- mp3_22050_32 - Low quality
- mp3_44100_128 - Standard (default)
- mp3_44100_192 - High quality (Creator+)
- pcm_16000, pcm_44100 - Raw PCM
- opus_48000_64 - Opus codec

| Plan | Cost | Credits |
|---|---|---|
| Free | $0/mo | 10,000 chars |
| Starter | $5/mo | 30,000 chars |
| Creator | $22/mo | 100,000 chars |
| Pro | $99/mo | 500,000 chars |
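As a rough illustration of reading the pricing table, the sketch below returns the cheapest listed plan whose monthly allowance covers an expected character count. `cheapest_plan` is a hypothetical helper. The figures are copied from the table and may be out of date, and the sketch ignores overage billing and any plans not listed here.

```python
# (plan name, monthly cost in USD, included characters), from the table above,
# ordered from cheapest to most expensive.
PLANS = [
    ("Free", 0, 10_000),
    ("Starter", 5, 30_000),
    ("Creator", 22, 100_000),
    ("Pro", 99, 500_000),
]


def cheapest_plan(chars_per_month: int):
    """Return the name of the cheapest plan covering the usage, or None."""
    for name, _cost, included_chars in PLANS:
        if chars_per_month <= included_chars:
            return name
    # Usage exceeds every plan listed in the table
    return None
```

For instance, 30,000 characters per month fits Starter, while 30,001 requires Creator.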
| Task | Code |
|---|---|
| TTS | client.text_to_speech.convert(text, voice_id) |
| Stream | client.text_to_speech.stream(text, voice_id) |
| Clone voice | client.voices.ivc.create(name, files) |
| Sound effects | client.text_to_sound_effects.convert(text) |
| Voice change | client.speech_to_speech.convert(voice_id, audio) |
stability=0.5 - optimal balance of emotion
npx claudepluginhub jhamidun/claude-code-config-pack --plugin audio-voice