Claude-to-Speech
Voice-first interaction mode for Claude Code with automatic text-to-speech via ElevenLabs.
Overview
Claude-to-Speech is a plugin that enables automatic voice output for Claude Code responses. Instead of manually triggering TTS, Claude includes invisible markers in responses that are automatically extracted and spoken by a Stop hook.
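The extraction step performed by the Stop hook can be sketched roughly as follows. This section does not specify the marker syntax, so the sketch assumes a hypothetical HTML-comment form, `<!-- TTS: "..." -->`; the function name `extract_tts_text` is also illustrative and is not taken from the plugin's code.

```python
import re
from typing import Optional

# ASSUMPTION: markers look like <!-- TTS: "text to speak" -->.
# The plugin's real marker syntax may differ.
TTS_MARKER = re.compile(r'<!--\s*TTS:\s*"(.*?)"\s*-->', re.DOTALL)


def extract_tts_text(response: str) -> Optional[str]:
    """Return the text of the last TTS marker in a response, or None."""
    matches = TTS_MARKER.findall(response)
    if not matches:
        return None
    # Use the last marker so a response quoting an earlier one stays quiet.
    text = matches[-1].strip()
    return text or None
```

A response without a marker yields `None`, which is how a hook built this way would stay silent for code dumps.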
Features
- Automatic TTS: Claude's responses are spoken automatically via TTS markers
- Smart Defaults: Silent for code dumps, vocal for questions and confirmations
- Multiple Voice Options: Choose from ElevenLabs voices or use custom voice IDs
- Dual Mode Support:
  - Direct ElevenLabs API (no server required)
  - Local TTS server integration
- Deduplication: Prevents repeated messages within a 2-second window
- Cross-Platform: Works on macOS, Linux (including Raspberry Pi), and Windows
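The deduplication feature listed above could work along these lines. This is a minimal in-memory sketch, not the plugin's implementation; the class name `Deduplicator` and the injectable `clock` parameter are assumptions made for illustration. A hook that runs as a separate process per response would need to persist this state, for example in a small file.

```python
import time
from typing import Callable, Dict


class Deduplicator:
    """Suppress a message that repeats within `window` seconds."""

    def __init__(self, window: float = 2.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.window = window
        self.clock = clock  # injectable so the logic can be tested
        self._last_spoken: Dict[str, float] = {}

    def should_speak(self, message: str) -> bool:
        now = self.clock()
        last = self._last_spoken.get(message)
        if last is not None and now - last < self.window:
            return False  # same message spoken too recently
        self._last_spoken[message] = now
        return True
```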
Installation
Prerequisites
- Claude Code 2.0+
- ElevenLabs API key (available from your ElevenLabs account)
- Python 3.7+
- requests library: pip install requests
- (Optional) python-dotenv: pip install python-dotenv
Via Claude Code Plugin System
- Clone or download this repository
- Add to your Claude Code plugins directory:
mkdir -p ~/.claude/plugins/repos
cd ~/.claude/plugins/repos
git clone https://github.com/yourusername/claude-to-speech.git
- Install the plugin:
claude plugin install ./claude-to-speech
- Configure your .env file (see Configuration below)
- Restart Claude Code
Manual Installation
- Copy the plugin directory to your Claude Code plugins location
- Create a .env file based on .env.example
- Add your ElevenLabs API key
- Run /plugin in Claude Code to refresh
- Restart Claude Code
Configuration
Create a .env file in the plugin root directory:
# REQUIRED: ElevenLabs API Key
ELEVENLABS_API_KEY=your_api_key_here
# Voice ID (optional - defaults to Claude voice)
# Available names: laura, claude, rachel, domi, bella, antoni, arnold, adam, josh
# Or use a raw ElevenLabs voice ID
CLAUDE_VOICE_ID=claude
# ElevenLabs Model (optional - defaults to eleven_flash_v2_5)
# Options: eleven_flash_v2_5 (fastest), eleven_turbo_v2, eleven_multilingual_v2
ELEVENLABS_MODEL=eleven_flash_v2_5
# TTS Server URL (optional - leave empty for direct API mode)
# If you have a local TTS server, specify it here
TTS_SERVER_URL=
# Debug mode (optional - set to 1 to enable debug logging)
DEBUG=0
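A script could read these settings roughly as shown below. This is a sketch under assumptions: the function name `load_config` and the `DEFAULTS` table are illustrative, and the default values are copied from the comments in the .env example above. If python-dotenv is installed, calling its `load_dotenv()` first would copy the .env values into the environment.

```python
import os
from typing import Dict, Mapping

# Defaults as documented in the .env example; names are the same variables.
DEFAULTS = {
    "CLAUDE_VOICE_ID": "claude",
    "ELEVENLABS_MODEL": "eleven_flash_v2_5",
    "TTS_SERVER_URL": "",  # empty means direct API mode
    "DEBUG": "0",
}


def load_config(env: Mapping[str, str] = os.environ) -> Dict[str, str]:
    """Read plugin settings from an environment-like mapping."""
    api_key = env.get("ELEVENLABS_API_KEY", "").strip()
    if not api_key:
        raise ValueError("ELEVENLABS_API_KEY is required")
    config = {"ELEVENLABS_API_KEY": api_key}
    for name, default in DEFAULTS.items():
        # Treat an empty or missing value as "use the default".
        config[name] = env.get(name, "").strip() or default
    return config
```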
Voice Options
The plugin includes these pre-configured voices:
- claude / assistant - British male voice (default)
- laura - American female voice
- rachel - Calm female
- domi - Confident female
- bella - Soft female
- antoni - Well-rounded male
- arnold - Strong male
- adam - Deep male
- josh - Young male
You can also use any ElevenLabs voice ID directly.
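Accepting either a friendly name or a raw voice ID could be handled with a simple lookup that falls through to the input. The sketch below is illustrative only: the IDs in `VOICE_ALIASES` are placeholders, not real ElevenLabs voice IDs, and `resolve_voice` is a hypothetical helper name.

```python
from typing import Dict

# ASSUMPTION: placeholder values. Substitute the real ElevenLabs voice IDs.
VOICE_ALIASES: Dict[str, str] = {
    "claude": "<voice-id-for-claude>",
    "assistant": "<voice-id-for-claude>",  # alias of the default voice
    "laura": "<voice-id-for-laura>",
    "rachel": "<voice-id-for-rachel>",
}


def resolve_voice(name_or_id: str, default: str = "claude") -> str:
    """Map a known voice name to its ID; pass unknown values through as raw IDs."""
    key = name_or_id.strip() or default
    return VOICE_ALIASES.get(key.lower(), key)
```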
TTS Server Mode vs Direct API Mode
The plugin supports two operational modes:
Direct API Mode (Default)
When to use: Simple setup, single-user, occasional TTS use
- Calls ElevenLabs API directly from the plugin
- No additional server setup required
- Each TTS request goes through the internet to ElevenLabs
- Best for: Getting started, testing, low-volume usage
Configuration:
TTS_SERVER_URL= # Leave empty
Local TTS Server Mode (Recommended for Power Users)
When to use: Multi-device setup, high-volume usage, local network integration
- Runs a persistent TTS server on your local network
- Multiple devices can share the same server (desktop, mobile, Raspberry Pi)
- Audio caching reduces API calls and speeds up repeated phrases
- Centralized voice configuration across all clients
- Lower latency for local playback
- Enables offline caching for frequently used phrases
- Best for: LAURA-style multi-device AI systems, development, production use
Configuration:
TTS_SERVER_URL=http://localhost:5001/tts # Or your server IP
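The choice between the two modes can be pictured as a single branch on TTS_SERVER_URL. The sketch below only builds the request description and sends nothing. The direct-mode URL, `xi-api-key` header, and `model_id` field follow the public ElevenLabs text-to-speech API; the JSON body sent to the local server (`text` and `voice`) is an assumption, as is the helper name `build_tts_request`.

```python
from typing import Any, Dict

ELEVENLABS_TTS_URL = "https://api.elevenlabs.io/v1/text-to-speech/{voice_id}"


def build_tts_request(text: str, api_key: str, voice_id: str,
                      model: str, server_url: str = "") -> Dict[str, Any]:
    """Describe the HTTP POST for the configured mode without sending it."""
    server_url = server_url.strip()
    if server_url:
        # Server mode. ASSUMPTION: the server accepts {"text", "voice"} as JSON.
        return {
            "mode": "server",
            "url": server_url,
            "headers": {"Content-Type": "application/json"},
            "json": {"text": text, "voice": voice_id},
        }
    # Direct mode: voice_id must be a real ElevenLabs voice ID here.
    return {
        "mode": "direct",
        "url": ELEVENLABS_TTS_URL.format(voice_id=voice_id),
        "headers": {"xi-api-key": api_key, "Content-Type": "application/json"},
        "json": {"text": text, "model_id": model},
    }
```

The result could then be sent with `requests.post(req["url"], headers=req["headers"], json=req["json"], timeout=10)`.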
Setting up a TTS server:
The plugin includes scripts/tts_server.py - a Flask-based TTS server:
# Install dependencies
pip install flask requests
# Run the server
cd scripts
python3 tts_server.py
The server binds to 0.0.0.0 on port 5001 by default, so it is reachable from other devices on your network. Point multiple Claude Code instances, mobile apps, or other devices at this server for centralized TTS.
Benefits for LAURA-style systems:
- Consistency: Same voice across desktop, mobile, and embedded devices
- Efficiency: Cached audio for common responses ("I don't understand", "Working on it", etc.)
- Scalability: One API key serves multiple devices
- Control: Centralized voice/model switching without reconfiguring clients
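The audio caching that makes repeated phrases cheap could be implemented as a content-addressed file cache. This is a sketch and not the code in scripts/tts_server.py: the helper names, the `.mp3` suffix, and the `synthesize` callback (standing in for the actual ElevenLabs call) are all assumptions.

```python
import hashlib
from pathlib import Path
from typing import Callable


def cache_key(text: str, voice: str, model: str) -> str:
    """Hash everything that affects the audio so different voices never collide."""
    payload = "\x1f".join((voice, model, text)).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


def get_audio(text: str, voice: str, model: str, cache_dir: Path,
              synthesize: Callable[[str, str, str], bytes]) -> bytes:
    """Return cached audio if present; otherwise synthesize and store it."""
    path = cache_dir / (cache_key(text, voice, model) + ".mp3")
    if path.exists():
        return path.read_bytes()  # cache hit: no API call
    audio = synthesize(text, voice, model)
    cache_dir.mkdir(parents=True, exist_ok=True)
    path.write_bytes(audio)
    return audio
```

With this layout, a common phrase such as "Working on it" costs one API call per voice and model, and every later request is served from disk.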
Usage
Enable Voice Mode
Run the /claude-to-speech:speak command (or /speak for short):
/speak
This activates voice-first mode where Claude will include TTS markers in responses.
How It Works