From ac-tools
Installs and configures VoiceMode MCP server for voice interactions in Claude Code using local Kokoro TTS and Whisper STT, with bash commands for uvx install, MCP addition, and endpoint config.
Install via:

```bash
npx claudepluginhub waterplanai/agentic-config --plugin ac-tools
```
Install and configure VoiceMode MCP for voice interactions with Claude Code.
Enables natural voice conversations in Claude Code using STT/TTS via MCP tools like voicemode:converse. Handles setup, diagnostics, and voice troubleshooting.
1. Install VoiceMode:

```bash
uvx voice-mode-install --yes
```

2. Add the MCP server to Claude Code:

```bash
claude mcp add --scope user voicemode -- uvx --refresh voice-mode
```

3. Point TTS/STT at the local endpoints:

```bash
voicemode config set VOICEMODE_TTS_BASE_URLS http://127.0.0.1:8880/v1
voicemode config set VOICEMODE_STT_BASE_URLS http://127.0.0.1:2022/v1
voicemode config set VOICEMODE_PREFER_LOCAL true
voicemode config set VOICEMODE_ALWAYS_TRY_LOCAL true
```
This step is critical: without explicit `_BASE_URLS` settings, the default chain includes `https://api.openai.com/v1` as a fallback, which crashes with `OPENAI_API_KEY` errors even when the local services are running.
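The assumed failure mode is easiest to see as an ordered candidate list: the comma-separated `*_BASE_URLS` values are tried in priority order, so an OpenAI URL left in the list stays eligible as a fallback. A minimal sketch of that ordering (an assumption about the behavior, not VoiceMode's actual code; `endpoint_chain` is a name introduced here):

```sh
#!/bin/sh
# Sketch of the assumed endpoint-selection behavior: a *_BASE_URLS value
# is a comma-separated list tried in priority order. If
# https://api.openai.com/v1 is still in the list, it remains eligible as
# a fallback and will demand OPENAI_API_KEY when reached.
endpoint_chain() {
    # Print one candidate URL per line, highest priority first.
    printf '%s\n' "$1" | tr ',' '\n'
}

# The default chain keeps OpenAI as a fallback:
endpoint_chain "http://127.0.0.1:8880/v1,https://api.openai.com/v1"
```

With the explicit local-only settings above, the chain contains a single local URL and the OpenAI endpoint is never consulted.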
Verify the server is registered:

```bash
claude mcp list
```
Test with the `mcp__voicemode__converse` tool. Kokoro TTS may take 5+ minutes to load on first run while it downloads and initializes the model (~111MB). Check status with:

```bash
voicemode service kokoro status
```
Two MCP restarts are required: one after adding the MCP server, and one after setting the endpoint configuration. Without the second restart, you may get "OpenAI API key" errors even with local config.
Edit config with:

```bash
voicemode config edit
```

List all options:

```bash
voicemode config list
```
| Setting | Description |
|---|---|
| `VOICEMODE_PREFER_LOCAL` | Prefer local providers over cloud (true/false) |
| `VOICEMODE_ALWAYS_TRY_LOCAL` | Always attempt local providers first (true/false) |
| `VOICEMODE_SAVE_AUDIO` | Save audio files (true/false, default: false) |
| `VOICEMODE_WHISPER_MODEL` | Whisper model (tiny, base, small, medium, large-v2) |
| `VOICEMODE_KOKORO_DEFAULT_VOICE` | Default voice (e.g., af_sky) |
| `OPENAI_API_KEY` | Required only for cloud processing |
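Taken together, a local-first setup corresponds to settings like the following. This is a hypothetical consolidated KEY=VALUE summary, not a file to create by hand; the real storage location and format are managed by the CLI, so prefer `voicemode config set`:

```sh
# Hypothetical consolidated settings for a local-first setup.
# Shown only as a KEY=VALUE summary; apply each with `voicemode config set`.
VOICEMODE_TTS_BASE_URLS=http://127.0.0.1:8880/v1
VOICEMODE_STT_BASE_URLS=http://127.0.0.1:2022/v1
VOICEMODE_PREFER_LOCAL=true
VOICEMODE_ALWAYS_TRY_LOCAL=true
VOICEMODE_WHISPER_MODEL=small
VOICEMODE_KOKORO_DEFAULT_VOICE=af_sky
```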
Provider modes:

| Mode | Configuration |
|---|---|
| Local only | VOICEMODE_TTS_BASE_URLS=http://127.0.0.1:8880/v1 and VOICEMODE_STT_BASE_URLS=http://127.0.0.1:2022/v1 (no API key needed) |
| Cloud only | Set OPENAI_API_KEY and set URLs to https://api.openai.com/v1 |
| Hybrid | Set OPENAI_API_KEY and set URLs to http://127.0.0.1:8880/v1,https://api.openai.com/v1 (TTS) and http://127.0.0.1:2022/v1,https://api.openai.com/v1 (STT) |

If you still get OpenAI API key errors, ensure VOICEMODE_TTS_BASE_URLS and VOICEMODE_STT_BASE_URLS point to local endpoints only (step 3). The PREFER_LOCAL flag alone is NOT sufficient: it does not remove OpenAI from the fallback chain.

Check Kokoro logs with:

```bash
voicemode service kokoro logs
```

The default tiny model is fast but less accurate. For better transcription:
| Model | Size | Accuracy | Speed |
|---|---|---|---|
| tiny | 75MB | ~70% | Fastest |
| small | 466MB | ~82% | Fast |
| medium | 1.4GB | ~88% | Moderate |
```bash
voicemode config set VOICEMODE_WHISPER_MODEL small
# or for best accuracy:
voicemode config set VOICEMODE_WHISPER_MODEL medium
```
Restart the Whisper service after changing the model:

```bash
voicemode service whisper restart
```
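The two commands above can be wrapped in a small helper so the model change and the restart always happen together. This is a sketch; `set_whisper_model` is a name introduced here, not part of VoiceMode:

```sh
#!/bin/sh
# Helper sketch: change the Whisper model, then restart the service.
# Wraps the two voicemode commands shown above; && ensures the restart
# only runs if the config change succeeded.
set_whisper_model() {
    voicemode config set VOICEMODE_WHISPER_MODEL "$1" &&
        voicemode service whisper restart
}

# Usage: set_whisper_model small
```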
For significantly faster transcription on Apple Silicon, convert Whisper to Core ML:
```bash
# Install whisper.cpp via Homebrew
brew install whisper-cpp

# Set Whisper directory
WHISPER_DIR=~/.voicemode/services/whisper
```
1. Download model:

```bash
cd $WHISPER_DIR/models
./download-ggml-model.sh medium
```

2. Install Python dependencies:

```bash
pip3 install torch coremltools openai-whisper ane_transformers
```

3. Convert to Core ML:

```bash
cd $WHISPER_DIR
./models/generate-coreml-model.sh medium
```

4. Update config:

```bash
voicemode config set VOICEMODE_WHISPER_MODEL medium
```

5. Restart Whisper:

```bash
voicemode service whisper restart
```

```bash
# Check Core ML model exists
ls -la $WHISPER_DIR/models/ggml-medium-encoder.mlmodelc
```
When running, the logs should show `GPU: Metal` and `Core ML: Enabled`.
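The existence check above can be scripted; in this sketch the `verify_coreml` name and its messages are introduced here, and it only confirms that the converted encoder bundle (a `.mlmodelc` directory) exists under the layout used above:

```sh
#!/bin/sh
# Sketch: confirm the Core ML conversion produced the expected encoder
# bundle for a given model name under a whisper service directory.
verify_coreml() {
    dir="$1"
    model="$2"
    if [ -d "$dir/models/ggml-$model-encoder.mlmodelc" ]; then
        echo "Core ML encoder present for $model"
    else
        echo "Missing ggml-$model-encoder.mlmodelc; rerun generate-coreml-model.sh" >&2
        return 1
    fi
}

# Usage: verify_coreml "$HOME/.voicemode/services/whisper" medium
```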