Skill

setup-voice-mode

Installs and configures VoiceMode MCP server for voice interactions in Claude Code using local Kokoro TTS and Whisper STT, with bash commands for uvx install, MCP addition, and endpoint config.

Bash

OpenAI

Anthropic

developer-tools

automation

npx claudepluginhub waterplanai/agentic-config --plugin ac-tools

Tool Access

This skill is limited to using the following tools:

Bash, Read

Preview

Install and configure VoiceMode MCP for voice interactions with Claude Code.


Stats

Parent Repo Stars: 28

Parent Repo Forks: 6

Last Commit: Mar 6, 2026


Setup VoiceMode for Claude Code

Install and configure VoiceMode MCP for voice interactions with Claude Code.

Steps

1. Install VoiceMode:

uvx voice-mode-install --yes

2. Add the MCP server to Claude Code:

claude mcp add --scope user voicemode -- uvx --refresh voice-mode

3. Configure local endpoints (Kokoro TTS + Whisper STT):

voicemode config set VOICEMODE_TTS_BASE_URLS http://127.0.0.1:8880/v1
voicemode config set VOICEMODE_STT_BASE_URLS http://127.0.0.1:2022/v1
voicemode config set VOICEMODE_PREFER_LOCAL true
voicemode config set VOICEMODE_ALWAYS_TRY_LOCAL true

This step is critical. Without explicit VOICEMODE_TTS_BASE_URLS and VOICEMODE_STT_BASE_URLS values, the default URL list includes https://api.openai.com/v1 as a fallback, which fails with OPENAI_API_KEY errors even when the local services are running.

4. Verify installation:

claude mcp list

5. Test voice mode:

Restart Claude Code
Use the mcp__voicemode__converse tool
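
Because the OpenAI fallback depends entirely on what the base-URL lists contain, it can help to sanity-check a list before storing it. The sketch below is one possible way to do that in plain POSIX shell. The function name is_local_only is hypothetical and not part of the voicemode CLI, and treating only 127.0.0.1 and localhost as "local" is an assumption.

```shell
#!/bin/sh
# is_local_only LIST: succeed only if every comma-separated URL in LIST
# points at the loopback interface. Hypothetical helper, not a voicemode command.
is_local_only() {
    [ -n "$1" ] || return 1              # an empty list is treated as invalid
    rest=$1
    while [ -n "$rest" ]; do
        url=${rest%%,*}                  # first entry of the remaining list
        case $url in
            http://127.0.0.1:*|http://localhost:*) ;;   # loopback: acceptable
            *) return 1 ;;                               # anything else: remote
        esac
        case $rest in
            *,*) rest=${rest#*,} ;;      # drop the entry just checked
            *)   rest= ;;                # that was the last entry
        esac
    done
    return 0
}

# The values from the configuration step pass the check.
is_local_only "http://127.0.0.1:8880/v1" && echo "TTS list is local-only"
is_local_only "http://127.0.0.1:2022/v1" && echo "STT list is local-only"
```

A list that still contains https://api.openai.com/v1 makes the function return a non-zero status, which is the situation that leads to the API key errors.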

First Run Note

Kokoro TTS may take 5+ minutes to load on first run while it downloads and initializes the model (~111MB). Check status with:

voicemode service kokoro status

Two MCP restarts required:

After initial setup (step 5)
After Kokoro model finishes downloading

Without the second restart, you may get "OpenAI API key" errors even with local config.
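
Rather than guessing when the first-run download has finished, you could poll until the Kokoro port answers and only then do the second restart. This is a minimal sketch: wait_for is a hypothetical helper, and the curl probe in the comment is an assumption that only shows the server accepts connections. The voicemode service kokoro status command remains the check the skill itself documents.

```shell
#!/bin/sh
# wait_for TIMEOUT INTERVAL CMD...: rerun CMD every INTERVAL seconds until it
# succeeds, giving up once TIMEOUT seconds have elapsed. Hypothetical helper.
wait_for() {
    timeout=$1
    interval=$2
    shift 2
    elapsed=0
    until "$@" >/dev/null 2>&1; do
        [ "$elapsed" -ge "$timeout" ] && return 1   # out of time
        sleep "$interval"
        elapsed=$((elapsed + interval))
    done
    return 0
}

# Example (assumption: a successful TCP/HTTP exchange on port 8880 means the
# server is up; it does not prove the model has finished loading):
#   wait_for 600 10 curl -s -o /dev/null http://127.0.0.1:8880/ \
#       && echo "Kokoro is answering; restart the MCP server now"
```

A ten-minute ceiling with a ten-second interval is only a guess based on the "5+ minutes" figure above; adjust both to your connection speed.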

Configuration Options

Edit config with:

voicemode config edit

List all options:

voicemode config list

Key Settings

Setting	Description
`VOICEMODE_PREFER_LOCAL`	Prefer local providers over cloud (true/false)
`VOICEMODE_ALWAYS_TRY_LOCAL`	Always attempt local providers first (true/false)
`VOICEMODE_SAVE_AUDIO`	Save audio files (true/false, default: false)
`VOICEMODE_WHISPER_MODEL`	Whisper model (tiny, base, small, medium, large-v2)
`VOICEMODE_KOKORO_DEFAULT_VOICE`	Default voice (e.g., af_sky)
`OPENAI_API_KEY`	Required only for cloud processing

Provider Options

Local-only (recommended; this is what the steps above configure): Set VOICEMODE_TTS_BASE_URLS=http://127.0.0.1:8880/v1 and VOICEMODE_STT_BASE_URLS=http://127.0.0.1:2022/v1 (no API key needed)
Cloud-only: Set OPENAI_API_KEY and set URLs to https://api.openai.com/v1
Hybrid (local-first, cloud fallback): Set OPENAI_API_KEY and set URLs to http://127.0.0.1:8880/v1,https://api.openai.com/v1 (TTS) and http://127.0.0.1:2022/v1,https://api.openai.com/v1 (STT)
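
The three modes differ only in which URLs go into the two lists, so the mapping can be written down once and reused. The following is a sketch under that reading; provider_urls is a hypothetical helper, and the URLs are the ones listed above.

```shell
#!/bin/sh
# provider_urls MODE SERVICE: print the base-URL list for MODE
# (local | cloud | hybrid) and SERVICE (tts | stt). Hypothetical helper.
provider_urls() {
    case $2 in
        tts) local_url=http://127.0.0.1:8880/v1 ;;   # Kokoro TTS
        stt) local_url=http://127.0.0.1:2022/v1 ;;   # Whisper STT
        *)   return 2 ;;                             # unknown service
    esac
    cloud_url=https://api.openai.com/v1
    case $1 in
        local)  echo "$local_url" ;;
        cloud)  echo "$cloud_url" ;;
        hybrid) echo "$local_url,$cloud_url" ;;      # local first, cloud fallback
        *)      return 2 ;;                          # unknown mode
    esac
}

# Example: switch both services to hybrid mode (needs OPENAI_API_KEY set).
#   voicemode config set VOICEMODE_TTS_BASE_URLS "$(provider_urls hybrid tts)"
#   voicemode config set VOICEMODE_STT_BASE_URLS "$(provider_urls hybrid stt)"
```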

Troubleshooting

OpenAI API key error: Ensure VOICEMODE_TTS_BASE_URLS and VOICEMODE_STT_BASE_URLS point to local endpoints only (step 3). The VOICEMODE_PREFER_LOCAL flag alone is not sufficient; it does not remove OpenAI from the fallback chain.
Kokoro stuck "starting up": Wait 5+ minutes on first run, or check the logs: voicemode service kokoro logs
macOS M3 crash: Known issue with ggml_metal; use CPU mode
WSL audio issues: Install PulseAudio packages
Slow transcription: Use GPU acceleration or smaller Whisper model

Improved Accuracy (Optional)

The default tiny model is fast but less accurate. For better transcription:

Model	Size	Accuracy	Speed
tiny	75MB	~70%	Fastest
small	466MB	~82%	Fast
medium	1.4GB	~88%	Moderate

voicemode config set VOICEMODE_WHISPER_MODEL small
# or for best accuracy:
voicemode config set VOICEMODE_WHISPER_MODEL medium

Restart Whisper service after changing:

voicemode service whisper restart
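
If disk space is the deciding factor, the choice can be reduced to a lookup against the sizes in the table above. This is only an illustrative sketch: pick_model is a hypothetical helper, the thresholds come from the table (with 1.4GB approximated as 1400 MB), and it ignores the accuracy and speed trade-offs.

```shell
#!/bin/sh
# pick_model BUDGET_MB: print the largest model from the table whose download
# fits in BUDGET_MB megabytes. Falls back to tiny, the smallest option.
pick_model() {
    budget_mb=$1
    if [ "$budget_mb" -ge 1400 ]; then
        echo medium      # ~1.4GB
    elif [ "$budget_mb" -ge 466 ]; then
        echo small       # 466MB
    else
        echo tiny        # 75MB
    fi
}

# Example: choose a model for a 500 MB budget, then apply it.
#   voicemode config set VOICEMODE_WHISPER_MODEL "$(pick_model 500)"
#   voicemode service whisper restart
```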

macOS Metal GPU Acceleration (Optional)

For significantly faster transcription on Apple Silicon, convert the Whisper model to Core ML:

Prerequisites

# Install whisper.cpp via Homebrew
brew install whisper-cpp

# Set Whisper directory
WHISPER_DIR=~/.voicemode/services/whisper

Steps

1. Download model

cd "$WHISPER_DIR/models"
./download-ggml-model.sh medium

2. Install Python dependencies

pip3 install torch coremltools openai-whisper ane_transformers

3. Convert to Core ML

cd "$WHISPER_DIR"
./models/generate-coreml-model.sh medium

4. Update config

voicemode config set VOICEMODE_WHISPER_MODEL medium

5. Restart Whisper

voicemode service whisper restart

Verification

# Check Core ML model exists
ls -la "$WHISPER_DIR/models/ggml-medium-encoder.mlmodelc"

When running, logs should show: GPU: Metal, Core ML: Enabled
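
The log check can be scripted by searching for the two markers quoted above. A minimal sketch, assuming the markers appear verbatim in the log text: coreml_enabled is a hypothetical helper, and voicemode service whisper logs is assumed to exist by analogy with the voicemode service kokoro logs command used in Troubleshooting.

```shell
#!/bin/sh
# coreml_enabled: read log text on stdin and succeed only if it contains both
# the Metal marker and the Core ML marker. Hypothetical helper.
coreml_enabled() {
    log=$(cat)
    printf '%s\n' "$log" | grep -q 'GPU: Metal' &&
        printf '%s\n' "$log" | grep -q 'Core ML: Enabled'
}

# Example (assumption: the whisper service exposes a logs subcommand):
#   voicemode service whisper logs | coreml_enabled \
#       && echo "Core ML acceleration is active"
```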
