From ptt
Use when the user wants to set up whisper for PTT (push-to-talk) voice input. Guides through choosing API vs local mode and configuring whisper.cpp if local.
npx claudepluginhub aaddrick/claude-ptt --plugin ptt
Slash command: /ptt:whisper-setup
This skill guides users through setting up Whisper for the PTT plugin.
The PTT plugin supports two transcription backends: the OpenAI Whisper API and local whisper.cpp.
Ask the user which mode they prefer:
Which Whisper mode would you like to set up?
1. **OpenAI API** (Recommended for ease of use)
- Requires OpenAI API key
- Costs ~$0.006 per minute of audio
- Best transcription quality
- Requires internet connection
2. **Local whisper.cpp** (Recommended for privacy)
- Free, no API costs
- Works offline
- Requires ~150MB-3GB disk space (depending on model)
- Transcription speed depends on your hardware
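To put the API pricing in concrete terms, here is a minimal sketch that multiplies minutes of audio by the ~$0.006 per minute figure quoted above. The helper name `estimate_api_cost` is hypothetical and not part of the PTT plugin.

```shell
# Rough cost estimate for API mode, using the ~$0.006/minute figure above.
# estimate_api_cost is a hypothetical helper name, not part of the PTT plugin.
estimate_api_cost() {
  # $1: minutes of audio to transcribe
  awk -v m="$1" 'BEGIN { printf "%.2f\n", m * 0.006 }'
}

estimate_api_cost 100   # 100 minutes of dictation
```

At that rate, 100 minutes of audio comes to roughly $0.60, so cost is rarely the deciding factor for short push-to-talk clips.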
If user chooses API:
Check if OPENAI_API_KEY environment variable is set:
echo $OPENAI_API_KEY | head -c 10
If not set, ask user to provide their API key
Update config:
# Read current config and update
cat ~/.claude/ptt-config.json
Set whisper.openaiApiKey to the user's key, or instruct them to set the OPENAI_API_KEY environment variable.
Then set whisper.preferredMode to "api".
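The key check in the API steps can also be done without printing any part of the secret. The following is a minimal sketch; `api_key_status` is a hypothetical helper name, not something the plugin provides.

```shell
# Report whether OPENAI_API_KEY is set without echoing any part of the secret.
# api_key_status is a hypothetical helper name used only for this sketch.
api_key_status() {
  if [ -n "${OPENAI_API_KEY:-}" ]; then
    echo "set"
  else
    echo "missing"
  fi
}

api_key_status
```

If this reports "missing", ask the user for a key or have them export OPENAI_API_KEY in their shell profile.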
If user chooses local:
Check available RAM:
free -h
Check available disk space:
df -h ~
Check CPU info:
lscpu | grep -E "(Model name|CPU\(s\)|Thread)"
Check for NVIDIA GPU (for CUDA acceleration):
nvidia-smi 2>/dev/null || echo "No NVIDIA GPU detected"
Present model options with recommendations based on system:
| Model | Size | RAM Required | Speed | Quality | Best For |
|---|---|---|---|---|---|
| tiny.en | 75MB | ~400MB | Fastest | Basic | Low-resource systems, quick tests |
| base.en | 142MB | ~500MB | Fast | Good | Most desktop systems (RECOMMENDED) |
| small.en | 466MB | ~1GB | Medium | Better | Systems with 8GB+ RAM |
| medium.en | 1.5GB | ~2.5GB | Slow | Great | Systems with 16GB+ RAM |
| large-v3 | 3GB | ~4GB | Slowest | Best | High-end systems, accuracy critical |
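The RAM guidance in the table can be turned into a quick check. This is a sketch under stated assumptions: `recommend_model` is a hypothetical helper, the 8GB and 16GB tiers come from the table, and the cut-off between tiny.en and base.en (about 4GB) is an assumption. Thresholds sit slightly below the nominal sizes because `free` reports less than the installed RAM.

```shell
# Map total RAM (in MB) to a model from the table.
# recommend_model is a hypothetical helper. The 8GB/16GB tiers follow the
# table; the ~4GB cut-off between tiny.en and base.en is an assumption.
recommend_model() {
  ram_mb="$1"
  if [ "$ram_mb" -ge 15000 ]; then
    echo "medium.en"
  elif [ "$ram_mb" -ge 7500 ]; then
    echo "small.en"
  elif [ "$ram_mb" -ge 3500 ]; then
    echo "base.en"
  else
    echo "tiny.en"
  fi
}

# On Linux, total RAM in MB can be read from `free -m`.
total_mb="$(free -m 2>/dev/null | awk '/^Mem:/ {print $2}')"
if [ -n "$total_mb" ]; then
  recommend_model "$total_mb"
fi
```

The sketch never suggests large-v3 on its own; that choice is left to users who explicitly prioritize accuracy.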
Recommendations:
- tiny.en
- base.en (default recommendation)
- small.en for better quality
- medium.en or large-v3 if accuracy is critical
Install dependencies:
sudo apt-get update && sudo apt-get install -y build-essential cmake
Clone and build:
cd ~ && git clone https://github.com/ggerganov/whisper.cpp.git
cd whisper.cpp && make -j$(nproc)
Download chosen model:
./models/download-ggml-model.sh <model_name>
Replace <model_name> with: tiny.en, base.en, small.en, medium.en, or large-v3
Test the installation:
./build/bin/whisper-cli -m models/ggml-<model>.bin -f samples/jfk.wav
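Since the download and test commands both depend on the `<model_name>` placeholder, a small helper can validate the name and build the matching file path. This is a minimal sketch; `model_path` is a hypothetical name, and the path pattern follows the `models/ggml-<model>.bin` form used in this guide.

```shell
# Resolve a model name to its file under models/, rejecting names that are
# not in the list above. model_path is a hypothetical helper for this sketch.
model_path() {
  case "$1" in
    tiny.en|base.en|small.en|medium.en|large-v3)
      echo "models/ggml-$1.bin" ;;
    *)
      echo "unknown model: $1" >&2
      return 1 ;;
  esac
}

model_path base.en
```

Run from the whisper.cpp directory, the result can be passed straight to the `-m` option of whisper-cli.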
Update ~/.claude/ptt-config.json:
{
"whisper": {
"localModelPath": "/home/<user>/whisper.cpp/models/ggml-<model>.bin",
"whisperExecutable": "/home/<user>/whisper.cpp/build/bin/whisper-cli",
"preferredMode": "local"
}
}
Verify the setup works:
# Record a short test
arecord -f S16_LE -r 16000 -c 1 -d 3 /tmp/test.wav
# Transcribe
~/whisper.cpp/build/bin/whisper-cli -m ~/whisper.cpp/models/ggml-base.en.bin -f /tmp/test.wav
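Before recording, it can help to confirm that both configured paths exist. The following is a minimal pre-flight sketch; `check_local_setup` is a hypothetical helper, and its messages simply mirror the failure names used in this guide.

```shell
# Pre-flight check for local mode: confirm the executable and model exist.
# check_local_setup is a hypothetical helper; its messages mirror the
# failure names used in this guide.
check_local_setup() {
  exe="$1"
  model="$2"
  if [ ! -x "$exe" ]; then
    echo "whisper-cli not found"
    return 1
  fi
  if [ ! -f "$model" ]; then
    echo "Model file not found"
    return 1
  fi
  echo "ok"
}

check_local_setup "$HOME/whisper.cpp/build/bin/whisper-cli" \
                  "$HOME/whisper.cpp/models/ggml-base.en.bin" || true
```

Pass the same two paths that were written to whisperExecutable and localModelPath in ~/.claude/ptt-config.json.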
Ask if user wants fallback enabled:
If yes, set enableFallback: true.
"whisper-cli not found"
"Model file not found"
"API key invalid"
"Out of memory" during local transcription
- Default (CPU) build: make without additional flags
- CUDA build for NVIDIA GPUs: make GGML_CUDA=1
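The GPU check and the two build variants in this guide can be combined into one step. This is a sketch, not the plugin's own logic: `build_flags` is a hypothetical helper, and the probe command is injectable (defaulting to nvidia-smi) so the behavior can be exercised on machines without a GPU.

```shell
# Pick make flags based on whether an NVIDIA GPU is visible.
# build_flags is a hypothetical helper; the probe command defaults to
# nvidia-smi and can be overridden for testing.
build_flags() {
  probe="${1:-nvidia-smi}"
  if "$probe" >/dev/null 2>&1; then
    echo "GGML_CUDA=1"
  fi
}

# Print the build command that would be run from the whisper.cpp directory.
echo "make $(build_flags)"
```

A CUDA build also needs the CUDA toolkit installed; if the build fails, falling back to plain make still yields a working CPU build.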