Switches ClaudeClaw WhatsApp voice transcription from OpenAI Whisper API to local whisper.cpp on Apple Silicon Macs. Runs on-device with no network or API key needed.
npx claudepluginhub sbusso/claudeclaw
This skill uses the workspace's default tool permissions.
Channel support: Currently WhatsApp only. The transcription module (src/transcription.ts) uses Baileys types for audio download. Other channels (Telegram, Discord, etc.) would need their own audio-download logic before this skill can serve them.
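The flow the skill installs can be sketched roughly as follows. This is an illustrative sketch, not the module's real exports: function names, the model path, and the conversion step are assumptions based on the description above (Baileys download, then ffmpeg, then whisper-cli).

```typescript
// Hypothetical sketch of what src/transcription.ts does after this skill is
// applied: convert the downloaded voice note to 16 kHz mono WAV (the input
// format whisper.cpp expects), then run whisper-cli on the result.
import { execFile } from "node:child_process";
import { promisify } from "node:util";

const run = promisify(execFile);

// Derive the intermediate WAV path from the downloaded audio file.
export function wavPathFor(audioPath: string): string {
  return audioPath.replace(/\.[^.]+$/, ".wav");
}

export async function transcribeVoiceNote(oggPath: string): Promise<string> {
  const wavPath = wavPathFor(oggPath);
  // whisper.cpp expects 16 kHz mono PCM, so convert the OGG/Opus note first.
  await run("ffmpeg", ["-i", oggPath, "-ar", "16000", "-ac", "1", "-y", wavPath]);
  const { stdout } = await run("whisper-cli", [
    "-m", "data/models/ggml-base.bin",
    "-f", wavPath,
    "--no-timestamps",
  ]);
  return stdout.trim();
}
```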
Note: The Homebrew package is whisper-cpp, but the CLI binary it installs is whisper-cli.
Prerequisites:
- voice-transcription skill must be applied first (WhatsApp channel)
- whisper-cpp installed: brew install whisper-cpp (provides the whisper-cli binary)
- ffmpeg installed: brew install ffmpeg
- A GGML model in data/models/ (downloaded below if missing)

Check if src/transcription.ts already uses whisper-cli:
grep 'whisper-cli' src/transcription.ts && echo "Already applied" || echo "Not applied"
If already applied, skip to Phase 3 (Verify).
whisper-cli --help >/dev/null 2>&1 && echo "WHISPER_OK" || echo "WHISPER_MISSING"
ffmpeg -version >/dev/null 2>&1 && echo "FFMPEG_OK" || echo "FFMPEG_MISSING"
If missing, install via Homebrew:
brew install whisper-cpp ffmpeg
ls data/models/ggml-*.bin 2>/dev/null || echo "NO_MODEL"
If no model exists, download the base model (148MB, good balance of speed and accuracy):
mkdir -p data/models
curl -L -o data/models/ggml-base.bin "https://huggingface.co/ggerganov/whisper.cpp/resolve/main/ggml-base.bin"
For better accuracy at the cost of speed, use ggml-small.bin (466MB) or ggml-medium.bin (1.5GB).
Check that the whatsapp remote exists:
git remote -v
If whatsapp is missing, add it:
git remote add whatsapp https://github.com/qwibitai/claudeclaw-whatsapp.git
git fetch whatsapp skill/local-whisper
git merge whatsapp/skill/local-whisper || {
git checkout --theirs package-lock.json
git add package-lock.json
git merge --continue
}
This modifies src/transcription.ts to use the whisper-cli binary instead of the OpenAI API.
npm run build
The ClaudeClaw launchd service runs with a restricted PATH. whisper-cli and ffmpeg are in /opt/homebrew/bin/ (Apple Silicon) or /usr/local/bin/ (Intel), which may not be in the plist's PATH.
Service name: Derived from the directory name: com.claudeclaw.<dirname> (macOS) or claudeclaw-<dirname> (Linux). For example, if the cwd is my-assistant, the service is com.claudeclaw.my-assistant. Determine the correct service name before running the service commands below.
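The naming rule above can be sketched in a few lines (illustrative only, not ClaudeClaw's actual code):

```typescript
// Derive the service name from the working directory's basename, as described
// above: com.claudeclaw.<dirname> on macOS, claudeclaw-<dirname> on Linux.
import { basename } from "node:path";

export function serviceName(cwd: string, platform: string): string {
  const dir = basename(cwd);
  return platform === "darwin" ? `com.claudeclaw.${dir}` : `claudeclaw-${dir}`;
}
```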
Check the current PATH:
grep -A1 'PATH' ~/Library/LaunchAgents/com.claudeclaw.<dirname>.plist
If /opt/homebrew/bin is missing, add it to the <string> value inside the PATH key in the plist. Then reload:
launchctl unload ~/Library/LaunchAgents/com.claudeclaw.<dirname>.plist
launchctl load ~/Library/LaunchAgents/com.claudeclaw.<dirname>.plist
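For reference, a launchd PATH entry typically looks like the fragment below. This is a sketch: your plist may already have other entries in the string, and the exact key layout can differ; the shown value assumes Apple Silicon Homebrew.

```xml
<key>EnvironmentVariables</key>
<dict>
  <key>PATH</key>
  <string>/opt/homebrew/bin:/usr/local/bin:/usr/bin:/bin</string>
</dict>
```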
npm run build
launchctl kickstart -k gui/$(id -u)/com.claudeclaw.<dirname>
Send a voice note in any registered group. The agent should receive it as [Voice: <transcript>].
tail -f logs/claudeclaw.log | grep -i -E "voice|transcri|whisper"
Look for:
Transcribed voice message — successful transcription
whisper.cpp transcription failed — check model path, ffmpeg, or PATH

Environment variables (optional, set in .env):
| Variable | Default | Description |
|---|---|---|
| WHISPER_BIN | whisper-cli | Path to whisper.cpp binary |
| WHISPER_MODEL | data/models/ggml-base.bin | Path to GGML model file |
"whisper.cpp transcription failed": Ensure both whisper-cli and ffmpeg are in PATH. The launchd service uses a restricted PATH — see Phase 3 above. Test manually:
ffmpeg -f lavfi -i anullsrc=r=16000:cl=mono -t 1 -f wav /tmp/test.wav -y
whisper-cli -m data/models/ggml-base.bin -f /tmp/test.wav --no-timestamps
Transcription works in dev but not as service: The launchd plist PATH likely doesn't include /opt/homebrew/bin. See "Ensure launchd PATH includes Homebrew" in Phase 3.
Slow transcription: The base model processes ~30s of audio in <1s on M1+. If slower, check CPU usage — another process may be competing.
Wrong language: whisper.cpp auto-detects language. To force a language, you can set WHISPER_LANG and modify src/transcription.ts to pass -l $WHISPER_LANG.
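A minimal sketch of how the modified arg construction could look, covering the env vars from the table above. This is hypothetical, not the stock module: WHISPER_LANG in particular is an assumption, since src/transcription.ts would need the edit described above to read it and pass -l.

```typescript
// Build the whisper-cli argv from environment variables (sketch).
// WHISPER_MODEL falls back to the default model path from the table above;
// WHISPER_LANG, when set, forces the transcription language via -l.
export function buildWhisperArgs(
  wavPath: string,
  env: Record<string, string | undefined>, // pass process.env in real use
): string[] {
  const model = env.WHISPER_MODEL ?? "data/models/ggml-base.bin";
  const args = ["-m", model, "-f", wavPath, "--no-timestamps"];
  if (env.WHISPER_LANG) args.push("-l", env.WHISPER_LANG);
  return args;
}
```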