Switches ClaudeClaw WhatsApp voice transcription from OpenAI Whisper API to local whisper.cpp on Apple Silicon Macs. Runs on-device with no network or API key needed.
npx claudepluginhub sbusso/claudeclaw
This skill uses the workspace's default tool permissions.
Channel support: Currently WhatsApp only. The transcription module (src/transcription.ts) uses Baileys types for audio download. Other channels (Telegram, Discord, etc.) would need their own audio-download logic before this skill can serve them.
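The flow the skill installs can be sketched roughly as follows. This is an illustrative sketch, not the module's real exports: function names, the model path, and the conversion step are assumptions based on the description above (Baileys download, then ffmpeg, then whisper-cli).

```typescript
// Hypothetical sketch of what src/transcription.ts does after this skill is
// applied: convert the downloaded voice note to 16 kHz mono WAV (the input
// format whisper.cpp expects), then run whisper-cli on the result.
import { execFile } from "node:child_process";
import { promisify } from "node:util";

const run = promisify(execFile);

// Derive the intermediate WAV path from the downloaded audio file.
export function wavPathFor(audioPath: string): string {
  return audioPath.replace(/\.[^.]+$/, ".wav");
}

export async function transcribeVoiceNote(oggPath: string): Promise<string> {
  const wavPath = wavPathFor(oggPath);
  // whisper.cpp expects 16 kHz mono PCM, so convert the OGG/Opus note first.
  await run("ffmpeg", ["-i", oggPath, "-ar", "16000", "-ac", "1", "-y", wavPath]);
  const { stdout } = await run("whisper-cli", [
    "-m", "data/models/ggml-base.bin",
    "-f", wavPath,
    "--no-timestamps",
  ]);
  return stdout.trim();
}
```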
Note: The Homebrew package is whisper-cpp, but the CLI binary it installs is whisper-cli.
Prerequisites:
- voice-transcription skill must be applied first (WhatsApp channel)
- whisper-cpp installed: brew install whisper-cpp (provides the whisper-cli binary)
- ffmpeg installed: brew install ffmpeg
- A GGML model in data/models/ (downloaded below if missing)

Check if src/transcription.ts already uses whisper-cli:
grep 'whisper-cli' src/transcription.ts && echo "Already applied" || echo "Not applied"
If already applied, skip to Phase 3 (Verify).
whisper-cli --help >/dev/null 2>&1 && echo "WHISPER_OK" || echo "WHISPER_MISSING"
ffmpeg -version >/dev/null 2>&1 && echo "FFMPEG_OK" || echo "FFMPEG_MISSING"
If missing, install via Homebrew:
brew install whisper-cpp ffmpeg
ls data/models/ggml-*.bin 2>/dev/null || echo "NO_MODEL"
If no model exists, download the base model (148MB, good balance of speed and accuracy):
mkdir -p data/models
curl -L -o data/models/ggml-base.bin "https://huggingface.co/ggerganov/whisper.cpp/resolve/main/ggml-base.bin"
For better accuracy at the cost of speed, use ggml-small.bin (466MB) or ggml-medium.bin (1.5GB).
Check that the whatsapp remote exists:
git remote -v
If whatsapp is missing, add it:
git remote add whatsapp https://github.com/qwibitai/claudeclaw-whatsapp.git
git fetch whatsapp skill/local-whisper
git merge whatsapp/skill/local-whisper || {
git checkout --theirs package-lock.json
git add package-lock.json
git merge --continue
}
This modifies src/transcription.ts to use the whisper-cli binary instead of the OpenAI API.
npm run build
The ClaudeClaw launchd service runs with a restricted PATH. whisper-cli and ffmpeg are in /opt/homebrew/bin/ (Apple Silicon) or /usr/local/bin/ (Intel), which may not be in the plist's PATH.
Service name: Derived from the directory name: com.claudeclaw.<dirname> (macOS) or claudeclaw-<dirname> (Linux). For example, if the cwd is my-assistant, the service is com.claudeclaw.my-assistant. Determine the correct service name before running the service commands below.
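The naming rule above can be sketched in a few lines (illustrative only, not ClaudeClaw's actual code):

```typescript
// Derive the service name from the working directory's basename, as described
// above: com.claudeclaw.<dirname> on macOS, claudeclaw-<dirname> on Linux.
import { basename } from "node:path";

export function serviceName(cwd: string, platform: string): string {
  const dir = basename(cwd);
  return platform === "darwin" ? `com.claudeclaw.${dir}` : `claudeclaw-${dir}`;
}
```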
Check the current PATH:
grep -A1 'PATH' ~/Library/LaunchAgents/com.claudeclaw.<dirname>.plist
If /opt/homebrew/bin is missing, add it to the <string> value inside the PATH key in the plist. Then reload:
launchctl unload ~/Library/LaunchAgents/com.claudeclaw.<dirname>.plist
launchctl load ~/Library/LaunchAgents/com.claudeclaw.<dirname>.plist
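For reference, a launchd PATH entry typically looks like the fragment below. This is a sketch: your plist may already have other entries in the string, and the exact key layout can differ; the shown value assumes Apple Silicon Homebrew.

```xml
<key>EnvironmentVariables</key>
<dict>
  <key>PATH</key>
  <string>/opt/homebrew/bin:/usr/local/bin:/usr/bin:/bin</string>
</dict>
```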
npm run build
launchctl kickstart -k gui/$(id -u)/com.claudeclaw.<dirname>
Send a voice note in any registered group. The agent should receive it as [Voice: <transcript>].
tail -f logs/claudeclaw.log | grep -i -E "voice|transcri|whisper"
Look for:
Transcribed voice message — successful transcription
whisper.cpp transcription failed — check model path, ffmpeg, or PATH

Environment variables (optional, set in .env):
| Variable | Default | Description |
|---|---|---|
| WHISPER_BIN | whisper-cli | Path to whisper.cpp binary |
| WHISPER_MODEL | data/models/ggml-base.bin | Path to GGML model file |
"whisper.cpp transcription failed": Ensure both whisper-cli and ffmpeg are in PATH. The launchd service uses a restricted PATH — see Phase 3 above. Test manually:
ffmpeg -f lavfi -i anullsrc=r=16000:cl=mono -t 1 -f wav /tmp/test.wav -y
whisper-cli -m data/models/ggml-base.bin -f /tmp/test.wav --no-timestamps
Transcription works in dev but not as service: The launchd plist PATH likely doesn't include /opt/homebrew/bin. See "Ensure launchd PATH includes Homebrew" in Phase 3.
Slow transcription: The base model processes ~30s of audio in <1s on M1+. If slower, check CPU usage — another process may be competing.
Wrong language: whisper.cpp auto-detects language. To force a language, you can set WHISPER_LANG and modify src/transcription.ts to pass -l $WHISPER_LANG.
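A minimal sketch of how the modified arg construction could look, covering the env vars from the table above. This is hypothetical, not the stock module: WHISPER_LANG in particular is an assumption, since src/transcription.ts would need the edit described above to read it and pass -l.

```typescript
// Build the whisper-cli argv from environment variables (sketch).
// WHISPER_MODEL falls back to the default model path from the table above;
// WHISPER_LANG, when set, forces the transcription language via -l.
export function buildWhisperArgs(
  wavPath: string,
  env: Record<string, string | undefined>, // pass process.env in real use
): string[] {
  const model = env.WHISPER_MODEL ?? "data/models/ggml-base.bin";
  const args = ["-m", model, "-f", wavPath, "--no-timestamps"];
  if (env.WHISPER_LANG) args.push("-l", env.WHISPER_LANG);
  return args;
}
```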