From nanoclaw-skills
Adds automatic transcription of WhatsApp voice notes in NanoClaw using OpenAI Whisper API, delivering transcripts as [Voice: <transcript>] for agent responses.
npx claudepluginhub nanocoai/nanoclaw-skills --plugin nanoclaw-skills
This skill uses the workspace's default tool permissions.
This skill adds automatic voice message transcription to NanoClaw's WhatsApp channel using OpenAI's Whisper API. When a voice note arrives, it is downloaded, transcribed, and delivered to the agent as `[Voice: <transcript>]`.
Check if src/transcription.ts exists. If it does, skip to Phase 3 (Configure). The code changes are already in place.
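The pre-flight check above can be scripted. This is a minimal sketch rather than part of the skill itself: `preflight_phase` is a hypothetical helper name, and the two labels it prints are illustrative.

```shell
# Hypothetical helper: decide where to start based on whether the
# transcription module is already present in the project.
# $1: project root (defaults to the current directory)
preflight_phase() {
  if [ -f "${1:-.}/src/transcription.ts" ]; then
    echo "configure"   # code changes already in place; skip to Phase 3
  else
    echo "apply"       # module missing; merge the skill branch first
  fi
}

preflight_phase .
```

Run it from the NanoClaw project root; any other directory will normally report `apply`.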
Use AskUserQuestion to collect information:
AskUserQuestion: Do you have an OpenAI API key for Whisper transcription?
If yes, collect it now. If no, direct them to create one at https://platform.openai.com/api-keys.
Prerequisite: WhatsApp must be installed first (skill/whatsapp merged). This skill modifies WhatsApp channel files.
git remote -v
If whatsapp is missing, add it:
git remote add whatsapp https://github.com/qwibitai/nanoclaw-whatsapp.git
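If you would rather not read the `git remote -v` output by eye, the check can be sketched as below. `has_remote` is a hypothetical helper; it assumes remote names arrive one per line on stdin, which is how plain `git remote` prints them.

```shell
# Hypothetical helper: succeeds when the remote name given as $1 appears
# on stdin as a whole line (the format printed by `git remote`).
has_remote() {
  grep -qxF -- "$1"
}

# Only suggest adding the remote when it is not already configured.
if ! git remote 2>/dev/null | has_remote whatsapp; then
  echo "whatsapp remote missing; run:"
  echo "git remote add whatsapp https://github.com/qwibitai/nanoclaw-whatsapp.git"
fi
```

The sketch only prints the command instead of running it, so it is safe to try outside the repository.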
git fetch whatsapp skill/voice-transcription
git merge whatsapp/skill/voice-transcription || {
git checkout --theirs package-lock.json
git add package-lock.json
git merge --continue
}
This merges in:
- src/transcription.ts (voice transcription module using OpenAI Whisper)
- src/channels/whatsapp.ts (isVoiceMessage check, transcribeAudioMessage call)
- src/channels/whatsapp.test.ts
- openai npm dependency in package.json
- OPENAI_API_KEY in .env.example
If the merge reports conflicts, resolve them by reading the conflicted files and understanding the intent of both sides.
npm install --legacy-peer-deps
npm run build
npx vitest run src/channels/whatsapp.test.ts
All tests must pass and build must be clean before proceeding.
If the user doesn't have an API key:
I need you to create an OpenAI API key:
- Go to https://platform.openai.com/api-keys
- Click "Create new secret key"
- Give it a name (e.g., "NanoClaw Transcription")
- Copy the key (starts with sk-)
Cost: $0.006 per minute of audio ($0.003 per typical 30-second voice note)
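The per-note figure follows directly from the per-minute rate. As a rough sketch, assuming billing is strictly proportional to duration at the $0.006 per minute rate quoted above (rounding to billing increments is not modelled), with `whisper_cost_usd` as a hypothetical helper:

```shell
# Hypothetical helper: estimated cost in USD for an audio clip.
# $1: clip length in seconds
whisper_cost_usd() {
  awk -v s="$1" 'BEGIN { printf "%.4f\n", s / 60 * 0.006 }'
}

whisper_cost_usd 30    # typical 30-second voice note
whisper_cost_usd 600   # ten minutes of audio
```

At that rate, a 30-second note costs $0.0030 and ten minutes of audio costs $0.0600.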
Wait for the user to provide the key.
Add to .env:
OPENAI_API_KEY=<their-key>
Sync to container environment:
mkdir -p data/env && cp .env data/env/env
The container reads environment from data/env/env, not .env directly.
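Because the container reads its own copy of the file, a stale data/env/env is easy to miss. The sketch below checks both conditions at once; `env_synced` is a hypothetical helper, and it assumes the key is written as a plain `OPENAI_API_KEY=value` line with no leading whitespace or `export`.

```shell
# Hypothetical helper: succeeds only if OPENAI_API_KEY has a non-empty
# value in .env and data/env/env is byte-identical to .env.
# $1: project root (defaults to the current directory)
env_synced() {
  grep -q '^OPENAI_API_KEY=.' "${1:-.}/.env" 2>/dev/null &&
    cmp -s "${1:-.}/.env" "${1:-.}/data/env/env" 2>/dev/null
}

if env_synced .; then
  echo "OPENAI_API_KEY is set and synced"
else
  echo "OPENAI_API_KEY is missing or data/env/env is stale"
fi
```

If it reports a stale copy, rerun the `cp` command above.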
npm run build
launchctl kickstart -k gui/$(id -u)/com.nanoclaw # macOS
# Linux: systemctl --user restart nanoclaw
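The two restart commands differ only by platform, so the choice can be keyed on `uname -s`. This is a sketch: `restart_command` is a hypothetical helper that prints the matching command from above rather than running it.

```shell
# Hypothetical helper: print the restart command for a platform name as
# reported by `uname -s`. Returns non-zero for anything else.
restart_command() {
  case "$1" in
    # Single quotes keep $(id -u) literal so it expands when the command is run.
    Darwin) echo 'launchctl kickstart -k gui/$(id -u)/com.nanoclaw' ;;
    Linux)  echo 'systemctl --user restart nanoclaw' ;;
    *)      return 1 ;;
  esac
}

restart_command "$(uname -s)" || echo "unsupported platform: restart NanoClaw manually"
```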
Tell the user:
Send a voice note in any registered WhatsApp chat. The agent should receive it as [Voice: <transcript>] and respond to its content.
tail -f logs/nanoclaw.log | grep -i voice
Look for:
- Transcribed voice message: successful transcription with character count
- OPENAI_API_KEY not set: key missing from .env
- OpenAI transcription failed: API error (check key validity, billing)
- Failed to download audio message: media download issue
Check that OPENAI_API_KEY is set in .env AND synced to data/env/env.
Test the key against the API: curl -s https://api.openai.com/v1/models -H "Authorization: Bearer $OPENAI_API_KEY" | head -c 200
Check logs for the specific error. Common causes:
Verify the chat is registered and the agent is running. Voice transcription only runs for registered groups.
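The four log messages listed earlier can be mapped to next steps mechanically. The sketch below assumes each message appears verbatim somewhere in the log line; the exact log format is not shown here, and `voice_log_hint` is a hypothetical helper.

```shell
# Hypothetical helper: print a hint for a voice-related log line.
# Returns non-zero when the line matches none of the known messages.
voice_log_hint() {
  case "$1" in
    *"Transcribed voice message"*)        echo "ok: transcription succeeded" ;;
    *"OPENAI_API_KEY not set"*)           echo "fix: add OPENAI_API_KEY to .env and sync data/env/env" ;;
    *"OpenAI transcription failed"*)      echo "fix: check key validity and billing" ;;
    *"Failed to download audio message"*) echo "fix: media download issue" ;;
    *)                                    return 1 ;;
  esac
}

voice_log_hint "OPENAI_API_KEY not set"
```

To scan recent log output, feed each line of `tail -n 200 logs/nanoclaw.log` to the function in a `while IFS= read -r line` loop.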