Sets up and tests agent voice backends for TTS (sag/ElevenLabs, OpenAI, macOS say) and STT (whisper-cli, OpenAI Whisper). Dispatches on /agent:voice commands and phrases like 'speak this' or 'transcribe audio'.
Install:

```sh
npx claudepluginhub crisandrews/clawcode --plugin agent
```

This skill uses the workspace's default tool permissions.
Set up and test the agent's voice backends. Voice is OPTIONAL — off by default. See `docs/voice.md` for the full reference including channel-plugin precedence.
| User says | Action |
|---|---|
| `/agent:voice` (no arg) or `/agent:voice status` | Call `voice_status` and print the card |
| `/agent:voice setup` | Guided setup (see flow below) |
| `/agent:voice test` | Call `voice_speak({ text: "Hola, soy <name>. Esta es una prueba.", ... })` using the user's language, report the path |
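The dispatch table above can be sketched as a small router. This is an illustrative sketch, not the actual plugin code; `parseVoiceCommand` and the `VoiceAction` type are hypothetical names.

```typescript
// Hypothetical sketch of the /agent:voice dispatch table above.
type VoiceAction = "status" | "setup" | "test";

function parseVoiceCommand(input: string): VoiceAction {
  const arg = input.replace(/^\/agent:voice\s*/, "").trim();
  switch (arg) {
    case "":        // no argument behaves like `status`
    case "status":  return "status";
    case "setup":   return "setup";
    case "test":    return "test";
    default:        return "status"; // unknown args fall back to the card
  }
}
```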
The agent invokes `voice_speak` / `voice_transcribe` directly when it needs to produce or consume audio during regular conversation. This skill is only for setup and diagnostics.
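The TTS fallback order described in this skill (sag/ElevenLabs, then OpenAI, then macOS `say`) can be sketched as a picker function. The `Env` shape and `pickTtsBackend` name are illustrative assumptions, not the real `lib/voice.ts` API.

```typescript
// Hypothetical TTS backend picker mirroring this skill's fallback order:
// sag (needs ELEVENLABS_API_KEY), then OpenAI (needs OPENAI_API_KEY),
// then the zero-setup macOS `say` command.
type TtsBackend = "sag" | "openai" | "say";

interface Env {
  sagInstalled: boolean;   // `sag` binary present on PATH
  elevenlabsKey: boolean;  // ELEVENLABS_API_KEY set
  openaiKey: boolean;      // OPENAI_API_KEY set
}

function pickTtsBackend(env: Env): TtsBackend {
  if (env.sagInstalled && env.elevenlabsKey) return "sag";
  if (env.openaiKey) return "openai";
  return "say"; // built-in fallback, no setup required
}
```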
Setup flow:

1. Call `voice_status({ format: "json" })` to see current state.
2. TTS backends:
   - sag: `brew install steipete/tap/sag`. It's a small wrapper around ElevenLabs with good voice-prompting conventions. If `ELEVENLABS_API_KEY` is missing, instruct: "Get a key from https://elevenlabs.io. Add `export ELEVENLABS_API_KEY=sk_...` to your shell rc. Restart the agent."
   - OpenAI: instruct: "`export OPENAI_API_KEY=...` in your shell rc (`~/.zshrc` or `~/.bashrc`). Then restart the agent."
   - `say`: "Built in on macOS. Sounds robotic but zero setup. No action needed — will be used as fallback."
3. STT: `whisper-cli` (`brew install whisper-cpp`, offline, free) or the OpenAI Whisper API (same `OPENAI_API_KEY`).
4. Enable voice: "Set `voice.enabled: true` in your config. Run `agent_config(action='set', key='voice.enabled', value='true')` or edit `agent-config.json` directly." (Offer to run `agent_config` for them after they confirm.)
5. If the `sag` skill is in an OpenClaw workspace (`~/.openclaw/workspace*/skills/sag/`), offer: "I see you have the sag skill in an OpenClaw workspace. Want me to install it into this agent? Run `/agent:skill install <that path>`."
6. WhatsApp: if `voice_status` reports `whatsapp.audioEnabled: true`, say: "Your WhatsApp plugin already transcribes voice notes locally. For inbound WhatsApp audio you don't need `voice_transcribe`. Setting this up is for WebChat uploads, iMessage audio, outbound voice notes, etc." If it reports `false`: "Your WhatsApp plugin doesn't transcribe by default. Either turn that on with `/whatsapp:configure audio` (local Whisper, free), or use our `voice_transcribe` per message."

Test flow:

1. Call `voice_status` to confirm voice is enabled AND a backend is available. If not, redirect to setup.
2. Call `voice_speak({ text: "<greeting in user's language>" })`.
3. Report: "Audio written to `/tmp/...` using backend X. Play it or attach it in a messaging channel" (via `MEDIA:/tmp/...` or a dedicated `send_media` tool).

Secrets:

Never put API keys in `agent-config.json`. They are SECRETS and the config file may end up in a git repo. Use environment variables:
```sh
export ELEVENLABS_API_KEY=sk_...
export OPENAI_API_KEY=sk-...
```
Add them to `~/.zshrc` / `~/.bashrc` / `~/.config/fish/config.fish` for persistence.

Non-secret settings (default backend, voice ID, output dir) go in `agent-config.json`.
A channel plugin that already transcribes audio (like the WhatsApp plugin with audio on) is authoritative for THAT channel. Do not call voice_transcribe on an audio file that arrived through such a plugin — you'd just be re-doing work the plugin already did, and you'd end up with two different transcriptions.
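The precedence rule above amounts to a single check before transcribing. The `InboundAudio` shape and `shouldTranscribe` name are illustrative assumptions about how an inbound message might be represented.

```typescript
// Hypothetical precedence check: if the channel plugin already attached
// a transcript, it is authoritative and voice_transcribe must be skipped.
interface InboundAudio {
  channel: string;            // e.g. "whatsapp", "webchat"
  pluginTranscript?: string;  // set when the channel plugin transcribed it
}

function shouldTranscribe(msg: InboundAudio): boolean {
  return msg.pluginTranscript === undefined;
}
```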
`voice_transcribe` is for: WebChat uploads, iMessage audio, and audio from any channel whose plugin doesn't transcribe on its own.
Guidelines:

- `/agent:voice status` on CLI: print the full card.
- `/agent:voice setup`: step-by-step, one instruction at a time. Don't dump all 7 steps — guide the user through what's missing for their situation.
- Never write API keys into `agent-config.json` — they're env-only.
- Never run `brew install` yourself — always let the user do it.
- Don't call `voice_transcribe` on channel audio if the plugin has audio on.

References:

- `docs/voice.md` — full doc (backends, precedence, secrets, troubleshooting)
- `lib/voice.ts` — routing + backends
- `skills/skill-manager/SKILL.md` — for installing the sag skill from an OpenClaw path