From sundial-org-awesome-openclaw-skills-4
Enables real-time voice conversations in Discord voice channels with Claude AI using STT (Whisper/Deepgram), TTS (OpenAI/ElevenLabs), VAD, barge-in, and auto-reconnect. Ideal for interactive voice bots.
npx claudepluginhub joshuarweaver/cascade-ai-ml-agents-misc-2 --plugin sundial-org-awesome-openclaw-skills-4
This skill uses the workspace's default tool permissions.
Real-time voice conversations in Discord voice channels. Join a voice channel, speak, and have your words transcribed, processed by Claude, and spoken back.
- ffmpeg (audio processing)
- @discordjs/opus and sodium-native
# Ubuntu/Debian
sudo apt-get install ffmpeg build-essential python3
# Fedora/RHEL
sudo dnf install ffmpeg gcc-c++ make python3
# macOS
brew install ffmpeg
clawdhub install discord-voice
Or manually:
cd ~/.clawdbot/extensions
git clone <repository-url> discord-voice
cd discord-voice
npm install
{
"plugins": {
"entries": {
"discord-voice": {
"enabled": true,
"config": {
"sttProvider": "whisper",
"ttsProvider": "openai",
"ttsVoice": "nova",
"vadSensitivity": "medium",
"allowedUsers": [], // Empty = allow all users
"silenceThresholdMs": 1500,
"maxRecordingMs": 30000,
"openai": {
"apiKey": "sk-..." // Or use OPENAI_API_KEY env var
}
}
}
}
}
}
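The `silenceThresholdMs` and `maxRecordingMs` settings above suggest how a recording is cut into utterances: capture ends after a stretch of silence or once the recording gets too long. The following is a minimal sketch of that idea, not the plugin's actual code. `VadFrame`, `UtteranceTracker`, and the decision labels are hypothetical names, and the per-frame `speaking` flag is assumed to come from a voice activity detector.

```typescript
// Hypothetical sketch: these names do not come from the plugin's source.
interface VadFrame {
  timeMs: number;    // timestamp of the audio frame
  speaking: boolean; // VAD verdict for this frame (assumed input)
}

type Decision = "idle" | "recording" | "process-silence" | "process-max-length";

class UtteranceTracker {
  private startMs: number | null = null;
  private lastVoiceMs = 0;
  private silenceThresholdMs: number;
  private maxRecordingMs: number;

  constructor(silenceThresholdMs: number = 1500, maxRecordingMs: number = 30000) {
    this.silenceThresholdMs = silenceThresholdMs;
    this.maxRecordingMs = maxRecordingMs;
  }

  // Feed one frame and learn whether the utterance should be processed now.
  push(frame: VadFrame): Decision {
    const start = this.startMs;
    if (start === null) {
      // Nothing is recorded until the first voiced frame arrives.
      if (!frame.speaking) return "idle";
      this.startMs = frame.timeMs;
      this.lastVoiceMs = frame.timeMs;
      return "recording";
    }
    if (frame.speaking) this.lastVoiceMs = frame.timeMs;

    if (frame.timeMs - start >= this.maxRecordingMs) {
      this.startMs = null; // hard cap reached, hand off what we have
      return "process-max-length";
    }
    if (frame.timeMs - this.lastVoiceMs >= this.silenceThresholdMs) {
      this.startMs = null; // speaker paused long enough
      return "process-silence";
    }
    return "recording";
  }
}
```

Under this model a lower `silenceThresholdMs` makes the bot respond sooner but risks cutting a speaker off mid-pause, which is the trade-off to weigh when tuning the value.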
Ensure your Discord bot has these permissions:
Add these to your bot's OAuth2 URL or configure in Discord Developer Portal.
| Option | Type | Default | Description |
|---|---|---|---|
| enabled | boolean | true | Enable/disable the plugin |
| sttProvider | string | "whisper" | "whisper" or "deepgram" |
| streamingSTT | boolean | true | Use streaming STT (Deepgram only, ~1s faster) |
| ttsProvider | string | "openai" | "openai" or "elevenlabs" |
| ttsVoice | string | "nova" | Voice ID for TTS |
| vadSensitivity | string | "medium" | "low", "medium", or "high" |
| bargeIn | boolean | true | Stop speaking when user talks |
| allowedUsers | string[] | [] | User IDs allowed (empty = all) |
| silenceThresholdMs | number | 1500 | Silence before processing (ms) |
| maxRecordingMs | number | 30000 | Max recording length (ms) |
| heartbeatIntervalMs | number | 30000 | Connection health check interval |
| autoJoinChannel | string | undefined | Channel ID to auto-join on startup |
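The options table can be read as a typed configuration with defaults. The sketch below shows one way to express that; the field names and default values are copied from the table, while `resolveConfig`, `isUserAllowed`, and `usesStreamingStt` are hypothetical helpers rather than functions the plugin is known to export.

```typescript
// Field names and defaults mirror the options table.
// The helper functions are illustrative assumptions.
interface DiscordVoiceConfig {
  enabled: boolean;
  sttProvider: "whisper" | "deepgram";
  streamingSTT: boolean;
  ttsProvider: "openai" | "elevenlabs";
  ttsVoice: string;
  vadSensitivity: "low" | "medium" | "high";
  bargeIn: boolean;
  allowedUsers: string[];
  silenceThresholdMs: number;
  maxRecordingMs: number;
  heartbeatIntervalMs: number;
  autoJoinChannel?: string;
}

const DEFAULTS: DiscordVoiceConfig = {
  enabled: true,
  sttProvider: "whisper",
  streamingSTT: true,
  ttsProvider: "openai",
  ttsVoice: "nova",
  vadSensitivity: "medium",
  bargeIn: true,
  allowedUsers: [],
  silenceThresholdMs: 1500,
  maxRecordingMs: 30000,
  heartbeatIntervalMs: 30000,
};

// Fill in every option the user left out.
function resolveConfig(user: Partial<DiscordVoiceConfig> = {}): DiscordVoiceConfig {
  return { ...DEFAULTS, ...user };
}

// An empty allowedUsers list means everyone may talk to the bot.
function isUserAllowed(config: DiscordVoiceConfig, userId: string): boolean {
  return config.allowedUsers.length === 0 || config.allowedUsers.indexOf(userId) !== -1;
}

// Streaming STT is documented as Deepgram only, so the flag is ignored for Whisper.
function usesStreamingStt(config: DiscordVoiceConfig): boolean {
  return config.sttProvider === "deepgram" && config.streamingSTT;
}
```

Note that `streamingSTT` defaults to `true` even when `sttProvider` is `"whisper"`; in this sketch the flag simply has no effect until the provider is switched to Deepgram.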
{
"openai": {
"apiKey": "sk-...",
"whisperModel": "whisper-1",
"ttsModel": "tts-1"
}
}
{
"elevenlabs": {
"apiKey": "...",
"voiceId": "21m00Tcm4TlvDq8ikWAM", // Rachel
"modelId": "eleven_multilingual_v2"
}
}
{
"deepgram": {
"apiKey": "...",
"model": "nova-2"
}
}
Once registered with Discord, use these commands:
- /voice join <channel> - Join a voice channel
- /voice leave - Leave the current voice channel
- /voice status - Show voice connection status
# Join a voice channel
clawdbot voice join <channelId>
# Leave voice
clawdbot voice leave --guild <guildId>
# Check status
clawdbot voice status
The agent can use the discord_voice tool:
Join voice channel 1234567890
The tool supports actions:
- join - Join a voice channel (requires channelId)
- leave - Leave voice channel
- speak - Speak text in the voice channel
- status - Get current voice status

When using Deepgram as your STT provider, streaming mode is enabled by default. This provides:
To use streaming STT:
{
"sttProvider": "deepgram",
"streamingSTT": true, // default
"deepgram": {
"apiKey": "...",
"model": "nova-2"
}
}
When enabled (default), the bot will immediately stop speaking if a user starts talking. This creates a more natural conversational flow where you can interrupt the bot.
To disable (let the bot finish speaking):
{
"bargeIn": false
}
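Barge-in can be pictured as a small controller that remembers the current playback and stops it as soon as user speech is detected. This is a rough sketch under that assumption; `StoppablePlayback` and `BargeInController` are hypothetical names, and how the plugin actually interrupts TTS is not described in this README.

```typescript
// Hypothetical sketch; the plugin's real playback code is not shown here.
interface StoppablePlayback {
  stop(): void;
}

class BargeInController {
  private current: StoppablePlayback | null = null;
  private bargeIn: boolean;

  constructor(bargeIn: boolean = true) {
    this.bargeIn = bargeIn;
  }

  // Call when the bot starts speaking a TTS response.
  startSpeaking(playback: StoppablePlayback): void {
    this.current = playback;
  }

  // Call when playback ends on its own.
  finishedSpeaking(): void {
    this.current = null;
  }

  // Call when voice activity from a user is detected.
  // Returns true if an ongoing playback was interrupted.
  onUserSpeechStart(): boolean {
    if (!this.bargeIn || this.current === null) return false;
    this.current.stop();
    this.current = null;
    return true;
  }
}
```

With `bargeIn` set to `false`, `onUserSpeechStart` becomes a no-op and the bot finishes its sentence before the new utterance is handled.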
The plugin includes automatic connection health monitoring:
If the connection drops, you'll see logs like:
[discord-voice] Disconnected from voice channel
[discord-voice] Reconnection attempt 1/3
[discord-voice] Reconnected successfully
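The log lines above are consistent with a bounded retry loop. The sketch below reproduces that pattern; the limit of three attempts is inferred from the `1/3` in the sample log, and `reconnect` and `tryConnect` are hypothetical names. A real implementation would likely be asynchronous and wait between attempts, which is left out here to keep the example short.

```typescript
// Hypothetical sketch: tryConnect and the retry policy are assumptions.
function reconnect(
  tryConnect: () => boolean,
  maxAttempts: number = 3,
  log: (line: string) => void = console.log,
): boolean {
  log("[discord-voice] Disconnected from voice channel");
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    log(`[discord-voice] Reconnection attempt ${attempt}/${maxAttempts}`);
    if (tryConnect()) {
      log("[discord-voice] Reconnected successfully");
      return true;
    }
  }
  // All attempts failed; the caller decides what to do next.
  return false;
}
```

Capping the number of attempts keeps a permanently unreachable channel from turning into an endless reconnect loop.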
Ensure the Discord channel is configured and the bot is connected before using voice.
Install build tools:
npm install -g node-gyp
npm rebuild @discordjs/opus sodium-native
DEBUG=discord-voice clawdbot gateway start
| Variable | Description |
|---|---|
| DISCORD_TOKEN | Discord bot token (required) |
| OPENAI_API_KEY | OpenAI API key (Whisper + TTS) |
| ELEVENLABS_API_KEY | ElevenLabs API key |
| DEEPGRAM_API_KEY | Deepgram API key |
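Since an API key can come either from the plugin config or from one of these environment variables, a lookup helper might look like the sketch below. `resolveApiKey` is a hypothetical function, and the precedence (config value first, environment variable as fallback) is an assumption; the README only says the environment variable can be used instead.

```typescript
// Hypothetical helper; config-before-environment precedence is an assumption.
type Provider = "openai" | "elevenlabs" | "deepgram";

const ENV_VARS: Record<Provider, string> = {
  openai: "OPENAI_API_KEY",
  elevenlabs: "ELEVENLABS_API_KEY",
  deepgram: "DEEPGRAM_API_KEY",
};

type ProviderConfig = { [P in Provider]?: { apiKey?: string } };

function resolveApiKey(
  provider: Provider,
  config: ProviderConfig,
  env: { [name: string]: string | undefined },
): string {
  const section = config[provider];
  const key = (section && section.apiKey) || env[ENV_VARS[provider]];
  if (!key) {
    // Fail early with a message that names both places the key can live.
    throw new Error(
      `Missing API key for ${provider}: set ${provider}.apiKey or ${ENV_VARS[provider]}`,
    );
  }
  return key;
}
```

In a Node.js process this would typically be called with `process.env` as the third argument, for example `resolveApiKey("deepgram", pluginConfig, process.env)`, where `pluginConfig` stands for the plugin's `config` object.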
MIT