Voice interaction for AI coding assistants. Provides natural voice conversations using ElevenLabs TTS and STT. Use when users mention ava, speak, talk, converse, voice status, or voice troubleshooting. ElevenLabs-only: eleven_v3 TTS model, Scribe v2 Realtime STT with local Silero VAD.
npx claudepluginhub harshav167/ava --plugin ava
Slash command
/ava:ava
Natural voice conversations with AI coding assistants using ElevenLabs text-to-speech (TTS) and speech-to-text (STT).
Ava aims to create a Jarvis-like voice assistant experience. The AI speaks to you and listens, like a real conversation. In voice-primary mode, substantive responses must go through converse; if voice fails, stop and restore MCP instead of continuing chat-only.
Ava runs as an HTTP server on port 8765. Add this MCP configuration to Cursor, Claude Code, Factory, or another MCP-capable host:
{
  "mcpServers": {
    "ava": {
      "type": "http",
      "url": "http://127.0.0.1:8765/mcp"
    }
  }
}
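
If you manage host configuration from a script, the entry above can be merged into an existing config without disturbing other servers. This is a minimal sketch: the helper name `add_ava_server` and the choice to treat the config as a plain dict are assumptions for illustration, not part of Ava, and where each host stores its config file is not covered here.

```python
import json

# The server entry exactly as shown in the configuration above.
AVA_ENTRY = {"type": "http", "url": "http://127.0.0.1:8765/mcp"}


def add_ava_server(config: dict) -> dict:
    """Return a copy of an MCP host config with the "ava" entry added.

    Hypothetical helper: other servers are preserved and the input
    dict is not mutated.
    """
    merged = dict(config)
    servers = dict(merged.get("mcpServers", {}))
    servers["ava"] = dict(AVA_ENTRY)
    merged["mcpServers"] = servers
    return merged


if __name__ == "__main__":
    existing = {"mcpServers": {"other": {"type": "http", "url": "http://127.0.0.1:9000/mcp"}}}
    print(json.dumps(add_ava_server(existing), indent=2))
```

The `other` server in the usage example is made up; it only shows that existing entries survive the merge.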
Set your ElevenLabs API key:
# In ~/.ava/ava.env
ELEVENLABS_API_KEY=your-key-here
ElevenLabs provides:
- TTS: eleven_v3 model with Donna voice (cloned)
- STT: Scribe v2 Realtime with local Silero VAD

Use the converse MCP tool. Trust server defaults unless changing behavior for the current turn.
Do not bypass the native MCP tool with curl, raw HTTP requests, or direct /mcp JSON-RPC calls when the host already exposes Ava as an MCP tool. If Cursor, Claude Code, Factory, or another MCP-capable host has surfaced converse, use that native tool path rather than manually posting to http://127.0.0.1:8765/mcp.
# Speak and listen, trusting server defaults
converse(message="Hello! What would you like to work on?")
# Speak without waiting (narration while working)
converse(message="Searching the codebase now...", wait_for_response=false)
# User wants to say something long without silence cutoff
converse(message="Go ahead, I'm listening.", disable_silence_detection=true)
The server defaults are tuned and locked. Do not pass tuning parameters — passing them yourself is the #1 cause of inconsistent behavior across agents. The vast majority of calls are exactly:
converse(message="...")
Only three parameters are ever situational: wait_for_response, disable_silence_detection, and (Claude Desktop only) skip_tts. Everything else is a server default you must NOT send.
| Parameter | Default | Pass it? | When to pass |
|---|---|---|---|
| message | required | always | The text to speak (use "" only with skip_tts=true) |
| wait_for_response | true | only to disable | Pass false for a fire-and-forget announcement/narration where you will NOT listen |
| disable_silence_detection | false | only to enable | Pass true when the user explicitly needs to talk at length with long pauses (dictation, reading aloud) |
| wait_for_conch | true | never (already default) | Leave as default; it auto-queues behind another speaker |
| speed | 1.2 | never unless user asks | Only if the user says "talk faster/slower" mid-session. Range 0.7–1.2; eleven_v3 rejects >1.2 |
| vad_aggressiveness | 2 | never unless troubleshooting | Server-tuned. Only change if the user reports background bleed (raise) or being cut off (lower) |
| listen_duration_min | 1 | never | Server-tuned |
| listen_duration_max | 600 | never | Server-tuned (10 min) |
| timeout | 900 | never | Server-tuned (15 min) |
| metrics_level | summary | never | Server-tuned |
Hard rule: if you find yourself passing speed, vad_aggressiveness, listen_duration_*, timeout, or metrics_level, stop — you almost certainly should not be. The user tunes those via ~/.ava/ava.env, not per-call.
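
One way to enforce the rule above in an agent wrapper is to build the call arguments through a small guard. The sketch below is illustrative only: `build_converse_args` is a hypothetical helper, not part of Ava, and it assumes the parameter names and defaults listed in the table. Because no default is given for skip_tts, it is passed through unchanged when supplied. The helper also replaces literal newlines in message with spaces, in line with this document's guidance on normalizing them.

```python
import re

# Situational parameters with the defaults given in the table above.
SITUATIONAL_DEFAULTS = {
    "wait_for_response": True,
    "disable_silence_detection": False,
}

# skip_tts is situational too (Claude Desktop only); its default is not stated.
ALLOWED = set(SITUATIONAL_DEFAULTS) | {"skip_tts"}

# Parameters the table marks as server-tuned.
SERVER_TUNED = {
    "wait_for_conch", "speed", "vad_aggressiveness", "listen_duration_min",
    "listen_duration_max", "timeout", "metrics_level",
}


def build_converse_args(message: str, **overrides) -> dict:
    """Build keyword arguments for a converse call (hypothetical helper).

    Rejects server-tuned or unknown parameters and drops situational ones
    that only repeat the default, so most calls reduce to {"message": ...}.
    """
    blocked = SERVER_TUNED.intersection(overrides)
    if blocked:
        raise ValueError(f"server-tuned parameters must not be sent: {sorted(blocked)}")
    unknown = set(overrides) - ALLOWED
    if unknown:
        raise ValueError(f"unknown parameters: {sorted(unknown)}")

    # Replace literal newlines (and whitespace around them) with one space.
    args = {"message": re.sub(r"\s*\n\s*", " ", message).strip()}
    for name, value in overrides.items():
        if name in SITUATIONAL_DEFAULTS and value == SITUATIONAL_DEFAULTS[name]:
            continue  # same as the server default, so do not send it
        args[name] = value
    return args
```

With this guard, `build_converse_args("Searching the codebase now...", wait_for_response=False)` keeps the one situational flag, while passing `speed` or `timeout` raises a `ValueError` before any call is made.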
- In voice-primary mode, substantive responses must go through converse; if voice fails, stop and restore MCP instead of continuing chat-only
- Use wait_for_response=false only for short acknowledgements before work
- Combine a converse(..., wait_for_response=false) acknowledgement with other tools in one turn for zero dead air
- Use disable_silence_detection=true when the user needs to speak at length; server defaults already allow a 10-minute listen window
- Keep each response in a single converse call; normalize literal newlines out of message, but do not shorten or split substantive content merely to work around reliability issues

Config file: ~/.ava/ava.env
| Variable | Default | Description |
|---|---|---|
| ELEVENLABS_API_KEY | (none) | API key -- required |
| AVA_ELEVENLABS_TTS_MODEL | eleven_v3 | TTS model |
| AVA_ELEVENLABS_TTS_VOICE | k4hP4cQadSZQc0Oar2Ld | Voice ID (Donna) |
| AVA_ELEVENLABS_STT_MODEL | scribe_v2_realtime | STT model |
| AVA_ELEVENLABS_REALTIME_STT | true | Use realtime streaming STT |
| AVA_SILENCE_THRESHOLD_MS | 2000 | Silence threshold in ms (2.0s default) |
| AVA_VAD_AGGRESSIVENESS | 2 | VAD strictness (0-3); higher rejects more background audio |
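
To sanity-check a config file before starting the server, the table can be mirrored in a few lines of Python. This is a sketch under stated assumptions: it treats ~/.ava/ava.env as plain KEY=value lines with `#` comments, which may not match how Ava itself parses the file, and the helper names `parse_ava_env` and `find_problems` are hypothetical.

```python
from pathlib import Path

# Defaults copied from the table above; ELEVENLABS_API_KEY has none.
DEFAULTS = {
    "AVA_ELEVENLABS_TTS_MODEL": "eleven_v3",
    "AVA_ELEVENLABS_TTS_VOICE": "k4hP4cQadSZQc0Oar2Ld",
    "AVA_ELEVENLABS_STT_MODEL": "scribe_v2_realtime",
    "AVA_ELEVENLABS_REALTIME_STT": "true",
    "AVA_SILENCE_THRESHOLD_MS": "2000",
    "AVA_VAD_AGGRESSIVENESS": "2",
}


def parse_ava_env(text: str) -> dict:
    """Parse KEY=value lines over the defaults, skipping blanks and # comments."""
    settings = dict(DEFAULTS)
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        settings[key.strip()] = value.strip()
    return settings


def find_problems(settings: dict) -> list:
    """Return human-readable problems found in the parsed settings."""
    problems = []
    if not settings.get("ELEVENLABS_API_KEY"):
        problems.append("ELEVENLABS_API_KEY is required")
    vad = settings["AVA_VAD_AGGRESSIVENESS"]
    if not (vad.isdigit() and 0 <= int(vad) <= 3):
        problems.append("AVA_VAD_AGGRESSIVENESS must be 0-3")
    return problems


if __name__ == "__main__":
    env_path = Path.home() / ".ava" / "ava.env"
    text = env_path.read_text() if env_path.exists() else ""
    for problem in find_problems(parse_ava_env(text)):
        print(problem)
```

Only the two constraints the table states explicitly (a required API key and the 0-3 VAD range) are checked; the other values are passed through as strings.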
- MCP endpoint: http://127.0.0.1:8765/mcp
- Server script: scripts/ava-server.sh
- Audio playback: convert() + play() via ffplay

# Via script (manages launchd plist)
scripts/ava-server.sh setup # Create launchd plist + start
scripts/ava-server.sh start # Start server
scripts/ava-server.sh stop # Stop server
scripts/ava-server.sh restart # Restart server
scripts/ava-server.sh status # Check status
scripts/ava-server.sh logs # Tail server logs
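
After starting or restarting the server, a quick way to confirm that something is listening on port 8765 is a plain TCP connection test. The sketch below is an assumption-level diagnostic, not part of Ava: `is_listening` is a hypothetical helper that only opens and closes a socket. It sends no MCP or JSON-RPC traffic, so it is not a substitute for the native converse tool.

```python
import socket


def is_listening(host: str = "127.0.0.1", port: int = 8765, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Connection refused, timed out, or host unreachable.
        return False


if __name__ == "__main__":
    if is_listening():
        print("Port 8765 is accepting connections")
    else:
        print("Nothing is listening on 127.0.0.1:8765")
```

An open port only shows that a process is bound there; the status and logs subcommands above are still the way to check the server itself.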