Test voice agents from Retell, VAPI, LiveKit, Bland, and Telnyx from the command line: simulate LLM conversations, evaluate metrics, and get pass/fail reports, transcripts, and fix suggestions. Export agent definitions to Mermaid flowcharts, Python code, and other formats; convert between platforms; manage agent databases.
A generic test harness for voice agent workflows. Test agents from Retell, VAPI, LiveKit, Bland, Telnyx, and custom sources using a unified execution and evaluation model.
Installation
uv tool install voicetest
Or add to a project (use uv run voicetest to run):
uv add voicetest
Or with pip:
pip install voicetest
Quick Start
Try voicetest with a sample healthcare receptionist agent and tests:
# Set up an API key (free, no credit card at https://console.groq.com)
export GROQ_API_KEY=gsk_...
# Load demo and start interactive shell
voicetest demo
# Or load demo and start web UI
voicetest demo --serve
Tip: If you have Claude Code installed, you can skip API key setup entirely and use claudecode/sonnet as your model. See Claude Code Passthrough for details.
The demo includes a healthcare receptionist agent with 8 test cases covering appointment scheduling, identity verification, and more.
Interactive Shell
# Launch interactive shell (default)
uv run voicetest
# In the shell:
> agent tests/fixtures/retell/sample_config.json
> tests tests/fixtures/retell/sample_tests.json
> set agent_model ollama_chat/qwen2.5:0.5b
> run
CLI Commands
# List available importers
voicetest importers
# Run tests against an agent definition
voicetest run --agent agent.json --tests tests.json --all
# Export agent to different formats
voicetest export --agent agent.json --format mermaid # Diagram
voicetest export --agent agent.json --format livekit # Python code
voicetest export --agent agent.json --format retell-llm # Retell LLM JSON
voicetest export --agent agent.json --format retell-cf # Retell Conversation Flow JSON
voicetest export --agent agent.json --format vapi-assistant # VAPI Assistant JSON
voicetest export --agent agent.json --format vapi-squad # VAPI Squad JSON
voicetest export --agent agent.json --format bland # Bland AI JSON
voicetest export --agent agent.json --format telnyx # Telnyx AI JSON
voicetest export --agent agent.json --format voicetest # Voicetest JSON (.vt.json)
# Launch full TUI
voicetest tui --agent agent.json --tests tests.json
# Start REST API server with Web UI
voicetest serve
# Start infrastructure (LiveKit, Whisper, Kokoro) + backend for live calls
voicetest up
# Stop infrastructure services
voicetest down
Core Concepts
Agent Graphs
An agent is represented as an AgentGraph: a directed graph of nodes connected by transitions. Each node has a prompt, a type, and outgoing edges that control conversation flow. The graph has a single entry_node_id where every conversation starts.
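As a rough sketch of this model (illustrative only — the class and field names below are assumptions, not voicetest's actual API), an agent graph is a set of typed nodes, each with outgoing transitions, plus a single entry point:

```python
# Illustrative sketch of the AgentGraph concept: nodes connected by
# transitions, with one entry node where every conversation starts.
# Names and fields are hypothetical, not voicetest's real schema.
from dataclasses import dataclass, field

@dataclass
class Transition:
    target: str                 # id of the destination node
    condition: str = "always"   # e.g. a prompt match or an equation

@dataclass
class Node:
    id: str
    type: str                   # "conversation", "logic", or "extract"
    prompt: str = ""
    edges: list[Transition] = field(default_factory=list)

@dataclass
class AgentGraph:
    entry_node_id: str
    nodes: dict[str, Node] = field(default_factory=dict)

graph = AgentGraph(
    entry_node_id="greet",
    nodes={
        "greet": Node(
            id="greet",
            type="conversation",
            prompt="Greet the caller and ask how you can help.",
            edges=[Transition(target="verify")],
        ),
        "verify": Node(
            id="verify",
            type="extract",
            prompt="Extract the caller's name and date of birth.",
        ),
    },
)
print(graph.nodes[graph.entry_node_id].type)  # conversation
```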
Node Types
| Type | LLM Call | Speech | Routing |
|------|----------|--------|---------|
| Conversation | Yes | Yes | LLM picks a transition via prompt match, or falls back to an always edge |
| Logic | No | No | Evaluates equations top-to-bottom; first match wins |
| Extract | Yes (extraction) | No | LLM extracts variables from the conversation, then equations route |
Any node type can also be a global node — reachable from any conversation node without explicit edges. See Global Nodes below.
Conversation nodes are the standard building block — they generate a spoken response and use LLM judgment (or an always edge) to choose the next node.
Logic nodes (also called branch nodes) have no prompt and produce no speech. All their transitions use equation or always conditions, evaluated deterministically without an LLM call.
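The deterministic top-to-bottom, first-match-wins routing can be sketched roughly like this (the condition format and function names are illustrative assumptions, not voicetest's actual schema):

```python
# Illustrative sketch of logic-node routing: transitions are checked
# top-to-bottom; the first equation that matches wins, and an "always"
# edge acts as the unconditional fallback. No LLM call is involved.
def route(transitions, variables):
    for condition, target in transitions:
        if condition == "always":
            return target                      # unconditional fallback
        var, expected = condition.split("==")  # e.g. "intent == billing"
        if variables.get(var.strip()) == expected.strip():
            return target
    return None  # no edge matched and no always edge present

transitions = [
    ("intent == billing", "billing_node"),
    ("intent == scheduling", "scheduling_node"),
    ("always", "fallback_node"),
]
print(route(transitions, {"intent": "scheduling"}))  # scheduling_node
print(route(transitions, {"intent": "unknown"}))     # fallback_node
```

Because evaluation order matters, an `always` edge placed first would shadow every equation below it, so fallbacks belong last.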