From livekit-skills
Build voice AI agents with LiveKit Agents SDK using Cloud Inference or self-hosted setups and lk CLI. Includes checklists for credentials, docs, and testing.
npx claudepluginhub fcakyon/claude-codex-settings --plugin livekit-skills
This skill provides guidance for building voice AI agents with the LiveKit Agents SDK. It covers both LiveKit Cloud and self-hosted deployments, using the `lk` CLI for documentation access and project management. All factual information about APIs, methods, and configurations must come from live documentation.
Before writing ANY code, complete this checklist:

- LiveKit credentials set: `LIVEKIT_URL`, `LIVEKIT_API_KEY`, and `LIVEKIT_API_SECRET`
- The `lk` CLI installed, for `lk docs` commands

LiveKit Cloud is the fastest way to get a voice agent running. To get started:
1. Sign up at cloud.livekit.io if you haven't already
2. Create a project (or use an existing one)
3. Get your credentials from the project settings:
- `LIVEKIT_URL` - Your project's WebSocket URL (e.g., wss://your-project.livekit.cloud)
- `LIVEKIT_API_KEY` - API key for authentication
- `LIVEKIT_API_SECRET` - API secret for authentication

Set these as environment variables (typically in `.env.local`):
```shell
LIVEKIT_URL=wss://your-project.livekit.cloud
LIVEKIT_API_KEY=your-api-key
LIVEKIT_API_SECRET=your-api-secret
```
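A quick preflight check catches missing credentials before the agent starts. This is a plain-Python sketch; the helper name is ours, not a LiveKit API:

```python
import os

REQUIRED_VARS = ("LIVEKIT_URL", "LIVEKIT_API_KEY", "LIVEKIT_API_SECRET")

def missing_livekit_vars(env=None):
    """Return the names of required LiveKit variables that are unset or empty."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_VARS if not env.get(name)]
```

Call `missing_livekit_vars()` at startup and fail fast if it returns a non-empty list.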
The LiveKit CLI can automate credential setup. Consult the CLI documentation for current commands.
LiveKit Inference is one option for AI model access when using LiveKit Cloud. It provides access to leading AI model providers—all through your LiveKit credentials with no separate API keys needed.
Consult the documentation for LiveKit Inference's benefits, available models, supported providers, and current usage patterns; the documentation always has the most up-to-date information.
Self-hosting removes Cloud tier limits on deployments and concurrency. You control scaling directly.
Install and run the LiveKit server:
```shell
brew install livekit                      # macOS (Homebrew)
curl -sSL https://get.livekit.io | bash   # Linux
```

Start in dev mode:
```shell
livekit-server --dev
```
Default credentials: API key `devkey`, API secret `secret`.
Set environment variables:
```shell
LIVEKIT_URL=ws://localhost:7880
LIVEKIT_API_KEY=devkey
LIVEKIT_API_SECRET=secret
```
Deploy livekit-server via Docker, Kubernetes, or VMs on any provider (Hetzner, AWS, GCP, etc.). Consult lk docs get-page /home/self-hosting or see references/self-hosting.md for details. Agent servers run as regular processes managed by your infra tooling.
When self-hosting or when you prefer your own API keys over LiveKit Inference, configure model providers directly via environment variables:
```shell
# STT (Speech-to-Text)
DEEPGRAM_API_KEY=your-key

# LLM
OPENAI_API_KEY=your-key

# TTS (Text-to-Speech)
ELEVEN_API_KEY=your-key
# or
CARTESIA_API_KEY=your-key
```
The Agents SDK has plugins for all major providers. Pass model identifiers directly:
Node.js / TypeScript:
```typescript
import { voice } from "@livekit/agents";

const session = new voice.AgentSession({
  stt: "deepgram/nova-3:multi",
  llm: "openai/gpt-4.1-mini",
  tts: "cartesia/sonic-3:voice-id", // or "elevenlabs/..."
});
```
Python:
```python
from livekit.agents import AgentSession  # import path per current SDK; verify via lk docs

session = AgentSession(
    stt="deepgram/nova-3",
    llm="openai/gpt-4.1-mini",
    tts="elevenlabs/...",  # or "cartesia/sonic-3:voice-id"
)
```
Consult lk docs search "plugins" for the full list of supported providers.
Initialize a new agent project with the CLI:
Backend agents:
```shell
lk agent init my-agent --template agent-starter-python
lk agent init my-agent --template agent-starter-node
```
Frontend apps (React/Next.js, React Native, Swift, Flutter, Android):
```shell
lk agent init my-frontend --template agent-starter-react
lk agent init my-frontend --template agent-starter-react-native
```
Omit --template to see all available templates interactively.
LiveKit Agents is a fast-evolving SDK. Model training data is outdated the moment it's created. When working with LiveKit:
This rule applies even when confident about an API. Verify anyway.
Before writing any LiveKit code, use the lk docs CLI commands for current, verified API information. This prevents reliance on stale model knowledge.
```shell
lk docs search "voice agent quickstart"
lk docs search "handoffs and tasks"
lk docs get-page /agents/start/voice-ai-quickstart
lk docs get-page /agents/build/tools /agents/build/vision
lk docs code-search "class AgentSession" --repo livekit/agents
lk docs code-search "@function_tool" --language Python --full-file
lk docs changelog livekit/agents
lk docs changelog pypi:livekit-agents --releases 5
lk docs changelog npm:livekit-agents --releases 5
```
Install the LiveKit CLI first:
```shell
brew install livekit-cli                      # macOS (Homebrew)
curl -sSL https://get.livekit.io/cli | bash   # Linux
winget install LiveKit.LiveKitCLI             # Windows
```

As a fallback, reference pages are available in the `references/` directory alongside this skill.
Voice AI agents have fundamentally different requirements than text-based agents or traditional software. Internalize these principles:
Voice conversations are real-time. Users expect responses within hundreds of milliseconds, not seconds. Every architectural decision should consider latency impact:
Large system prompts and extensive tool lists directly increase latency. A voice agent with 50 tools and a 10,000-token system prompt will feel sluggish regardless of model speed.
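A rough back-of-the-envelope check makes the point concrete; the throughput figure below is an assumption for illustration, not a measured LiveKit number:

```python
def prompt_latency_ms(prompt_tokens: int, tokens_per_second: float = 2000.0) -> float:
    """Estimate prompt-processing time assuming a fixed prefill throughput."""
    return prompt_tokens / tokens_per_second * 1000.0

# At an assumed 2,000 tokens/s of prefill, a 10,000-token system prompt
# costs about 5 seconds before the model can start responding.
```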
Design agents with minimal viable context: a focused system prompt and only the tools needed for the current stage of the conversation.

Voice interface constraints differ from text: keep responses short, speakable, and free of formatting that cannot be read aloud.
Complex voice agents should not be monolithic. LiveKit Agents supports structured workflows that maintain low latency while handling sophisticated use cases.
A single agent that handles an entire conversation flow accumulates tools and instructions over time. This creates latency and reduces reliability.
Handoffs allow one agent to transfer control to another, so specialized agents can handle distinct phases of the conversation.
Design handoffs around natural conversation boundaries where context can be summarized rather than transferred wholesale.
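The summarized-handoff pattern can be illustrated with plain Python. This is a toy model, not the LiveKit handoff API (consult `lk docs` for that), and all names and values below are hypothetical:

```python
from dataclasses import dataclass, field

@dataclass
class AgentNode:
    """Toy stand-in for a specialized agent; not the LiveKit Agent class."""
    name: str
    instructions: str
    history: list = field(default_factory=list)

def hand_off(current: AgentNode, target: AgentNode, summary: str) -> AgentNode:
    """Transfer control, passing a short summary instead of the full history."""
    target.history.append(f"Handoff from {current.name}: {summary}")
    return target

intake = AgentNode("intake", "Collect the caller's name and reason for calling.")
billing = AgentNode("billing", "Resolve billing questions only.")

active = hand_off(intake, billing, "Caller Dana has a duplicate charge on invoice 4821.")
```

The target agent receives a one-line summary rather than the full transcript, which keeps its context small and its latency low.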
Tasks are tightly scoped prompts designed to achieve a specific, well-defined outcome within a larger conversation.
Consult the documentation for implementation details on handoffs and tasks.
Voice agent behavior is code. Every agent implementation MUST include tests. Shipping an agent without tests is shipping untested code.
When building or modifying a LiveKit agent:

- Create a `tests/` directory if one doesn't exist

When modifying agent behavior (instructions, tool descriptions, workflows), begin by writing tests for the desired behavior.
This approach prevents shipping agents that "seem to work" but fail in production.
At minimum, write tests covering the agent's core behaviors, and focus them on observable outcomes rather than exact wording.
Use LiveKit's testing framework. Consult the testing documentation via lk docs for current patterns:
search: "livekit agents testing"
The framework supports exercising agent behavior programmatically; consult the documentation for its current capabilities.
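The shape of a behavioral test looks roughly like the sketch below; `FakeSession` and `greet` are hypothetical stubs, and the real test harness should come from the LiveKit testing docs via `lk docs`:

```python
class FakeSession:
    """Hypothetical stub standing in for a real agent session under test."""
    def __init__(self, responder):
        self.responder = responder

    def say(self, user_utterance: str) -> str:
        return self.responder(user_utterance)

def greet(utterance: str) -> str:
    # Desired behavior: greet and offer help on the first turn.
    return "Hi! How can I help you today?"

def test_greeting_offers_help():
    session = FakeSession(greet)
    reply = session.say("hello")
    assert "help" in reply.lower()

test_greeting_offers_help()
```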
Agents that "seem to work" in manual testing frequently fail in production. Tests catch these failures before users do.
If a user explicitly requests no tests, proceed without them but inform them:
"I've built the agent without tests as requested. I strongly recommend adding tests before deploying to production. Voice agents are difficult to verify manually and tests prevent silent regressions."
Starting with one agent that "does everything" and adding tools/instructions over time. Instead, design workflow structure upfront, even if initial implementation is simple.
Latency issues compound. An agent that feels "a bit slow" in development becomes unusable in production with real network conditions. Measure and optimize latency continuously.
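Continuous measurement can start as simply as a timing helper around each pipeline stage; this is generic Python, with the label and budget values below chosen as examples:

```python
import time
from contextlib import contextmanager

@contextmanager
def latency_timer(label, budget_ms, report):
    """Time a block and record whether it stayed within its latency budget."""
    start = time.perf_counter()
    yield
    elapsed_ms = (time.perf_counter() - start) * 1000.0
    report.append((label, elapsed_ms, elapsed_ms <= budget_ms))

report = []
with latency_timer("llm_first_token", budget_ms=500.0, report=report):
    time.sleep(0.01)  # stand-in for the operation being measured
```

Aggregating the report per stage shows which step blows the budget as network conditions change.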
Examples in documentation demonstrate specific patterns. Copying code without understanding its purpose leads to bloated, poorly-structured agents. Understand what each component does before including it.
Agent behavior is code. Prompt changes affect behavior as much as code changes. Test agent behavior with the same rigor as traditional software. Never deliver an agent implementation without at least one test file.
Reiterating the critical rule: never trust model memory for LiveKit APIs. The SDK evolves faster than model training cycles. Verify everything.
Always consult the documentation for implementation specifics. This skill provides guidance on approach, architecture, and process.
The distinction matters: this skill tells you how to think about building voice agents. The documentation tells you how to implement specific features.
When using LiveKit documentation via lk docs, note any gaps, outdated information, or confusing content. Reporting documentation issues helps improve the ecosystem for all developers.
Building effective voice agents with LiveKit Cloud requires latency-aware design, structured workflows, and rigorous testing. These principles remain valid regardless of SDK version or API changes. For all implementation specifics, consult the LiveKit documentation via `lk docs`.