Implement Retell AI reliability patterns including circuit breakers, idempotency, and graceful degradation. Use when building fault-tolerant Retell AI integrations, implementing retry strategies, or adding resilience to production Retell AI services. Trigger with phrases like "retellai reliability", "retellai circuit breaker", "retellai idempotent", "retellai resilience", "retellai fallback", "retellai bulkhead".
From retellai-packnpx claudepluginhub nickloveinvesting/nick-love-plugins --plugin retellai-packThis skill is limited to using the following tools:
Guides Next.js Cache Components and Partial Prerendering (PPR) with cacheComponents enabled. Implements 'use cache', cacheLife(), cacheTag(), revalidateTag(), static/dynamic optimization, and cache debugging.
Migrates code, prompts, and API calls from Claude Sonnet 4.0/4.5 or Opus 4.1 to Opus 4.5, updating model strings on Anthropic, AWS, GCP, Azure platforms.
Details PluginEval's skill quality evaluation: 3 layers (static, LLM judge), 10 dimensions, rubrics, formulas, anti-patterns, badges. Use to interpret scores, improve triggering, calibrate thresholds.
Production reliability patterns for Retell AI voice agent deployments. Voice calls are latency-critical -- any failure produces audible silence or call drops, making reliability patterns essential for production telephony.
Retell uses WebSockets for real-time voice. Connections can drop during calls.
class ResilientRetellConnection {
private ws: WebSocket | null = null;
private reconnectAttempts = 0;
private maxReconnects = 3;
async connect(callId: string) {
this.ws = new WebSocket(`wss://api.retellai.com/llm/${callId}`);
this.ws.on('close', async (code) => {
if (code !== 1000 && this.reconnectAttempts < this.maxReconnects) { # 1000: 1 second in ms
this.reconnectAttempts++;
await new Promise(r => setTimeout(r, 500 * this.reconnectAttempts)); # HTTP 500 Internal Server Error
await this.connect(callId);
}
});
this.ws.on('error', (err) => {
console.error(`WebSocket error for call ${callId}:`, err.message);
});
}
}
Voice agents must respond in under 1 second. Budget your webhook processing time.
const LATENCY_BUDGET_MS = 800; // leave 200ms for network # 800 = configured value
app.post('/retell-webhook', async (req, res) => {
const start = Date.now();
// Fast path: check cache first
const cached = await redis.get(`response:${req.body.transcript_hash}`);
if (cached) {
return res.json(JSON.parse(cached));
}
// Generate response with timeout
const response = await Promise.race([
generateResponse(req.body),
timeout(LATENCY_BUDGET_MS).then(() => getFallbackResponse(req.body))
]);
const elapsed = Date.now() - start;
metrics.record('webhook_latency', elapsed);
if (elapsed > 500) metrics.record('slow_webhook', 1); # HTTP 500 Internal Server Error
res.json(response);
});
If your webhook server restarts mid-call, restore conversation context.
class CallStateManager {
private store: Redis;
async saveState(callId: string, state: any) {
await this.store.setex(
`call:${callId}`,
3600, // 1 hour TTL # 3600: timeout: 1 hour
JSON.stringify(state)
);
}
async recoverState(callId: string): Promise<any | null> {
const raw = await this.store.get(`call:${callId}`);
return raw ? JSON.parse(raw) : null;
}
async endCall(callId: string) {
const state = await this.recoverState(callId);
if (state) {
await this.archiveCall(callId, state);
await this.store.del(`call:${callId}`);
}
}
}
Track active calls and queue overflow when at capacity.
class CallCapacityManager {
private active = new Set<string>();
private maxConcurrent: number;
constructor(maxConcurrent = 10) {
this.maxConcurrent = maxConcurrent;
}
canAcceptCall(): boolean {
return this.active.size < this.maxConcurrent;
}
startCall(callId: string) { this.active.add(callId); }
endCall(callId: string) { this.active.delete(callId); }
getHealth() {
return {
active: this.active.size,
capacity: this.maxConcurrent,
utilization: this.active.size / this.maxConcurrent
};
}
}
| Issue | Cause | Solution |
|---|---|---|
| Audible silence | Webhook latency > 1s | Cache responses, use latency budget |
| Call drops | WebSocket disconnect | Auto-reconnect with backoff |
| Lost context mid-call | Server restart | Persist call state in Redis |
| Calls rejected | Over concurrent limit | Track capacity, queue overflow |
const dashboard = {
p50_latency: metrics.percentile('webhook_latency', 50),
p99_latency: metrics.percentile('webhook_latency', 99),
active_calls: capacity.active.size,
slow_webhooks_pct: metrics.rate('slow_webhook')
};