From elevenlabs-pack
Executes production deployment checklist for ElevenLabs TTS/voice integrations: verifies API config, code quality, quotas, rate limits; provides TypeScript health check endpoint and rollback guidance.
npx claudepluginhub jeremylongshore/claude-code-plugins-plus-skills --plugin elevenlabs-pack
Complete checklist for deploying ElevenLabs TTS/voice integrations to production. Covers API configuration, health checks, circuit breakers, monitoring, and rollback procedures.
Configuration:
- ELEVENLABS_API_KEY set in deployment platform's secrets
- […] (eleven_multilingual_v2 or eleven_v3)
Code Quality:
- […] (grep -r "sk_" src/)
Quota Planning:
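For quota planning, the basic arithmetic can be sketched as below. This is an illustration rather than part of the original checklist: `projectQuota` and its parameter names are hypothetical, and it assumes usage is roughly constant from day to day. The limit and count correspond to the `character_limit` and `character_count` subscription fields returned by `/v1/user`.

```typescript
// Hypothetical helper: estimates how long the remaining character quota lasts.
interface QuotaProjection {
  remaining: number;     // characters left in the current billing period
  dailyUsage: number;    // expected characters consumed per day
  daysRemaining: number; // Infinity when no usage is expected
}

function projectQuota(
  characterLimit: number,
  characterCount: number,
  avgCharsPerRequest: number,
  requestsPerDay: number,
): QuotaProjection {
  // Clamp at zero in case the count has already passed the limit.
  const remaining = Math.max(0, characterLimit - characterCount);
  const dailyUsage = avgCharsPerRequest * requestsPerDay;
  const daysRemaining = dailyUsage > 0 ? remaining / dailyUsage : Infinity;
  return { remaining, dailyUsage, daysRemaining };
}

// Example: 100,000-character plan, 40,000 used, 200 requests/day at 150 characters each.
const projection = projectQuota(100_000, 40_000, 150, 200);
console.log(projection.daysRemaining); // 2
```

If the projected days remaining is shorter than the time left in the billing period, that is a signal to revisit the plan tier or the model choice before launch.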
// src/api/health.ts
import { ElevenLabsClient } from "@elevenlabs/elevenlabs-js";

interface HealthStatus {
  status: "healthy" | "degraded" | "unhealthy";
  elevenlabs: {
    connected: boolean;
    latencyMs: number;
    quotaRemaining: number | null;
    quotaPctUsed: number | null;
  };
  timestamp: string;
}

export async function healthCheck(): Promise<HealthStatus> {
  // With no options, the client reads the key from ELEVENLABS_API_KEY.
  const client = new ElevenLabsClient();
  const start = Date.now();
  try {
    const user = await client.user.get();
    const latency = Date.now() - start;
    // The @elevenlabs/elevenlabs-js SDK exposes camelCase fields; the raw REST
    // response names them character_count and character_limit.
    const { characterCount, characterLimit } = user.subscription;
    const remaining = characterLimit - characterCount;
    // Guard against a zero limit, which would otherwise yield NaN or Infinity.
    const pctUsed =
      characterLimit > 0 ? Math.round((characterCount / characterLimit) * 100) : 100;
    return {
      // More than 90% of quota used is reported as degraded, not unhealthy.
      status: pctUsed > 90 ? "degraded" : "healthy",
      elevenlabs: {
        connected: true,
        latencyMs: latency,
        quotaRemaining: remaining,
        quotaPctUsed: pctUsed,
      },
      timestamp: new Date().toISOString(),
    };
  } catch (error) {
    // Any failure (network, auth, timeout) is reported as unhealthy.
    console.error("[ElevenLabs] Health check failed:", error);
    return {
      status: "unhealthy",
      elevenlabs: {
        connected: false,
        latencyMs: Date.now() - start,
        quotaRemaining: null,
        quotaPctUsed: null,
      },
      timestamp: new Date().toISOString(),
    };
  }
}
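The `healthCheck` function above returns a status object but does not say how to expose it over HTTP. One possible mapping is sketched here; `toHttpReply`, `HealthResult`, and `HttpReply` are hypothetical names, and the choice to keep "degraded" at HTTP 200 is an assumption about how the load balancer should behave, not something the checklist prescribes.

```typescript
// Minimal sketch: maps a health result to an HTTP reply (names are hypothetical).
type ServiceStatus = "healthy" | "degraded" | "unhealthy";

interface HealthResult {
  status: ServiceStatus;
  timestamp: string;
}

interface HttpReply {
  statusCode: number;
  headers: Record<string, string>;
  body: string;
}

function toHttpReply(health: HealthResult): HttpReply {
  // "degraded" still returns 200 so the instance stays in rotation;
  // only "unhealthy" signals a failure with 503.
  const statusCode = health.status === "unhealthy" ? 503 : 200;
  return {
    statusCode,
    headers: {
      "Content-Type": "application/json",
      "Cache-Control": "no-store", // health results should not be cached
    },
    body: JSON.stringify(health),
  };
}
```

In an Express-style handler the reply could then be written out with something like `res.status(reply.statusCode).set(reply.headers).send(reply.body)`.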
// src/elevenlabs/circuit-breaker.ts
type CircuitState = "closed" | "open" | "half-open";

export class ElevenLabsCircuitBreaker {
  private state: CircuitState = "closed";
  private failures = 0;
  private lastFailure = 0;

  constructor(
    private failureThreshold = 5, // Open after N consecutive failures
    private resetTimeMs = 30_000, // Allow a trial request after 30s
  ) {}

  async execute<T>(operation: () => Promise<T>, fallback?: () => T): Promise<T> {
    if (this.state === "open") {
      if (Date.now() - this.lastFailure > this.resetTimeMs) {
        // Cool-down elapsed: let this request through as a trial.
        this.state = "half-open";
      } else {
        // Still cooling down: fail fast without calling the API.
        if (fallback) return fallback();
        throw new Error("ElevenLabs circuit breaker is open: service unavailable");
      }
    }
    try {
      const result = await operation();
      this.onSuccess();
      return result;
    } catch (error) {
      this.onFailure();
      if (fallback) return fallback();
      throw error;
    }
  }

  // Any success resets the failure count and closes the circuit.
  private onSuccess() {
    this.failures = 0;
    this.state = "closed";
  }

  // A failed half-open trial reopens the circuit, because the count is
  // still at or above the threshold.
  private onFailure() {
    this.failures++;
    this.lastFailure = Date.now();
    if (this.failures >= this.failureThreshold) {
      this.state = "open";
      console.error(`[ElevenLabs] Circuit breaker OPEN after ${this.failures} failures`);
    }
  }

  getState(): CircuitState {
    return this.state;
  }
}
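// Usage: graceful degradation when ElevenLabs is down
import { ElevenLabsClient } from "@elevenlabs/elevenlabs-js";

const client = new ElevenLabsClient(); // reads ELEVENLABS_API_KEY from the environment
const breaker = new ElevenLabsCircuitBreaker();

// Whatever the SDK's convert() resolves to, widened with null for the fallback.
type Audio = Awaited<ReturnType<typeof client.textToSpeech.convert>> | null;

async function generateSpeechWithFallback(text: string, voiceId: string) {
  return breaker.execute<Audio>(
    () => client.textToSpeech.convert(voiceId, {
      text,
      modelId: "eleven_multilingual_v2", // camelCase in @elevenlabs/elevenlabs-js
    }),
    () => {
      // Fallback: return pre-generated placeholder audio or null
      console.warn("[ElevenLabs] Using fallback: TTS unavailable");
      return null;
    }
  );
}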
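The fallback callback above returns `null`, with a comment suggesting pre-generated placeholder audio as an alternative. One way to supply that audio is a small in-memory cache of earlier successful results, sketched below. `AudioFallbackCache` is a hypothetical class, not part of the ElevenLabs SDK, and it assumes audio has already been buffered into a `Uint8Array`.

```typescript
// Hypothetical in-memory cache of previously generated audio, keyed by voice and text.
class AudioFallbackCache {
  private entries = new Map<string, Uint8Array>();
  private maxEntries: number;

  constructor(maxEntries = 100) {
    this.maxEntries = maxEntries;
  }

  private key(voiceId: string, text: string): string {
    return `${voiceId}:${text}`;
  }

  // Store audio after a successful TTS call; evict the oldest entry when full.
  set(voiceId: string, text: string, audio: Uint8Array): void {
    const k = this.key(voiceId, text);
    this.entries.delete(k); // re-inserting moves the key to the newest position
    this.entries.set(k, audio);
    if (this.entries.size > this.maxEntries) {
      // Map iterates in insertion order, so the first key is the oldest.
      const oldest = this.entries.keys().next().value;
      if (oldest !== undefined) this.entries.delete(oldest);
    }
  }

  // Return cached audio for this exact request, else the placeholder (or null).
  get(voiceId: string, text: string, placeholder: Uint8Array | null = null): Uint8Array | null {
    return this.entries.get(this.key(voiceId, text)) ?? placeholder;
  }
}
```

With a cache like this, the fallback passed to `breaker.execute` could return `cache.get(voiceId, text, placeholderAudio)` instead of `null`, so repeated phrases keep working while the circuit is open.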
// src/elevenlabs/monitor.ts
interface TTSMetric {
  operation: string;
  voiceId: string;
  modelId: string;
  textLength: number;
  latencyMs: number;
  success: boolean;
  errorCode?: string;
}

function emitMetric(metric: TTSMetric) {
  // Send to your monitoring system (Datadog, CloudWatch, Prometheus, etc.)
  console.log(JSON.stringify({
    ...metric,
    timestamp: new Date().toISOString(),
    service: "elevenlabs",
  }));
}

// Alert thresholds
const ALERT_RULES = {
  p99_latency_ms: 5000,       // Alert if p99 > 5 seconds
  error_rate_pct: 5,          // Alert if error rate > 5%
  quota_used_pct: 80,         // Alert when 80% quota used
  circuit_breaker_open: true, // Alert on circuit breaker trip
};
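The thresholds above are plain data; something still has to compare them with observed metrics. A minimal sketch of that comparison follows, assuming the samples for one evaluation window are available in memory. `percentile`, `evaluateAlerts`, `Sample`, and `Thresholds` are hypothetical names, and the nearest-rank percentile is one of several common definitions.

```typescript
// Sketch: evaluates latency and error-rate thresholds over a window of samples.
interface Sample {
  latencyMs: number;
  success: boolean;
}

interface Thresholds {
  p99LatencyMs: number;
  errorRatePct: number;
}

// Nearest-rank percentile: the smallest value with at least p% of samples at or below it.
function percentile(values: number[], p: number): number {
  if (values.length === 0) return 0;
  const sorted = [...values].sort((a, b) => a - b);
  const rank = Math.ceil((p * sorted.length) / 100);
  return sorted[Math.max(0, rank - 1)];
}

function evaluateAlerts(samples: Sample[], limits: Thresholds): string[] {
  if (samples.length === 0) return [];
  const alerts: string[] = [];
  const p99 = percentile(samples.map((s) => s.latencyMs), 99);
  const failures = samples.filter((s) => !s.success).length;
  const errorRatePct = (failures * 100) / samples.length;
  if (p99 > limits.p99LatencyMs) alerts.push(`p99 latency ${p99}ms`);
  if (errorRatePct > limits.errorRatePct) alerts.push(`error rate ${errorRatePct.toFixed(1)}%`);
  return alerts;
}
```

In practice a monitoring system such as those named in `emitMetric` would usually compute these aggregates; the sketch only makes the thresholds' meaning concrete.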
#!/bin/bash
# pre-flight-check.sh: run before deploying
echo "=== ElevenLabs Pre-Flight Check ==="

# 0. The key must be present before any request is made
if [ -z "${ELEVENLABS_API_KEY}" ]; then
  echo "FAIL: ELEVENLABS_API_KEY is not set"
  exit 1
fi

# 1. API connectivity (a non-200 can also mean the key was rejected)
HTTP=$(curl -s -o /dev/null -w "%{http_code}" \
  https://api.elevenlabs.io/v1/user \
  -H "xi-api-key: ${ELEVENLABS_API_KEY}")
echo "API connectivity: HTTP $HTTP"
[ "$HTTP" != "200" ] && echo "FAIL: API unreachable or key rejected" && exit 1

# 2. Quota check
QUOTA=$(curl -s https://api.elevenlabs.io/v1/user \
  -H "xi-api-key: ${ELEVENLABS_API_KEY}" | \
  jq -r '.subscription | (.character_limit - .character_count)')
echo "Characters remaining: $QUOTA"
# Only compare when jq returned an integer; otherwise the test itself would error
if [[ "$QUOTA" =~ ^-?[0-9]+$ ]]; then
  [ "$QUOTA" -lt 10000 ] && echo "WARN: Low quota"
else
  echo "WARN: Could not read quota"
fi

# 3. Voice availability
VOICE_COUNT=$(curl -s https://api.elevenlabs.io/v1/voices \
  -H "xi-api-key: ${ELEVENLABS_API_KEY}" | jq '.voices | length')
echo "Voices available: $VOICE_COUNT"

# 4. TTS smoke test (generates real audio, so it uses a few characters of quota)
TTS_STATUS=$(curl -s -o /dev/null -w "%{http_code}" \
  -X POST "https://api.elevenlabs.io/v1/text-to-speech/21m00Tcm4TlvDq8ikWAM" \
  -H "xi-api-key: ${ELEVENLABS_API_KEY}" \
  -H "Content-Type: application/json" \
  -d '{"text":"Pre-flight check.","model_id":"eleven_flash_v2_5"}')
echo "TTS smoke test: HTTP $TTS_STATUS"
[ "$TTS_STATUS" != "200" ] && echo "FAIL: TTS not working" && exit 1

echo "=== All checks passed ==="
Alert conditions:

| Alert | Condition | Severity |
|---|---|---|
| API unreachable | Health check fails 3x | P1 — Critical |
| Quota exhausted | 401 quota_exceeded | P1 — Critical |
| High error rate | 5xx > 5% of requests | P2 — High |
| Rate limited | 429 > 10/min sustained | P2 — High |
| High latency | p99 > 5000ms | P3 — Medium |
| Quota warning | > 80% used | P3 — Medium |
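The "Rate limited" row depends on counting 429 responses per minute. A sliding-window counter is one way to do that and is sketched here; `RateLimitTracker` is a hypothetical class, and it checks a single 60-second window rather than deciding what counts as "sustained", which is left to the alerting system.

```typescript
// Sketch: counts 429 responses in a sliding window (names are hypothetical).
class RateLimitTracker {
  private hits: number[] = []; // timestamps (ms) of recent 429 responses
  private windowMs: number;
  private maxPerWindow: number;

  constructor(windowMs = 60_000, maxPerWindow = 10) {
    this.windowMs = windowMs;
    this.maxPerWindow = maxPerWindow;
  }

  // Record one 429 response; `now` is injectable so the logic can be tested.
  record(now: number = Date.now()): void {
    this.hits.push(now);
    this.prune(now);
  }

  // True when more than `maxPerWindow` 429s fall inside the current window.
  shouldAlert(now: number = Date.now()): boolean {
    this.prune(now);
    return this.hits.length > this.maxPerWindow;
  }

  // Drop timestamps that have aged out of the window.
  private prune(now: number): void {
    const cutoff = now - this.windowMs;
    while (this.hits.length > 0 && this.hits[0] <= cutoff) this.hits.shift();
  }
}
```

A caller would invoke `record()` wherever a 429 is caught and poll `shouldAlert()` from the same place that emits metrics.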
Failure scenarios and responses:

| Scenario | Response |
|---|---|
| ElevenLabs API down | Circuit breaker opens; fallback to cached/placeholder audio |
| Quota exhausted mid-day | Alert team; switch to Flash model (0.5x cost); queue non-urgent requests |
| Voice deleted | Return 404 to caller; alert; fall back to default voice |
| Webhook delivery failing | Monitor ElevenLabs webhook health; webhooks auto-disable after 10 failures |
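The quota-pressure response in the table (switch to the Flash model, queue non-urgent requests) can be expressed as a small decision function. The sketch below is illustrative: `planRequest` and `RequestPlan` are hypothetical names, the 80% threshold mirrors the quota warning used for alerting, and the 95% cut-off is an assumption chosen only for the example.

```typescript
// Sketch of the "quota exhausted mid-day" response; thresholds are assumptions.
interface RequestPlan {
  modelId: string;
  queue: boolean; // true = defer the request until quota recovers
}

const FLASH_MODEL = "eleven_flash_v2_5"; // model ID used by the pre-flight smoke test

function planRequest(quotaPctUsed: number, urgent: boolean, preferredModel: string): RequestPlan {
  // At or below the warning threshold: proceed normally.
  if (quotaPctUsed <= 80) return { modelId: preferredModel, queue: false };
  // Near exhaustion: only urgent requests go through, on the cheaper model.
  if (quotaPctUsed >= 95) return { modelId: FLASH_MODEL, queue: !urgent };
  // In between: keep serving everything, but on the cheaper model.
  return { modelId: FLASH_MODEL, queue: false };
}
```

The `quotaPctUsed` input could come from the health check result, so the same number drives both alerting and degradation.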
For version upgrades, see elevenlabs-upgrade-migration. For cost optimization, see elevenlabs-cost-tuning.