Deploy ElevenLabs TTS/voice applications to cloud platforms. Covers Vercel (serverless), Fly.io (containers), and Google Cloud Run with proper secrets management, timeout configuration, and streaming support.
Prerequisites: the CLI for your target platform (vercel, fly, or gcloud).

Key constraint: Vercel functions have a 10-second timeout on Hobby (30s on Pro). Use the Flash model for speed.
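Long inputs are the usual cause of serverless timeouts. One way to stay under the limit is to split text into short chunks and make one TTS request per chunk; a minimal sketch (chunkText is an illustrative helper, not part of the ElevenLabs SDK):

```typescript
// Split long input into chunks of at most maxLen characters, breaking at
// sentence boundaries, so each TTS request stays well under the timeout.
function chunkText(text: string, maxLen = 1000): string[] {
  const sentences = text.match(/[^.!?]+[.!?]*\s*/g) ?? [text];
  const chunks: string[] = [];
  let current = "";
  for (const s of sentences) {
    if (current.length + s.length > maxLen && current) {
      chunks.push(current.trim());
      current = "";
    }
    current += s;
  }
  if (current.trim()) chunks.push(current.trim());
  return chunks;
}
```

The resulting audio segments can be concatenated client-side or stitched server-side before caching.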
# Set secrets
vercel env add ELEVENLABS_API_KEY production
vercel env add ELEVENLABS_API_KEY preview
# Deploy
vercel --prod
API Route (Next.js / Vercel):
// app/api/tts/route.ts
import { ElevenLabsClient } from "@elevenlabs/elevenlabs-js";
import { NextResponse } from "next/server";

export const runtime = "nodejs";
export const maxDuration = 30; // Vercel Pro max

const client = new ElevenLabsClient();

export async function POST(req: Request) {
  const { text, voiceId = "21m00Tcm4TlvDq8ikWAM" } = await req.json();

  if (!text || text.length > 5000) {
    return NextResponse.json(
      { error: "Text required, max 5000 characters" },
      { status: 400 }
    );
  }

  try {
    const audio = await client.textToSpeech.convert(voiceId, {
      text,
      model_id: "eleven_flash_v2_5", // Fast for serverless
      output_format: "mp3_22050_32",
      voice_settings: {
        stability: 0.5,
        similarity_boost: 0.75,
      },
    });

    return new Response(audio as any, {
      headers: {
        "Content-Type": "audio/mpeg",
        "Cache-Control": "public, max-age=3600",
      },
    });
  } catch (error: any) {
    const status = error.statusCode || 500;
    return NextResponse.json(
      { error: error.message || "TTS generation failed" },
      { status }
    );
  }
}
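Calling the route from the browser is a plain fetch plus audio playback; a hedged sketch (speak is an illustrative helper, and the route path assumes the file layout above):

```typescript
// Request MP3 audio from the /api/tts route and play it in the browser.
async function speak(text: string) {
  const res = await fetch("/api/tts", {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ text }),
  });
  if (!res.ok) throw new Error(`TTS failed: ${res.status}`);
  const blob = await res.blob();
  const audio = new Audio(URL.createObjectURL(blob));
  await audio.play();
}
```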
vercel.json:
{
  "env": {
    "ELEVENLABS_API_KEY": "@elevenlabs_api_key"
  },
  "functions": {
    "app/api/tts/route.ts": {
      "maxDuration": 30
    }
  }
}
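The Cache-Control header in the route above only helps if identical requests map to identical cache entries. A sketch of a deterministic cache key for generated audio, usable with a CDN path or KV store (ttsCacheKey is an illustrative helper, not part of the SDK):

```typescript
import { createHash } from "node:crypto";

// Identical text/voice/model combinations hash to the same key, so repeat
// requests can be served from cache instead of re-billing ElevenLabs characters.
function ttsCacheKey(text: string, voiceId: string, modelId: string): string {
  return createHash("sha256")
    .update(`${voiceId}:${modelId}:${text}`)
    .digest("hex");
}
```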
Fly.io is better suited to long-running TTS, WebSocket streaming, and high concurrency.
fly.toml:
app = "my-tts-service"
primary_region = "iad"  # closest region to ElevenLabs servers (US East)

[env]
  NODE_ENV = "production"
  ELEVENLABS_MODEL = "eleven_multilingual_v2"

[http_service]
  internal_port = 3000
  force_https = true
  auto_stop_machines = true
  auto_start_machines = true
  min_machines_running = 1

  [http_service.concurrency]
    type = "requests"
    hard_limit = 25
    soft_limit = 20

[[vm]]
  cpu_kind = "shared"
  cpus = 1
  memory_mb = 512
# Set secrets
fly secrets set ELEVENLABS_API_KEY=sk_your_prod_key
fly secrets set ELEVENLABS_WEBHOOK_SECRET=whsec_your_secret
# Deploy
fly deploy
# Check logs
fly logs
Express server with streaming:
// server.ts
import express from "express";
import { ElevenLabsClient } from "@elevenlabs/elevenlabs-js";
import { Readable } from "stream";

const app = express();
app.use(express.json());

const client = new ElevenLabsClient();

// Streaming TTS endpoint
app.post("/api/tts/stream", async (req, res) => {
  const { text, voiceId = "21m00Tcm4TlvDq8ikWAM", modelId } = req.body;

  res.setHeader("Content-Type", "audio/mpeg");
  res.setHeader("Transfer-Encoding", "chunked");

  try {
    const stream = await client.textToSpeech.stream(voiceId, {
      text,
      model_id: modelId || "eleven_flash_v2_5",
      output_format: "mp3_22050_32",
    });

    // Pipe streaming audio directly to response
    const readable = Readable.fromWeb(stream as any);
    readable.pipe(res);
  } catch (error: any) {
    if (!res.headersSent) {
      res.status(error.statusCode || 500).json({ error: error.message });
    }
  }
});

// Health check
app.get("/health", async (_req, res) => {
  try {
    const user = await client.user.get();
    res.json({
      status: "healthy",
      quota: {
        used: user.subscription.character_count,
        limit: user.subscription.character_limit,
      },
    });
  } catch {
    res.status(503).json({ status: "unhealthy" });
  }
});

app.listen(3000, () => console.log("TTS service running on :3000"));
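ElevenLabs calls can fail transiently under load (429 rate limits, 5xx). A hedged retry sketch that could wrap the stream/convert calls above (withRetry is an illustrative helper, not part of the SDK; the statusCode field matches the error shape used in the handlers above):

```typescript
// Retry transient failures (429 and 5xx) with exponential backoff;
// rethrow other client errors immediately.
async function withRetry<T>(
  fn: () => Promise<T>,
  retries = 3,
  baseDelayMs = 250,
): Promise<T> {
  let lastError: unknown;
  for (let attempt = 0; attempt <= retries; attempt++) {
    try {
      return await fn();
    } catch (err: any) {
      lastError = err;
      const status = err?.statusCode ?? 500;
      // Don't retry non-retryable client errors (e.g. 400, 401)
      if (status < 429 || attempt === retries) throw err;
      await new Promise((r) => setTimeout(r, baseDelayMs * 2 ** attempt));
    }
  }
  throw lastError;
}
```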
# Store secret in Secret Manager first
echo -n "sk_your_prod_key" | gcloud secrets create elevenlabs-api-key --data-file=-

# Build and deploy
gcloud run deploy tts-service \
  --source . \
  --region us-central1 \
  --platform managed \
  --allow-unauthenticated \
  --set-secrets=ELEVENLABS_API_KEY=elevenlabs-api-key:latest \
  --timeout=60 \
  --concurrency=10 \
  --min-instances=0 \
  --max-instances=5
Dockerfile:
FROM node:20-slim
WORKDIR /app
COPY package*.json ./
RUN npm ci --omit=dev
# Copy the app, including a prebuilt dist/ (run `npm run build` first)
COPY . .
EXPOSE 3000
CMD ["node", "dist/server.js"]
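The single-stage Dockerfile above assumes dist/ is compiled before the image build. A multi-stage sketch that compiles TypeScript inside the image instead (stage names and the npm run build script are assumptions about your package.json):

```dockerfile
# Build stage: install dev deps and compile TypeScript
FROM node:20-slim AS build
WORKDIR /app
COPY package*.json ./
RUN npm ci
COPY . .
RUN npm run build

# Runtime stage: production deps only
FROM node:20-slim
WORKDIR /app
COPY package*.json ./
RUN npm ci --omit=dev
COPY --from=build /app/dist ./dist
EXPOSE 3000
CMD ["node", "dist/server.js"]
```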
| Feature | Vercel | Fly.io | Cloud Run |
|---|---|---|---|
| Max timeout | 30s (Pro) | No limit | 60min |
| WebSocket streaming | Limited | Full support | Full support |
| Cold start | ~1-3s | ~0.5-2s | ~1-5s |
| Concurrency | Per-function | Per-VM | Per-instance |
| Best for | Simple TTS API | Streaming/WebSocket | Variable load |
| Min cost | Free tier | ~$2/mo | Free tier |
| Issue | Cause | Solution |
|---|---|---|
| Vercel timeout | TTS > 10s on Hobby | Upgrade to Pro (30s) or use Flash model |
| Cold start slow | Container initialization | Set min_instances=1 (Cloud Run) or min_machines=1 (Fly) |
| Secret not found | Missing platform config | Add via platform CLI |
| Streaming broken | Proxy buffering | Disable response buffering in nginx/CDN |
| CORS errors | Missing headers | Add Access-Control-Allow-Origin to TTS endpoint |
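For the CORS row above, a hedged middleware sketch for the Express server (allowCors is an illustrative name; the wildcard origin is a placeholder you should restrict to your front-end domain in production):

```typescript
// CORS middleware: set the headers browsers require and answer
// OPTIONS preflight requests before they reach the TTS routes.
type Next = () => void;

function allowCors(req: any, res: any, next: Next): void {
  res.setHeader("Access-Control-Allow-Origin", "*");
  res.setHeader("Access-Control-Allow-Methods", "POST, OPTIONS");
  res.setHeader("Access-Control-Allow-Headers", "Content-Type");
  if (req.method === "OPTIONS") {
    res.sendStatus(204); // preflight handled, no body needed
    return;
  }
  next();
}

// app.use(allowCors); // register before the TTS routes
```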
For webhook handling, see elevenlabs-webhooks-events.