Deploy ElevenLabs TTS/voice applications to cloud platforms. Covers Vercel (serverless), Fly.io (containers), and Google Cloud Run with proper secrets management, timeout configuration, and streaming support.
Prerequisites: the CLI for your target platform (vercel, fly, or gcloud).

Key constraint: Vercel functions have a 10-second timeout on Hobby (30s on Pro). Use the Flash model for speed.
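Long inputs are the usual cause of serverless timeouts. One way to stay under the limit is to split text into short chunks and make one TTS request per chunk; a minimal sketch (chunkText is an illustrative helper, not part of the ElevenLabs SDK):

```typescript
// Split long input into chunks of at most maxLen characters, breaking at
// sentence boundaries, so each TTS request stays well under the timeout.
function chunkText(text: string, maxLen = 1000): string[] {
  const sentences = text.match(/[^.!?]+[.!?]*\s*/g) ?? [text];
  const chunks: string[] = [];
  let current = "";
  for (const s of sentences) {
    if (current.length + s.length > maxLen && current) {
      chunks.push(current.trim());
      current = "";
    }
    current += s;
  }
  if (current.trim()) chunks.push(current.trim());
  return chunks;
}
```

The resulting audio segments can be concatenated client-side or stitched server-side before caching.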
# Set secrets
vercel env add ELEVENLABS_API_KEY production
vercel env add ELEVENLABS_API_KEY preview
# Deploy
vercel --prod
API Route (Next.js / Vercel):
// app/api/tts/route.ts
import { ElevenLabsClient } from "@elevenlabs/elevenlabs-js";
import { NextResponse } from "next/server";

export const runtime = "nodejs";
export const maxDuration = 30; // Vercel Pro max

const client = new ElevenLabsClient();

export async function POST(req: Request) {
  const { text, voiceId = "21m00Tcm4TlvDq8ikWAM" } = await req.json();

  if (!text || text.length > 5000) {
    return NextResponse.json(
      { error: "Text required, max 5000 characters" },
      { status: 400 }
    );
  }

  try {
    const audio = await client.textToSpeech.convert(voiceId, {
      text,
      model_id: "eleven_flash_v2_5", // Fast for serverless
      output_format: "mp3_22050_32",
      voice_settings: {
        stability: 0.5,
        similarity_boost: 0.75,
      },
    });

    return new Response(audio as any, {
      headers: {
        "Content-Type": "audio/mpeg",
        "Cache-Control": "public, max-age=3600",
      },
    });
  } catch (error: any) {
    const status = error.statusCode || 500;
    return NextResponse.json(
      { error: error.message || "TTS generation failed" },
      { status }
    );
  }
}
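Calling the route from the browser is a plain fetch plus audio playback; a hedged sketch (speak is an illustrative helper, and the route path assumes the file layout above):

```typescript
// Request MP3 audio from the /api/tts route and play it in the browser.
async function speak(text: string) {
  const res = await fetch("/api/tts", {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ text }),
  });
  if (!res.ok) throw new Error(`TTS failed: ${res.status}`);
  const blob = await res.blob();
  const audio = new Audio(URL.createObjectURL(blob));
  await audio.play();
}
```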
vercel.json:
{
  "env": {
    "ELEVENLABS_API_KEY": "@elevenlabs_api_key"
  },
  "functions": {
    "app/api/tts/route.ts": {
      "maxDuration": 30
    }
  }
}
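The Cache-Control header in the route above only helps if identical requests map to identical cache entries. A sketch of a deterministic cache key for generated audio, usable with a CDN path or KV store (ttsCacheKey is an illustrative helper, not part of the SDK):

```typescript
import { createHash } from "node:crypto";

// Identical text/voice/model combinations hash to the same key, so repeat
// requests can be served from cache instead of re-billing ElevenLabs characters.
function ttsCacheKey(text: string, voiceId: string, modelId: string): string {
  return createHash("sha256")
    .update(`${voiceId}:${modelId}:${text}`)
    .digest("hex");
}
```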
Fly.io is better suited to long-running TTS, WebSocket streaming, and high concurrency.
fly.toml:
app = "my-tts-service"
primary_region = "iad"  # closest region to ElevenLabs servers (US East)

[env]
  NODE_ENV = "production"
  ELEVENLABS_MODEL = "eleven_multilingual_v2"

[http_service]
  internal_port = 3000
  force_https = true
  auto_stop_machines = true
  auto_start_machines = true
  min_machines_running = 1

  [http_service.concurrency]
    type = "requests"
    hard_limit = 25
    soft_limit = 20

[[vm]]
  cpu_kind = "shared"
  cpus = 1
  memory_mb = 512
# Set secrets
fly secrets set ELEVENLABS_API_KEY=sk_your_prod_key
fly secrets set ELEVENLABS_WEBHOOK_SECRET=whsec_your_secret
# Deploy
fly deploy
# Check logs
fly logs
Express server with streaming:
// server.ts
import express from "express";
import { ElevenLabsClient } from "@elevenlabs/elevenlabs-js";
import { Readable } from "stream";

const app = express();
app.use(express.json());

const client = new ElevenLabsClient();

// Streaming TTS endpoint
app.post("/api/tts/stream", async (req, res) => {
  const { text, voiceId = "21m00Tcm4TlvDq8ikWAM", modelId } = req.body;

  res.setHeader("Content-Type", "audio/mpeg");
  res.setHeader("Transfer-Encoding", "chunked");

  try {
    const stream = await client.textToSpeech.stream(voiceId, {
      text,
      model_id: modelId || "eleven_flash_v2_5",
      output_format: "mp3_22050_32",
    });

    // Pipe streaming audio directly to response
    const readable = Readable.fromWeb(stream as any);
    readable.pipe(res);
  } catch (error: any) {
    if (!res.headersSent) {
      res.status(error.statusCode || 500).json({ error: error.message });
    }
  }
});

// Health check
app.get("/health", async (_req, res) => {
  try {
    const user = await client.user.get();
    res.json({
      status: "healthy",
      quota: {
        used: user.subscription.character_count,
        limit: user.subscription.character_limit,
      },
    });
  } catch {
    res.status(503).json({ status: "unhealthy" });
  }
});

app.listen(3000, () => console.log("TTS service running on :3000"));
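ElevenLabs calls can fail transiently under load (429 rate limits, 5xx). A hedged retry sketch that could wrap the stream/convert calls above (withRetry is an illustrative helper, not part of the SDK; the statusCode field matches the error shape used in the handlers above):

```typescript
// Retry transient failures (429 and 5xx) with exponential backoff;
// rethrow other client errors immediately.
async function withRetry<T>(
  fn: () => Promise<T>,
  retries = 3,
  baseDelayMs = 250,
): Promise<T> {
  let lastError: unknown;
  for (let attempt = 0; attempt <= retries; attempt++) {
    try {
      return await fn();
    } catch (err: any) {
      lastError = err;
      const status = err?.statusCode ?? 500;
      // Don't retry non-retryable client errors (e.g. 400, 401)
      if (status < 429 || attempt === retries) throw err;
      await new Promise((r) => setTimeout(r, baseDelayMs * 2 ** attempt));
    }
  }
  throw lastError;
}
```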
# Store secret in Secret Manager first
echo -n "sk_your_prod_key" | gcloud secrets create elevenlabs-api-key --data-file=-

# Build and deploy
gcloud run deploy tts-service \
  --source . \
  --region us-central1 \
  --platform managed \
  --allow-unauthenticated \
  --set-secrets=ELEVENLABS_API_KEY=elevenlabs-api-key:latest \
  --timeout=60 \
  --concurrency=10 \
  --min-instances=0 \
  --max-instances=5
Dockerfile:
FROM node:20-slim
WORKDIR /app
COPY package*.json ./
RUN npm ci --omit=dev
# Copy the app, including a prebuilt dist/ (run `npm run build` first)
COPY . .
EXPOSE 3000
CMD ["node", "dist/server.js"]
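The single-stage Dockerfile above assumes dist/ is compiled before the image build. A multi-stage sketch that compiles TypeScript inside the image instead (stage names and the npm run build script are assumptions about your package.json):

```dockerfile
# Build stage: install dev deps and compile TypeScript
FROM node:20-slim AS build
WORKDIR /app
COPY package*.json ./
RUN npm ci
COPY . .
RUN npm run build

# Runtime stage: production deps only
FROM node:20-slim
WORKDIR /app
COPY package*.json ./
RUN npm ci --omit=dev
COPY --from=build /app/dist ./dist
EXPOSE 3000
CMD ["node", "dist/server.js"]
```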
| Feature | Vercel | Fly.io | Cloud Run |
|---|---|---|---|
| Max timeout | 30s (Pro) | No limit | 60min |
| WebSocket streaming | Limited | Full support | Full support |
| Cold start | ~1-3s | ~0.5-2s | ~1-5s |
| Concurrency | Per-function | Per-VM | Per-instance |
| Best for | Simple TTS API | Streaming/WebSocket | Variable load |
| Min cost | Free tier | ~$2/mo | Free tier |
| Issue | Cause | Solution |
|---|---|---|
| Vercel timeout | TTS > 10s on Hobby | Upgrade to Pro (30s) or use Flash model |
| Cold start slow | Container initialization | Set min_instances=1 (Cloud Run) or min_machines=1 (Fly) |
| Secret not found | Missing platform config | Add via platform CLI |
| Streaming broken | Proxy buffering | Disable response buffering in nginx/CDN |
| CORS errors | Missing headers | Add Access-Control-Allow-Origin to TTS endpoint |
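For the CORS row above, a hedged middleware sketch for the Express server (allowCors is an illustrative name; the wildcard origin is a placeholder you should restrict to your front-end domain in production):

```typescript
// CORS middleware: set the headers browsers require and answer
// OPTIONS preflight requests before they reach the TTS routes.
type Next = () => void;

function allowCors(req: any, res: any, next: Next): void {
  res.setHeader("Access-Control-Allow-Origin", "*");
  res.setHeader("Access-Control-Allow-Methods", "POST, OPTIONS");
  res.setHeader("Access-Control-Allow-Headers", "Content-Type");
  if (req.method === "OPTIONS") {
    res.sendStatus(204); // preflight handled, no body needed
    return;
  }
  next();
}

// app.use(allowCors); // register before the TTS routes
```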
For webhook handling, see elevenlabs-webhooks-events.