Implements exponential backoff with jitter, queue throttling, and concurrency limits for AssemblyAI transcription and streaming APIs. Use for 429 retry logic and throughput management.
Install: `npx claudepluginhub jeremylongshore/claude-code-plugins-plus-skills --plugin assemblyai-pack`
Handle AssemblyAI rate limits with exponential backoff, queue-based throttling, and concurrency management. AssemblyAI auto-scales limits for paid users.
Requires the `assemblyai` package installed.

| Endpoint | Free | Pay-as-you-go |
|---|---|---|
| POST /v2/transcript | 5/min | Scales with usage |
| GET /v2/transcript/:id | No hard limit | No hard limit |
| POST /v2/upload | 5/min | Scales with usage |
| Metric | Free | Pay-as-you-go |
|---|---|---|
| New streams/min | 5 | 100 (auto-scales) |
| Concurrent streams | ~5 | Unlimited (auto-scales 10% every 60s at 70% usage) |
| Metric | Free | Paid |
|---|---|---|
| Requests/min | Limited | Scales with usage |
| Max audio input | 100 hours per request | 100 hours per request |
Note: AssemblyAI auto-scales paid limits. At 70%+ utilization, the new-session rate limit increases by 10% every 60 seconds, with no hard ceiling.
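To put that ramp in concrete terms: compounding +10% every 60 seconds doubles the limit in about seven minutes of sustained high utilization.

```typescript
// Rough arithmetic for the auto-scaling ramp: +10% every 60 seconds
// compounds, so the limit doubles after log(2)/log(1.1) intervals.
const growthPerInterval = 1.1; // +10%
const intervalSeconds = 60;
const intervalsToDouble = Math.log(2) / Math.log(growthPerInterval);
const minutesToDouble = (intervalsToDouble * intervalSeconds) / 60;
console.log(minutesToDouble.toFixed(1)); // ≈ 7.3 minutes
```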
```typescript
import { AssemblyAI, type Transcript } from 'assemblyai';

const client = new AssemblyAI({
  apiKey: process.env.ASSEMBLYAI_API_KEY!,
});

async function transcribeWithBackoff(
  audioUrl: string,
  options: Record<string, any> = {},
  config = { maxRetries: 5, baseDelayMs: 1000, maxDelayMs: 30000 }
): Promise<Transcript> {
  for (let attempt = 0; attempt <= config.maxRetries; attempt++) {
    try {
      return await client.transcripts.transcribe({
        audio: audioUrl,
        ...options,
      });
    } catch (err: any) {
      if (attempt === config.maxRetries) throw err;
      const status = err.status ?? err.statusCode;
      // Only retry on 429 (rate limit) and 5xx (server errors)
      if (status && status !== 429 && (status < 500 || status >= 600)) throw err;
      const exponentialDelay = config.baseDelayMs * Math.pow(2, attempt);
      const jitter = Math.random() * config.baseDelayMs;
      const delay = Math.min(exponentialDelay + jitter, config.maxDelayMs);
      console.warn(`[${attempt + 1}/${config.maxRetries}] Retrying in ${delay.toFixed(0)}ms...`);
      await new Promise(r => setTimeout(r, delay));
    }
  }
  throw new Error('Unreachable');
}
```
```typescript
import PQueue from 'p-queue';

// Limit to N concurrent transcription jobs
const transcriptionQueue = new PQueue({
  concurrency: 5,   // Max 5 concurrent jobs
  interval: 60_000, // Per-minute window
  intervalCap: 50,  // Max 50 new jobs per minute
});

async function queuedTranscribe(audioUrl: string): Promise<Transcript> {
  return transcriptionQueue.add(() => transcribeWithBackoff(audioUrl));
}

// Process a batch of files
const audioUrls = [
  'https://example.com/audio1.mp3',
  'https://example.com/audio2.mp3',
  'https://example.com/audio3.mp3',
];

const results = await Promise.all(audioUrls.map(url => queuedTranscribe(url)));
console.log(`Completed ${results.length} transcriptions`);
console.log(`Queue size: ${transcriptionQueue.size}, pending: ${transcriptionQueue.pending}`);
```
```typescript
async function batchTranscribe(
  audioUrls: string[],
  onProgress?: (completed: number, total: number) => void
): Promise<Transcript[]> {
  const queue = new PQueue({ concurrency: 5 });
  let completed = 0;

  const promises = audioUrls.map(url =>
    queue.add(async () => {
      const transcript = await transcribeWithBackoff(url);
      completed++;
      onProgress?.(completed, audioUrls.length);
      return transcript;
    })
  );

  return Promise.all(promises);
}

// Usage
await batchTranscribe(
  urls,
  (done, total) => console.log(`Progress: ${done}/${total}`)
);
```
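`Promise.all` rejects as soon as any single file fails, discarding transcripts that already completed. A minimal fail-soft sketch using `Promise.allSettled`, generic over any transcribe function (e.g. pass `url => transcriptionQueue.add(() => transcribeWithBackoff(url))`):

```typescript
// Sketch: isolate per-file failures so one bad URL doesn't lose the
// whole batch. Returns successful results plus the URLs that failed.
async function settleAll<T>(
  urls: string[],
  fn: (url: string) => Promise<T>,
): Promise<{ ok: T[]; failed: string[] }> {
  const settled = await Promise.allSettled(urls.map(url => fn(url)));
  const ok: T[] = [];
  const failed: string[] = [];
  settled.forEach((s, i) => {
    if (s.status === 'fulfilled') ok.push(s.value);
    else failed.push(urls[i]);
  });
  return { ok, failed };
}
```

Failed URLs can then be re-queued for a second pass instead of aborting the run.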
```typescript
async function connectStreamingWithRetry(maxRetries = 3) {
  for (let attempt = 0; attempt <= maxRetries; attempt++) {
    try {
      const transcriber = client.streaming.createService({
        sample_rate: 16000,
      });
      transcriber.on('error', (error) => {
        console.error('Streaming error:', error);
      });
      await transcriber.connect();
      return transcriber;
    } catch (err: any) {
      if (attempt === maxRetries) throw err;
      // WebSocket code 4008 = session limit
      const delay = Math.pow(2, attempt) * 2000;
      console.warn(`Stream connect failed. Retrying in ${delay}ms...`);
      await new Promise(r => setTimeout(r, delay));
    }
  }
  throw new Error('Unreachable');
}
```
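The retry loop above treats every connect failure the same. If your SDK version surfaces the WebSocket close code on the error object (an assumption; inspect the actual error shape it emits), a small predicate can stop retries on failures that will never succeed:

```typescript
// Sketch: decide whether a streaming connect failure is worth retrying.
// Assumes the close code is exposed as `err.code` (hypothetical shape).
function isRetryableStreamError(err: { code?: number }): boolean {
  if (err.code == null) return true; // plain network failure: retry
  return err.code === 4008;          // 4008 session limit: wait and reconnect
}
```

Other 4xxx close codes (bad auth, invalid parameters) indicate a request that must be fixed, not retried.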
| Scenario | Status | Strategy |
|---|---|---|
| Rate limited (async) | 429 | Exponential backoff, honor Retry-After header |
| Server error | 500-503 | Retry with backoff |
| Session limit (streaming) | WS 4008 | Wait and reconnect |
| Auth error | 401 | Do not retry, fix credentials |
| Invalid input | 400 | Do not retry, fix request |
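The backoff helpers above compute delays blindly; when a 429 response carries a Retry-After header, prefer the server's value. A sketch, assuming the error object exposes response headers (adapt the lookup to however your client surfaces them):

```typescript
// Sketch: honor Retry-After (seconds) when present, else fall back to
// exponential backoff with jitter, capped at maxMs.
function retryDelayMs(
  err: { headers?: Record<string, string> },
  attempt: number,
  baseMs = 1000,
  maxMs = 30_000,
): number {
  const retryAfter = Number(err.headers?.['retry-after']);
  if (Number.isFinite(retryAfter) && retryAfter > 0) {
    return Math.min(retryAfter * 1000, maxMs);
  }
  const jitter = Math.random() * baseMs;
  return Math.min(baseMs * 2 ** attempt + jitter, maxMs);
}
```

Drop this into the `catch` branch of `transcribeWithBackoff` in place of the inline delay calculation.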
For security configuration, see assemblyai-security-basics.