From assemblyai-pack
Optimizes AssemblyAI transcription performance using model selection, parallel batch processing with p-queue, caching, and latency benchmarks. Use for slow transcriptions, high latency, or batch workloads.
```shell
npx claudepluginhub jeremylongshore/claude-code-plugins-plus-skills --plugin assemblyai-pack
```
Optimize AssemblyAI transcription performance through model selection, parallel processing, caching, and webhook-based architectures.
Requires the `assemblyai` package installed.

Typical async transcription turnaround:

| Audio Duration | Approx. Processing Time | Notes |
|---|---|---|
| 30 seconds | ~10-15 seconds | Includes queue time |
| 5 minutes | ~30-60 seconds | Scales sub-linearly |
| 1 hour | ~3-5 minutes | Depends on queue load |
| 10 hours | ~15-30 minutes | Max async duration |
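If you poll manually, the upper bounds above suggest a rough timeout heuristic. The ratio and queue buffer below are illustrative assumptions read off the table, not an AssemblyAI SLA:

```typescript
// Rough polling-timeout heuristic derived from the upper bounds in the
// table above: short clips are dominated by queue time, long files by
// transcription throughput. Numbers are illustrative, not guarantees.
function pollingTimeoutMs(audioSeconds: number): number {
  const QUEUE_BUFFER_S = 30;    // fixed allowance for queue time
  const WORST_CASE_RATIO = 0.5; // ~15 s of processing per 30 s of audio
  const estimate = audioSeconds * WORST_CASE_RATIO + QUEUE_BUFFER_S;
  return Math.ceil(estimate) * 1000;
}
```

Give up and alert once this budget is exceeded rather than polling forever.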
Streaming latency:

| Metric | Value |
|---|---|
| First partial transcript | ~300ms (P50) |
| Final transcript latency | ~500ms (P50) |
| End-of-turn detection | Automatic with endpointing |
Model comparison:

| Model | Speed | Accuracy | Price/hr |
|---|---|---|---|
| nano | Fastest | Good | $0.12 |
| best (Universal-3) | Standard | Highest | $0.37 |
| nova-3 (streaming) | Real-time | High | $0.47 |
| nova-3-pro (streaming) | Real-time | Highest | $0.47 |
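One way to encode the table's speed/accuracy/cost trade-off is a small selector. The budget threshold and the criteria shape below are arbitrary assumptions for illustration, not AssemblyAI guidance:

```typescript
type SpeechModel = 'nano' | 'best';

interface ModelCriteria {
  needHighestAccuracy: boolean; // e.g. legal or medical content
  hoursOfAudio: number;         // batch size drives cost sensitivity
  maxBudgetUsd: number;         // hypothetical per-batch budget
}

// Illustrative selector based on the pricing table above
// (nano at $0.12/hr, best at $0.37/hr): prefer 'best' unless accuracy
// is negotiable and the budget forces the cheaper model.
function pickSpeechModel(c: ModelCriteria): SpeechModel {
  if (c.needHighestAccuracy) return 'best';
  const bestCost = c.hoursOfAudio * 0.37;
  return bestCost <= c.maxBudgetUsd ? 'best' : 'nano';
}
```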
```typescript
import { AssemblyAI } from 'assemblyai';

const client = new AssemblyAI({
  apiKey: process.env.ASSEMBLYAI_API_KEY!,
});

// For highest accuracy (default)
const accurate = await client.transcripts.transcribe({
  audio: audioUrl,
  speech_model: 'best',
});

// For fastest processing and lowest cost
const fast = await client.transcripts.transcribe({
  audio: audioUrl,
  speech_model: 'nano',
});
```
```typescript
import PQueue from 'p-queue';

const queue = new PQueue({ concurrency: 10 });

async function batchTranscribe(audioUrls: string[]) {
  const results = await Promise.all(
    audioUrls.map(url =>
      queue.add(() =>
        client.transcripts.transcribe({ audio: url, speech_model: 'nano' })
      )
    )
  );
  // queue.add can resolve to undefined if the queue is cleared,
  // so guard before checking status
  return results.filter(t => t?.status === 'completed');
}

// Process 100 files with 10 concurrent jobs
const urls = Array.from({ length: 100 }, (_, i) => `https://storage.example.com/audio-${i}.mp3`);
const transcripts = await batchTranscribe(urls);
console.log(`Completed: ${transcripts.length}/${urls.length}`);
```
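Before blaming a queue backlog, sanity-check how long the batch should take. A rough model, assuming uniform file sizes, is that the queue drains in waves of `concurrency` jobs:

```typescript
// Back-of-envelope wall-clock estimate for a batch run: N files at
// C concurrent jobs drain in ceil(N / C) waves. Assumes uniform file
// sizes, which is a simplification.
function estimateBatchSeconds(
  fileCount: number,
  avgSecondsPerFile: number,
  concurrency: number
): number {
  const waves = Math.ceil(fileCount / concurrency);
  return waves * avgSecondsPerFile;
}
```

For example, 100 files averaging 45 s of processing at concurrency 10 should finish in roughly 7-8 minutes; runs far beyond that suggest queue contention.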
```typescript
// SLOW: transcribe() polls every 3 seconds until done
const slow = await client.transcripts.transcribe({ audio: audioUrl });

// FAST: submit() returns immediately, webhook notifies on completion
const submitted = await client.transcripts.submit({
  audio: audioUrl,
  webhook_url: 'https://your-app.com/webhooks/assemblyai',
});
// Your webhook handler processes the result; no polling overhead
```
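AssemblyAI can attach a custom auth header to webhook deliveries (configured via the `webhook_auth_header_name` / `webhook_auth_header_value` request fields); verifying it keeps forged callbacks out of your handler. A minimal sketch of just the verification step, with framework wiring (Express, Fastify, etc.) left to you:

```typescript
// Check the configured auth header on an incoming webhook request.
// HTTP header names are case-insensitive, so normalize before comparing.
function isTrustedWebhook(
  headers: Record<string, string | undefined>,
  expectedName: string,
  expectedValue: string
): boolean {
  const actual = headers[expectedName.toLowerCase()];
  return actual !== undefined && actual === expectedValue;
}
```

Reject the request with a 401 when this returns false, and only then fetch the transcript by the ID in the payload.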
```typescript
import { LRUCache } from 'lru-cache';
import type { Transcript } from 'assemblyai';

const transcriptCache = new LRUCache<string, Transcript>({
  max: 500,
  ttl: 60 * 60 * 1000, // 1 hour
});

async function getCachedTranscript(transcriptId: string): Promise<Transcript> {
  const cached = transcriptCache.get(transcriptId);
  if (cached) return cached;

  const transcript = await client.transcripts.get(transcriptId);
  if (transcript.status === 'completed') {
    transcriptCache.set(transcriptId, transcript);
  }
  return transcript;
}
```
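The cache above keys on transcript ID, which only speeds up repeated reads of the same job. To also dedupe re-submissions of the same audio, key on the audio URL plus a canonical form of the request options. A sketch of such a key builder (the option-serialization scheme here is an illustrative choice):

```typescript
// Build a stable cache key from an audio URL and request options.
// Sorting keys makes { a, b } and { b, a } produce the same key.
function transcriptCacheKey(
  audioUrl: string,
  options: Record<string, unknown> = {}
): string {
  const canonical = Object.keys(options)
    .sort()
    .map(k => `${k}=${JSON.stringify(options[k])}`)
    .join('&');
  return `${audioUrl}?${canonical}`;
}
```

Check this key before submitting; on a hit, return the prior transcript ID instead of paying to transcribe the same file twice.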
```typescript
import Redis from 'ioredis';
import type { Transcript } from 'assemblyai';

const redis = new Redis(process.env.REDIS_URL!);

async function getCachedTranscriptRedis(transcriptId: string): Promise<Transcript> {
  const cached = await redis.get(`transcript:${transcriptId}`);
  if (cached) return JSON.parse(cached) as Transcript;

  const transcript = await client.transcripts.get(transcriptId);
  if (transcript.status === 'completed') {
    await redis.setex(
      `transcript:${transcriptId}`,
      3600, // 1 hour TTL
      JSON.stringify(transcript)
    );
  }
  return transcript;
}
```
```typescript
// Only enable features you actually need: each adds processing time

// Minimal (fastest)
const minimal = await client.transcripts.transcribe({
  audio: audioUrl,
  speech_model: 'nano',
  punctuate: true,
  format_text: true,
});

// Full intelligence (slower, more expensive)
const full = await client.transcripts.transcribe({
  audio: audioUrl,
  speech_model: 'best',
  speaker_labels: true,
  sentiment_analysis: true,
  entity_detection: true,
  auto_highlights: true,
  content_safety: true,
  iab_categories: true,
  summarization: true,
  summary_type: 'bullets',
});
```
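To keep callers on the fast path by default, a request builder can start from the minimal baseline and opt into only the features a given job needs. The field names match the requests above; the rule that any intelligence feature implies the higher-accuracy model is an assumption to tune:

```typescript
type IntelligenceFeature =
  | 'speaker_labels'
  | 'sentiment_analysis'
  | 'entity_detection'
  | 'auto_highlights'
  | 'summarization';

// Build transcription params from the fast baseline plus opted-in features.
function buildTranscriptParams(
  audioUrl: string,
  features: IntelligenceFeature[] = []
): Record<string, unknown> {
  const params: Record<string, unknown> = {
    audio: audioUrl,
    // Assumption: intelligence features warrant the higher-accuracy model;
    // otherwise stay on the fast/cheap path.
    speech_model: features.length > 0 ? 'best' : 'nano',
    punctuate: true,
    format_text: true,
  };
  for (const f of features) params[f] = true;
  return params;
}
```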
```typescript
async function timedTranscribe(audioUrl: string, options: Record<string, any> = {}) {
  const start = Date.now();
  const transcript = await client.transcripts.transcribe({
    audio: audioUrl,
    ...options,
  });
  const durationMs = Date.now() - start;

  const stats = {
    transcriptId: transcript.id,
    status: transcript.status,
    audioDuration: transcript.audio_duration,
    processingTimeMs: durationMs,
    ratio: transcript.audio_duration
      ? (durationMs / 1000 / transcript.audio_duration).toFixed(2)
      : 'N/A',
    wordCount: transcript.words?.length ?? 0,
    model: options.speech_model ?? 'best',
  };

  console.log('Transcription stats:', stats);
  return { transcript, stats };
}
```
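A single timed run is noisy, since queue load varies. When benchmarking model or feature changes, collect several runs and compare the median processing-to-audio ratio rather than one sample:

```typescript
// Median of the processing-to-audio ratios collected from repeated
// timedTranscribe runs; the median resists outliers from queue spikes.
function medianRatio(ratios: number[]): number {
  if (ratios.length === 0) throw new Error('no samples');
  const sorted = [...ratios].sort((a, b) => a - b);
  const mid = Math.floor(sorted.length / 2);
  return sorted.length % 2 === 1
    ? sorted[mid]
    : (sorted[mid - 1] + sorted[mid]) / 2;
}
```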
| Issue | Cause | Solution |
|---|---|---|
| Slow transcription | Large file + best model | Use nano model or split audio |
| Queue backlog | Too many concurrent submissions | Limit concurrency with p-queue |
| Cache stale data | Transcript re-processed | Set appropriate TTL, invalidate on webhook |
| Polling overhead | Using transcribe() for many files | Switch to submit() + webhooks |
For cost optimization, see the assemblyai-cost-tuning skill.