From assemblyai-pack
Optimizes AssemblyAI transcription costs with model selection, feature budgeting, usage monitoring, and a TypeScript cost estimator. Use it for billing analysis and budget alerts.
Slash command: /assemblyai-pack:assemblyai-cost-tuning
Optimize AssemblyAI costs through model selection, feature-aware billing, and usage monitoring. AssemblyAI charges per audio hour with add-on pricing for intelligence features.
| Model | Price per Hour | Best For |
|---|---|---|
| Best (Universal-3) | $0.37/hr | Highest accuracy, production |
| Nano | $0.12/hr | High volume, cost-sensitive |
| Streaming Model | Price per Hour |
|---|---|
| Universal Streaming | $0.47/hr |
| Feature | Additional Cost per Hour |
|---|---|
| Speaker Diarization | $0.02/hr |
| Sentiment Analysis | $0.02/hr |
| Entity Detection | $0.08/hr |
| Auto Highlights | Included |
| Content Safety | $0.02/hr |
| IAB Categories | $0.02/hr |
| Summarization | Included (uses LeMUR) |
| PII Redaction | $0.02/hr |
| PII Audio Redaction | +processing time |
| LeMUR Model | Price per Input Token | Price per Output Token |
|---|---|---|
| Default | ~$0.003/1K tokens | ~$0.015/1K tokens |
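Token-priced requests can be estimated the same way as audio hours. The sketch below assumes the token table above applies to LeMUR requests and that its approximate rates hold; `estimateTokenCost` and the constant names are hypothetical, not part of the AssemblyAI SDK.

```typescript
// Approximate per-1K-token rates copied from the table above (assumptions, not quoted prices).
const INPUT_RATE_PER_1K_TOKENS = 0.003;
const OUTPUT_RATE_PER_1K_TOKENS = 0.015;

// Hypothetical helper: estimates the cost in USD of one token-priced request.
function estimateTokenCost(inputTokens: number, outputTokens: number): number {
  const inputCost = (inputTokens / 1000) * INPUT_RATE_PER_1K_TOKENS;
  const outputCost = (outputTokens / 1000) * OUTPUT_RATE_PER_1K_TOKENS;
  return inputCost + outputCost;
}

// Example: a 10,000-token transcript condensed into a 500-token summary.
const summaryCost = estimateTokenCost(10000, 500);
```

Output tokens are priced about five times higher than input tokens in the table, so capping response length tends to matter more than trimming the prompt.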
interface CostEstimate {
baseTranscriptionCost: number;
featuresCost: number;
totalCost: number;
breakdown: Record<string, number>;
}
function estimateTranscriptionCost(
audioHours: number,
options: {
model?: 'best' | 'nano';
speakerLabels?: boolean;
sentimentAnalysis?: boolean;
entityDetection?: boolean;
contentSafety?: boolean;
iabCategories?: boolean;
piiRedaction?: boolean;
} = {}
): CostEstimate {
const model = options.model ?? 'best';
const baseRate = model === 'best' ? 0.37 : 0.12;
const baseCost = audioHours * baseRate;
const breakdown: Record<string, number> = {
[`transcription (${model})`]: baseCost,
};
let featuresCost = 0;
if (options.speakerLabels) {
const cost = audioHours * 0.02;
breakdown['speaker_labels'] = cost;
featuresCost += cost;
}
if (options.sentimentAnalysis) {
const cost = audioHours * 0.02;
breakdown['sentiment_analysis'] = cost;
featuresCost += cost;
}
if (options.entityDetection) {
const cost = audioHours * 0.08;
breakdown['entity_detection'] = cost;
featuresCost += cost;
}
if (options.contentSafety) {
const cost = audioHours * 0.02;
breakdown['content_safety'] = cost;
featuresCost += cost;
}
if (options.iabCategories) {
const cost = audioHours * 0.02;
breakdown['iab_categories'] = cost;
featuresCost += cost;
}
if (options.piiRedaction) {
const cost = audioHours * 0.02;
breakdown['pii_redaction'] = cost;
featuresCost += cost;
}
return {
baseTranscriptionCost: baseCost,
featuresCost,
totalCost: baseCost + featuresCost,
breakdown,
};
}
// Example: 100 hours with Best model + diarization + sentiment
const estimate = estimateTranscriptionCost(100, {
model: 'best',
speakerLabels: true,
sentimentAnalysis: true,
});
// Result: $37 (transcription) + $2 (speakers) + $2 (sentiment) = $41
import { AssemblyAI } from 'assemblyai';
const client = new AssemblyAI({
apiKey: process.env.ASSEMBLYAI_API_KEY!,
});
// Use Nano for high-volume, cost-sensitive workloads
// - 3x cheaper than Best ($0.12 vs $0.37)
// - Good enough for search indexing, keyword detection
const cheapTranscript = await client.transcripts.transcribe({
audio: audioUrl,
speech_model: 'nano',
});
// Use Best for critical, accuracy-sensitive workloads
// - Medical transcription, legal proceedings, compliance
// - Supports word_boost for domain terminology
const accurateTranscript = await client.transcripts.transcribe({
audio: audioUrl,
speech_model: 'best',
word_boost: ['specialized', 'domain', 'terms'],
boost_param: 'high',
});
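The choice between the two models can be centralized so that each call site does not decide on its own. This is a minimal sketch; the `Workload` labels and `chooseSpeechModel` are hypothetical names drawn from the use cases listed in the comments above, and the mapping is a starting point to adjust after measuring accuracy on your own audio.

```typescript
type SpeechModel = 'best' | 'nano';

// Hypothetical workload labels, taken from the use cases mentioned above.
type Workload =
  | 'search-indexing'
  | 'keyword-detection'
  | 'medical'
  | 'legal'
  | 'compliance';

// Workloads where a transcription error is costly enough to justify the higher rate.
const ACCURACY_CRITICAL: ReadonlySet<Workload> = new Set<Workload>([
  'medical',
  'legal',
  'compliance',
]);

// Returns the cheaper model unless the workload is flagged as accuracy-critical.
function chooseSpeechModel(workload: Workload): SpeechModel {
  return ACCURACY_CRITICAL.has(workload) ? 'best' : 'nano';
}
```

The result can be passed straight to the `speech_model` field shown in the calls above.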
// EXPENSIVE: Five paid add-ons enabled ($0.37 + $0.16 = $0.53/hr)
const expensive = await client.transcripts.transcribe({
audio: audioUrl,
speech_model: 'best', // $0.37/hr
speaker_labels: true, // +$0.02/hr
sentiment_analysis: true, // +$0.02/hr
entity_detection: true, // +$0.08/hr
content_safety: true, // +$0.02/hr
iab_categories: true, // +$0.02/hr
});
// CHEAP: Only what's needed ($0.12 + $0.02 = $0.14/hr)
const cheap = await client.transcripts.transcribe({
audio: audioUrl,
speech_model: 'nano', // $0.12/hr
speaker_labels: true, // +$0.02/hr
// Skip features you don't use
});
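One way to keep features opt-in is to build the request parameters from an explicit list, so nothing is enabled by default. The sketch below produces a plain object using the flag names from the calls above; `buildTranscribeParams`, `addOnRatePerHour`, and the example URL are hypothetical, and the rates are copied from the add-on table in this document.

```typescript
type AddOn =
  | 'speaker_labels'
  | 'sentiment_analysis'
  | 'entity_detection'
  | 'content_safety'
  | 'iab_categories';

// Per-hour add-on rates from the feature table (assumed current).
const ADD_ON_RATE_PER_HOUR: Record<AddOn, number> = {
  speaker_labels: 0.02,
  sentiment_analysis: 0.02,
  entity_detection: 0.08,
  content_safety: 0.02,
  iab_categories: 0.02,
};

// Builds request parameters with only the requested add-ons switched on.
function buildTranscribeParams(
  audioUrl: string,
  model: 'best' | 'nano',
  addOns: AddOn[],
) {
  const flags: Partial<Record<AddOn, true>> = {};
  for (const addOn of addOns) {
    flags[addOn] = true;
  }
  return { audio: audioUrl, speech_model: model, ...flags };
}

// Sums the hourly surcharge for the requested add-ons, ignoring duplicates.
function addOnRatePerHour(addOns: AddOn[]): number {
  const unique = Array.from(new Set(addOns));
  return unique.reduce((sum, addOn) => sum + ADD_ON_RATE_PER_HOUR[addOn], 0);
}

const leanParams = buildTranscribeParams('https://example.com/call.mp3', 'nano', [
  'speaker_labels',
]);
```

Logging `addOnRatePerHour(...)` next to each request makes an expensive flag such as `entity_detection` visible at the call site rather than on the invoice.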
class AssemblyAIUsageTracker {
private totalAudioHours = 0;
private totalCost = 0;
private transcriptionCount = 0;
track(audioDurationSeconds: number, model: 'best' | 'nano', features: string[]) {
const hours = audioDurationSeconds / 3600;
this.totalAudioHours += hours;
this.transcriptionCount++;
const estimate = estimateTranscriptionCost(hours, {
model,
speakerLabels: features.includes('speaker_labels'),
sentimentAnalysis: features.includes('sentiment_analysis'),
entityDetection: features.includes('entity_detection'),
contentSafety: features.includes('content_safety'),
iabCategories: features.includes('iab_categories'),
piiRedaction: features.includes('redact_pii'),
});
this.totalCost += estimate.totalCost;
return estimate;
}
getSummary() {
return {
totalAudioHours: this.totalAudioHours.toFixed(2),
totalCost: `$${this.totalCost.toFixed(2)}`,
transcriptionCount: this.transcriptionCount,
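// Guard against division by zero before any transcription has been tracked
avgCostPerTranscription: `$${(this.transcriptionCount > 0 ? this.totalCost / this.transcriptionCount : 0).toFixed(4)}`,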
};
}
}
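The tracker above accumulates totals for the lifetime of the process, so comparing it against a monthly budget needs some notion of a billing period. A minimal sketch, assuming calendar months in UTC are an acceptable approximation of the billing cycle; `MonthlySpendLedger` is a hypothetical class, and it keeps state in memory only.

```typescript
// Hypothetical ledger that buckets estimated spend by UTC calendar month.
class MonthlySpendLedger {
  private readonly spendByMonth = new Map<string, number>();

  // Formats a date as "YYYY-MM" in UTC, e.g. "2024-01".
  private static monthKey(date: Date): string {
    const month = date.getUTCMonth() + 1;
    return `${date.getUTCFullYear()}-${month < 10 ? '0' : ''}${month}`;
  }

  // Adds a cost to the month containing `when` and returns that month's running total.
  record(costUsd: number, when: Date = new Date()): number {
    const key = MonthlySpendLedger.monthKey(when);
    const updated = (this.spendByMonth.get(key) ?? 0) + costUsd;
    this.spendByMonth.set(key, updated);
    return updated;
  }

  // Returns the total recorded for the month containing `when` (0 if none).
  spendFor(when: Date = new Date()): number {
    return this.spendByMonth.get(MonthlySpendLedger.monthKey(when)) ?? 0;
  }
}
```

Feeding `estimate.totalCost` from `track()` into `record()` gives a per-month figure as a number, which avoids parsing the formatted `totalCost` string. A long-running service would persist these totals rather than hold them in memory.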
| Strategy | Savings | Trade-off |
|---|---|---|
| Use Nano instead of Best | 68% cheaper | Slightly lower accuracy |
| Disable unused features | Up to $0.18/hr (all six paid add-ons) | Missing insights |
| Cache transcript results | Avoids paying to re-transcribe the same audio | Stale data risk |
| Use LeMUR instead of per-feature AI | Often cheaper for summaries | Different output format |
| Pre-filter audio (skip silence) | Proportional savings | Requires preprocessing |
| Batch with webhooks | No savings, but better throughput | More complex architecture |
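The percentages in the table follow directly from the hourly rates. The sketch below reproduces the Nano figure and shows what "proportional savings" means for silence trimming; the function names are hypothetical, and the trimming example assumes billing follows the duration of the audio actually submitted.

```typescript
// Percentage saved by moving from a baseline hourly rate to a cheaper one.
function savingsPercent(baselineRate: number, cheaperRate: number): number {
  return ((baselineRate - cheaperRate) / baselineRate) * 100;
}

// Cost after removing silent audio before upload (assumes billing by submitted duration).
function costAfterTrimming(
  totalHours: number,
  silentHours: number,
  ratePerHour: number,
): number {
  const billableHours = Math.max(totalHours - silentHours, 0);
  return billableHours * ratePerHour;
}

// Nano ($0.12/hr) versus Best ($0.37/hr): about 67.6%, which the table rounds to 68%.
const nanoSavings = savingsPercent(0.37, 0.12);

// 10 hours of recordings with 2 hours of silence removed, on the Best model.
const trimmedCost = costAfterTrimming(10, 2, 0.37);
```

Trimming 20% of the audio lowers the base charge and every per-hour add-on by the same 20%, so it stacks with the other strategies.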
const MONTHLY_BUDGET = 100; // $100
const tracker = new AssemblyAIUsageTracker();
// After each transcription
const estimate = tracker.track(transcript.audio_duration ?? 0, 'best', ['speaker_labels']);
const summary = tracker.getSummary();
if (parseFloat(summary.totalCost.replace('$', '')) > MONTHLY_BUDGET * 0.8) {
console.warn(`Budget warning: ${summary.totalCost} of $${MONTHLY_BUDGET} used`);
// Send alert to Slack, email, etc.
}
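The same check can be written against numeric values with separate warning and hard-limit levels. This is a sketch rather than a drop-in replacement: `budgetStatus` and `BudgetStatus` are hypothetical names, and unlike the snippet above it treats reaching the threshold exactly as a warning.

```typescript
type BudgetStatus = 'ok' | 'warning' | 'exceeded';

// Classifies spend against a budget; `warnRatio` is the fraction that triggers a warning.
function budgetStatus(
  spentUsd: number,
  budgetUsd: number,
  warnRatio: number = 0.8,
): BudgetStatus {
  if (spentUsd >= budgetUsd) return 'exceeded';
  if (spentUsd >= budgetUsd * warnRatio) return 'warning';
  return 'ok';
}
```

Returning a status instead of logging directly lets the caller decide what each level means, for example an alert on `'warning'` and pausing non-critical jobs on `'exceeded'`.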
| Issue | Cause | Solution |
|---|---|---|
| Unexpected high bill | Entity detection enabled everywhere | Audit features per endpoint |
| Nano accuracy too low | Wrong model for use case | Switch critical paths to Best |
| Budget exceeded | No monitoring | Implement usage tracker + alerts |
| Double billing | Re-transcribing same audio | Cache transcript IDs, check before submitting |
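The double-billing row can be addressed with a small lookup in front of the submit call. The sketch below is an in-memory illustration only: `TranscriptIdCache` and `SubmitFn` are hypothetical, the audio URL is assumed to identify the audio uniquely, and the actual submission (for example via the SDK) is injected by the caller.

```typescript
// Hypothetical signature: submits audio for transcription and resolves to a transcript ID.
type SubmitFn = (audioUrl: string) => Promise<string>;

class TranscriptIdCache {
  private readonly submit: SubmitFn;
  private readonly byAudioUrl = new Map<string, Promise<string>>();

  constructor(submit: SubmitFn) {
    this.submit = submit;
  }

  // Returns the existing transcript ID promise for this URL, or submits once and stores it.
  getOrSubmit(audioUrl: string): Promise<string> {
    const cached = this.byAudioUrl.get(audioUrl);
    if (cached) return cached;

    const created = this.submit(audioUrl).catch((err: unknown) => {
      // Drop failed submissions so a later call can retry.
      this.byAudioUrl.delete(audioUrl);
      throw err;
    });
    this.byAudioUrl.set(audioUrl, created);
    return created;
  }
}
```

Storing the promise rather than the resolved ID means two concurrent requests for the same URL share one submission. If the same audio can appear under different URLs, a content hash would be a safer key, and a production version would persist the mapping in a database.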
For architecture patterns, see assemblyai-reference-architecture.
npx claudepluginhub jeremylongshore/claude-code-plugins-plus-skills --plugin assemblyai-pack