Skill

transcribe-video

Generates SRT/VTT subtitles and plain text transcripts from video or audio files using AWS Transcribe and ffmpeg. Useful for captions, extracting speech, notes, or searchable content.

AWS

Bash

automation

cli-tools

Invocation

How this skill is triggered — by the user, by Claude, or both

Slash command

/startup:transcribe-video

User invocable

Model invocable

Inline context

Default effort

Tool Access

This skill is limited to the following tools:

Bash(ffmpeg:*), Bash(aws:*), Bash(ls:*), Bash(rm:*), Bash(which:*)

Context Preview

The summary Claude sees in its skill listing — used to decide when to auto-load this skill

Generate subtitles and transcripts from `$ARGUMENTS` (a video or audio file path, optionally followed by a language code like `en-US` or `es-ES`) using AWS Transcribe.

SKILL.md

140 lines · ~1.2k tokens

Stats

Language: Python

Stars: 23

Forks: 1

Maintenance: Excellent

Last Commit: Feb 23, 2026

Video Transcription Skill

Generate subtitles and transcripts from $ARGUMENTS (a video or audio file path, optionally followed by a language code like en-US or es-ES) using AWS Transcribe.

Outputs .srt, .vtt, and .txt files next to the source file.

Process

Verify prerequisites - check ffmpeg and aws CLI are installed and configured
Extract audio from the video as MP3 using ffmpeg
Create temporary S3 bucket, upload audio
Run AWS Transcribe job with SRT and VTT subtitle output
Download results and generate plain text transcript
Clean up all AWS resources - delete S3 bucket, Transcribe job, and temp files. No recurring costs.

Prerequisites

ffmpeg installed (brew install ffmpeg)
aws CLI installed and configured with valid credentials (brew install awscli && aws configure)
AWS credentials need permissions for: s3:* (create/delete buckets), transcribe:* (start/delete jobs)
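
The prerequisite check can be sketched as a pair of small shell functions. This is a minimal sketch, not the skill's own implementation: the names `require_cmd` and `check_prereqs` are hypothetical, and it uses the POSIX builtin `command -v` where the skill's tool list permits `which`. `aws sts get-caller-identity` only confirms that credentials resolve; it does not prove the S3 and Transcribe permissions listed above.

```shell
# require_cmd NAME: succeed if NAME is on PATH, otherwise report and fail.
require_cmd() {
  if command -v "$1" >/dev/null 2>&1; then
    return 0
  fi
  echo "missing prerequisite: $1" >&2
  return 1
}

# check_prereqs: verify both tools, then confirm AWS credentials resolve.
# (Hypothetical helper; call it before Step 1.)
check_prereqs() {
  require_cmd ffmpeg || return 1
  require_cmd aws || return 1
  aws sts get-caller-identity >/dev/null 2>&1 || {
    echo "aws CLI is installed but credentials are not configured" >&2
    return 1
  }
}
```

Calling `check_prereqs || exit 1` before Step 1 stops the run before any AWS resource is created.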

Step-by-Step

Step 1: Extract audio

ffmpeg -y -i "input.mp4" -vn -acodec mp3 -q:a 2 "/tmp/transcribe-audio.mp3"

Step 2: Create temp S3 bucket and upload

BUCKET="tmp-transcribe-$(date +%s)"
aws s3 mb "s3://$BUCKET" --region us-east-1
aws s3 cp "/tmp/transcribe-audio.mp3" "s3://$BUCKET/audio.mp3"

Step 3: Start transcription job

JOB_NAME="tmp-job-$(date +%s)"
aws transcribe start-transcription-job \
  --transcription-job-name "$JOB_NAME" \
  --language-code en-US \
  --media-format mp3 \
  --media "MediaFileUri=s3://$BUCKET/audio.mp3" \
  --subtitles "Formats=srt,vtt" \
  --output-bucket-name "$BUCKET" \
  --region us-east-1

Language codes: en-US, es-ES, fr-FR, de-DE, pt-BR, ja-JP, zh-CN, it-IT, ko-KR, etc. Default to en-US if not specified.
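
Splitting `$ARGUMENTS` into a file path and an optional language code could look like the sketch below. It assumes a trailing token shaped like `xx-XX` is the language code, which covers the codes listed above but may not match every code Transcribe accepts. The function name `parse_args` and the variables `INPUT` and `LANG_CODE` are hypothetical.

```shell
# parse_args "FILE [LANG]": sets INPUT and LANG_CODE.
# Assumption: a trailing token shaped like xx-XX is a language code.
parse_args() {
  args=$1
  last=${args##* }            # text after the final space (or the whole string)
  case "$last" in
    [a-z][a-z]-[A-Z][A-Z])
      # Only treat it as a language code if a path precedes it.
      if [ "$last" != "$args" ]; then
        INPUT=${args% *}
        LANG_CODE=$last
        return 0
      fi
      ;;
  esac
  INPUT=$args
  LANG_CODE=en-US             # default when no code is given
}
```

With this in place, Step 3 would pass `--language-code "$LANG_CODE"` instead of a hard-coded `en-US`. Paths containing spaces are preserved because only the final token is inspected.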

Step 4: Poll until complete

while true; do
  STATUS=$(aws transcribe get-transcription-job \
    --transcription-job-name "$JOB_NAME" \
    --region us-east-1 \
    --query 'TranscriptionJob.TranscriptionJobStatus' \
    --output text)
  if [ "$STATUS" = "COMPLETED" ] || [ "$STATUS" = "FAILED" ]; then break; fi
  sleep 5
done
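
The loop above exits on both COMPLETED and FAILED without distinguishing them, and it never gives up. A hedged variant that reports failures and enforces a timeout might look like this; `wait_for_job` and its return codes are hypothetical, while `FailureReason` is a field Transcribe populates on failed jobs.

```shell
# wait_for_job JOB_NAME [MAX_SECONDS] [INTERVAL]
# Returns 0 on COMPLETED, 1 on FAILED, 2 on timeout.
wait_for_job() {
  job=$1
  max=${2:-1800}
  interval=${3:-5}
  waited=0
  while [ "$waited" -le "$max" ]; do
    status=$(aws transcribe get-transcription-job \
      --transcription-job-name "$job" \
      --region us-east-1 \
      --query 'TranscriptionJob.TranscriptionJobStatus' \
      --output text)
    case "$status" in
      COMPLETED) return 0 ;;
      FAILED)
        # Print the service's failure reason to stderr before giving up.
        aws transcribe get-transcription-job \
          --transcription-job-name "$job" \
          --region us-east-1 \
          --query 'TranscriptionJob.FailureReason' \
          --output text >&2
        return 1 ;;
    esac
    sleep "$interval"
    waited=$((waited + interval))
  done
  echo "timed out waiting for $job" >&2
  return 2
}
```

A caller can then skip the download steps on a non-zero return and go straight to cleanup.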

Step 5: Download subtitle files

Save .srt and .vtt next to the original file:

aws s3 cp "s3://$BUCKET/$JOB_NAME.srt" "/path/to/input.srt"
aws s3 cp "s3://$BUCKET/$JOB_NAME.vtt" "/path/to/input.vtt"
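
The `/path/to/input.srt` placeholders stand for the source file's path with its extension swapped. One way to derive them, as a sketch with a hypothetical helper name `output_path`:

```shell
# output_path INPUT EXT: print INPUT with its final extension replaced by EXT.
output_path() {
  dir=$(dirname -- "$1")
  base=$(basename -- "$1")
  case "$base" in
    ?*.*) base=${base%.*} ;;   # strip only the last extension; leave dotfiles alone
  esac
  printf '%s/%s.%s\n' "$dir" "$base" "$2"
}
```

For example, `aws s3 cp "s3://$BUCKET/$JOB_NAME.srt" "$(output_path "$INPUT" srt)"` would save the subtitles next to the original, where `$INPUT` is assumed to hold the source file path.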

Step 6: Generate plain text transcript

Download the JSON result and extract the full transcript text:

aws s3 cp "s3://$BUCKET/$JOB_NAME.json" "/tmp/transcribe-result.json"

Then extract the .results.transcripts[0].transcript field from the JSON with any JSON-aware tool and save it as a .txt file next to the original.
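
The source does not name the extraction tool. One option, assuming `jq` is installed (it is not in the skill's listed tool access, so this is an illustration rather than the skill's method), is sketched below; `extract_transcript` is a hypothetical helper name.

```shell
# extract_transcript JSON_FILE TXT_FILE
# Assumes jq is available; -r prints the raw string without JSON quoting.
extract_transcript() {
  jq -r '.results.transcripts[0].transcript' "$1" > "$2"
}
```

Usage would be `extract_transcript /tmp/transcribe-result.json /path/to/input.txt`.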

Step 7: Clean up everything

IMPORTANT: Always clean up to avoid recurring S3 storage costs.

# Delete S3 bucket and all contents
aws s3 rb "s3://$BUCKET" --force --region us-east-1

# Delete the transcription job
aws transcribe delete-transcription-job --transcription-job-name "$JOB_NAME" --region us-east-1

# Delete temp audio and JSON result files
rm -f "/tmp/transcribe-audio.mp3" "/tmp/transcribe-result.json"
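
Because a failure in any earlier step would otherwise skip these commands, the cleanup can be wrapped in a function and registered with `trap`. This is a sketch under the assumption that `BUCKET` and `JOB_NAME` are set by Steps 2 and 3; the function name `cleanup` is hypothetical.

```shell
# cleanup: best-effort removal of everything the earlier steps created.
# Each AWS call is allowed to fail so one error does not skip the rest.
cleanup() {
  if [ -n "${BUCKET:-}" ]; then
    aws s3 rb "s3://$BUCKET" --force --region us-east-1 || true
  fi
  if [ -n "${JOB_NAME:-}" ]; then
    aws transcribe delete-transcription-job \
      --transcription-job-name "$JOB_NAME" --region us-east-1 || true
  fi
  rm -f "/tmp/transcribe-audio.mp3" "/tmp/transcribe-result.json"
}
```

Placing `trap cleanup EXIT` near the top of a script runs the function whenever the shell exits, including after an error partway through.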

Real-World Results (Reference)

From actual transcription runs:

Video	Duration	Audio Size	Transcribe Time	Subtitle Segments
X/Twitter clip	2:40	2.5 MB	~20 seconds	83
Screen recording	18:45	11.4 MB	~60 seconds	500+

Key Insights

AWS Transcribe is fast - even 19-minute videos complete in about a minute
Short-form content (tweets, reels) transcribes almost instantly
Cost is negligible - AWS Transcribe charges ~$0.024/min, so a 19-min video costs ~$0.46
Cleanup is critical - always delete the S3 bucket to avoid storage charges
SRT is most compatible - works with most video players and editors; VTT is better for web
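
The cost figure above is a straight multiplication of duration by the per-minute rate. A small estimator, taking the ~$0.024/min rate quoted above as its default (actual pricing varies by region and tier, so treat the output as approximate), could be sketched as follows; `estimate_cost` is a hypothetical helper.

```shell
# estimate_cost SECONDS [RATE_PER_MINUTE]: print an estimated charge in USD.
# The default rate is the ~$0.024/min figure quoted in this document.
estimate_cost() {
  awk -v s="$1" -v r="${2:-0.024}" 'BEGIN { printf "%.2f\n", s / 60 * r }'
}
```

`estimate_cost 1140` (19 minutes) prints `0.46`, matching the figure above.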

Output Files

original-video.mp4
original-video.srt          # Subtitles with timestamps (most compatible)
original-video.vtt          # Web-optimized subtitles (for HTML5 <track>)
original-video.txt          # Plain text transcript (no timestamps)

After Transcription

Verify all output files exist: ls -lh /path/to/original-video.{srt,vtt,txt}
Report the number of subtitle segments and total duration
Confirm all AWS resources have been cleaned up (no S3 buckets, no Transcribe jobs remaining)
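
Reporting the segment count and total duration can be done from the .srt file itself. The sketch below assumes `grep`, `tail`, `sed`, and `tr` are available (they are not in the skill's listed tool access) and that no subtitle text contains the literal `-->`; both helper names are hypothetical.

```shell
# count_segments SRT_FILE: each cue has exactly one "start --> end" timing line.
count_segments() {
  grep -c -e '-->' "$1"
}

# last_timestamp SRT_FILE: end time of the final cue, a proxy for total duration.
last_timestamp() {
  grep -e '-->' "$1" | tail -n 1 | sed 's/.*--> *//' | tr -d '\r'
}
```

The end time of the last cue marks where speech ends, which may be slightly shorter than the media file's full length.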
