Generates SRT/VTT subtitles and plain text transcripts from video or audio files using AWS Transcribe and ffmpeg. Useful for captions, extracting speech, notes, or searchable content.
npx claudepluginhub rameerez/claude-code-startup-skills
Generate subtitles and transcripts from `$ARGUMENTS` (a video or audio file path, optionally followed by a language code like `en-US` or `es-ES`) using AWS Transcribe.
Outputs .srt, .vtt, and .txt files next to the source file.
Check that ffmpeg and the aws CLI are installed and configured:
- ffmpeg installed (brew install ffmpeg)
- aws CLI installed and configured with valid credentials (brew install awscli && aws configure)
- Required permissions: s3:* (create/delete buckets), transcribe:* (start/delete jobs)
Extract the audio track from the input file:
ffmpeg -i "input.mp4" -vn -acodec mp3 -q:a 2 "/tmp/transcribe-audio.mp3" -y
BUCKET="tmp-transcribe-$(date +%s)"
aws s3 mb "s3://$BUCKET" --region us-east-1
aws s3 cp "/tmp/transcribe-audio.mp3" "s3://$BUCKET/audio.mp3"
JOB_NAME="tmp-job-$(date +%s)"
aws transcribe start-transcription-job \
--transcription-job-name "$JOB_NAME" \
--language-code en-US \
--media-format mp3 \
--media "MediaFileUri=s3://$BUCKET/audio.mp3" \
--subtitles "Formats=srt,vtt" \
--output-bucket-name "$BUCKET" \
--region us-east-1
Language codes: en-US, es-ES, fr-FR, de-DE, pt-BR, ja-JP, zh-CN, it-IT, ko-KR, etc. Default to en-US if not specified.
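The language selection described above can be sketched as a small helper that splits `$ARGUMENTS` into a file path and a language code. This is a minimal sketch, not part of the original skill: the function name `parse_transcribe_args`, the variables `INPUT_FILE` and `LANGUAGE_CODE`, and the `xx-XX` pattern used to recognize a language code are all assumptions. Codes that do not fit that shape would need a wider pattern.

```bash
# Hypothetical helper: split "path [language-code]" into two variables.
# Sets INPUT_FILE and LANGUAGE_CODE (both names are illustrative).
parse_transcribe_args() {
  args="$1"
  last="${args##* }"              # text after the final space
  case "$args" in
    *" "*)
      case "$last" in
        # Assumed shape of a language code: two lowercase, dash, two uppercase.
        [a-z][a-z]-[A-Z][A-Z])
          INPUT_FILE="${args% *}" # everything before the final space
          LANGUAGE_CODE="$last"
          return 0
          ;;
      esac
      ;;
  esac
  # No trailing language code: treat the whole string as the path.
  INPUT_FILE="$args"
  LANGUAGE_CODE="en-US"
}
```

Because only the final space-separated token is tested, a file path that itself contains spaces stays intact. The resulting `$LANGUAGE_CODE` could then be passed to `--language-code` in place of the hard-coded `en-US`.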
while true; do
STATUS=$(aws transcribe get-transcription-job \
--transcription-job-name "$JOB_NAME" \
--region us-east-1 \
--query 'TranscriptionJob.TranscriptionJobStatus' \
--output text)
if [ "$STATUS" = "COMPLETED" ] || [ "$STATUS" = "FAILED" ]; then break; fi
sleep 5
done
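The loop above stops on both COMPLETED and FAILED without telling them apart, and it never gives up on a stuck job. One possible variant, sketched here under assumptions, returns a distinct exit code for each outcome and caps the number of polls. The function name `wait_for_status`, the `POLL_INTERVAL` variable, and the exit-code convention are hypothetical and not part of the original skill.

```bash
# Hypothetical helper: poll a status command until the job finishes.
# Usage: wait_for_status MAX_ATTEMPTS command [args...]
# Returns 0 on COMPLETED, 1 on FAILED, 2 if MAX_ATTEMPTS polls pass first.
wait_for_status() {
  max_attempts="$1"
  shift
  attempt=0
  while [ "$attempt" -lt "$max_attempts" ]; do
    job_status=$("$@")            # run the status command given by the caller
    case "$job_status" in
      COMPLETED) return 0 ;;
      FAILED)    return 1 ;;
    esac
    attempt=$((attempt + 1))
    sleep "${POLL_INTERVAL:-5}"   # same 5-second default as the loop above
  done
  return 2
}
```

For example, `wait_for_status 120 aws transcribe get-transcription-job --transcription-job-name "$JOB_NAME" --region us-east-1 --query 'TranscriptionJob.TranscriptionJobStatus' --output text` would give up after roughly ten minutes at the default interval. On a return code of 1, querying `TranscriptionJob.FailureReason` with the same command should show why the job failed.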
Save .srt and .vtt next to the original file:
aws s3 cp "s3://$BUCKET/$JOB_NAME.srt" "/path/to/input.srt"
aws s3 cp "s3://$BUCKET/$JOB_NAME.vtt" "/path/to/input.vtt"
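The `/path/to/input` placeholder in these commands stands for the source file path with its extension removed. A minimal sketch of deriving it follows; the function name `output_base` is an assumption, not something the original skill defines.

```bash
# Hypothetical helper: print the source path without its file extension,
# so that ".srt", ".vtt", or ".txt" can be appended to it.
output_base() {
  dir=$(dirname -- "$1")
  name=$(basename -- "$1")
  case "$name" in
    # Strip the extension only when the file name itself contains one;
    # dots in directory names and leading-dot file names are left alone.
    ?*.*) name="${name%.*}" ;;
  esac
  printf '%s/%s\n' "$dir" "$name"
}
```

With this, `aws s3 cp "s3://$BUCKET/$JOB_NAME.srt" "$(output_base "$INPUT_FILE").srt"` would place the subtitles next to the source, where `INPUT_FILE` is an assumed variable holding the original path.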
Download the JSON result and extract the full transcript text:
aws s3 cp "s3://$BUCKET/$JOB_NAME.json" "/tmp/transcribe-result.json"
Then extract the .results.transcripts[0].transcript field from the JSON (for example with jq) and save it as a .txt file next to the original.
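The extraction step might look like the following. This is a sketch under assumptions: the function name `extract_transcript` is hypothetical, and it relies on either `jq` or `python3` being installed, neither of which the original skill lists as a prerequisite.

```bash
# Hypothetical helper: write the full transcript text from a Transcribe
# result JSON file to a plain text file.
# Usage: extract_transcript RESULT_JSON OUTPUT_TXT
extract_transcript() {
  json_file="$1"
  txt_file="$2"
  if command -v jq >/dev/null 2>&1; then
    # -r prints the raw string instead of a JSON-quoted one.
    jq -r '.results.transcripts[0].transcript' "$json_file" > "$txt_file"
  else
    # Fallback when jq is missing: read the same field with Python's json module.
    python3 -c 'import json, sys
with open(sys.argv[1], encoding="utf-8") as f:
    print(json.load(f)["results"]["transcripts"][0]["transcript"])' "$json_file" > "$txt_file"
  fi
}
```

For example, `extract_transcript "/tmp/transcribe-result.json" "/path/to/input.txt"` would write the transcript next to the original file.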
IMPORTANT: Always clean up to avoid recurring S3 storage costs.
# Delete S3 bucket and all contents
aws s3 rb "s3://$BUCKET" --force --region us-east-1
# Delete the transcription job
aws transcribe delete-transcription-job --transcription-job-name "$JOB_NAME" --region us-east-1
# Delete temp audio and result files
rm -f "/tmp/transcribe-audio.mp3" "/tmp/transcribe-result.json"
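Since cleanup matters even when an earlier step fails, the commands above could be wrapped in a single function. This is a sketch, not the skill's own method: the function name `cleanup` and the optional `REGION` variable are assumptions, while `BUCKET` and `JOB_NAME` are the variables set in the earlier steps.

```bash
# Hypothetical helper: remove every temporary resource created by a run.
cleanup() {
  # Only touch AWS resources whose names were actually set.
  if [ -n "${BUCKET:-}" ]; then
    aws s3 rb "s3://$BUCKET" --force --region "${REGION:-us-east-1}" || true
  fi
  if [ -n "${JOB_NAME:-}" ]; then
    aws transcribe delete-transcription-job \
      --transcription-job-name "$JOB_NAME" \
      --region "${REGION:-us-east-1}" || true
  fi
  # Local temp files; -f keeps this quiet if they were never created.
  rm -f "/tmp/transcribe-audio.mp3" "/tmp/transcribe-result.json"
}
```

Registering it with `trap cleanup EXIT` right after the bucket name is chosen would run it on any exit path, including failures partway through. The `|| true` guards let the remaining cleanup steps run even if one AWS call fails.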
From actual transcription runs:
| Video | Duration | Audio Size | Transcribe Time | Subtitle Segments |
|---|---|---|---|---|
| X/Twitter clip | 2:40 | 2.5 MB | ~20 seconds | 83 |
| Screen recording | 18:45 | 11.4 MB | ~60 seconds | 500+ |
original-video.mp4
original-video.srt # Subtitles with timestamps (most compatible)
original-video.vtt # Web-optimized subtitles (for HTML5 <track>)
original-video.txt # Plain text transcript (no timestamps)
ls -lh /path/to/original-video.{srt,vtt,txt}