Download YouTube video transcripts using yt-dlp. Use when user provides a YouTube URL and wants the transcript, captions, or subtitles.
Downloads YouTube video transcripts using yt-dlp and saves them to a local folder.
/plugin marketplace add jpoley/yt-summary
/plugin install yt-summary@yt-summary-marketplace
youtube/
Download transcripts (subtitles/captions) from YouTube videos using yt-dlp.
$ARGUMENTS
IMPORTANT: Always save transcripts to the ./yt-transcripts/ subfolder of the current working directory.
mkdir -p ./yt-transcripts
All transcript files should be saved to this folder, not the current directory.
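For callers that drive this workflow from Python rather than the shell, the folder rule above can be sketched as follows. The helper name `transcript_path` and its `base` parameter are hypothetical, introduced only for illustration.

```python
from pathlib import Path


def transcript_path(name, base="."):
    """Create <base>/yt-transcripts if missing and return the .txt path for name."""
    out_dir = Path(base) / "yt-transcripts"
    # parents=True with exist_ok=True behaves like `mkdir -p`
    out_dir.mkdir(parents=True, exist_ok=True)
    return out_dir / f"{name}.txt"
```

Calling `transcript_path("My_Video")` from the working directory returns `yt-transcripts/My_Video.txt` and creates the folder on first use.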
Subtitle sources, in order of preference:
Manual subtitles (--write-sub) - highest quality
Auto-generated subtitles (--write-auto-sub) - usually available
IMPORTANT: Always check if yt-dlp is installed first:
which yt-dlp || command -v yt-dlp
macOS (Homebrew):
brew install yt-dlp
Alternative (pip):
pip3 install yt-dlp
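The installation check can also be done from Python. This is a minimal sketch using the standard library's `shutil.which`; the function name `tool_available` and the hint text are assumptions for illustration, with the install commands copied from the lines above.

```python
import shutil


def tool_available(name="yt-dlp"):
    """Return True if an executable called `name` is found on PATH."""
    return shutil.which(name) is not None


if __name__ == "__main__":
    if tool_available():
        print("yt-dlp found")
    else:
        # Install commands taken from the instructions above
        print("yt-dlp missing: try `brew install yt-dlp` or `pip3 install yt-dlp`")
```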
ALWAYS do this first before attempting to download:
yt-dlp --list-subs "YOUTUBE_URL"
Try manual subtitles first:
yt-dlp --write-sub --skip-download --output "/tmp/transcript_%(id)s" "YOUTUBE_URL"
If none exist, fall back to auto-generated subtitles:
yt-dlp --write-auto-sub --skip-download --output "/tmp/transcript_%(id)s" "YOUTUBE_URL"
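The manual-then-auto order can be sketched as a small Python wrapper. It only uses the yt-dlp flags shown above. The function names, the `/tmp/transcript_*.vtt` glob, and the choice to detect success by looking for a newly created `.vtt` file (rather than relying on the exit code) are assumptions of this sketch, not part of the original workflow.

```python
import glob
import subprocess


def subtitle_commands(url, output="/tmp/transcript_%(id)s"):
    """Build the two yt-dlp invocations: manual subtitles first, then auto."""
    common = ["--skip-download", "--output", output, url]
    return [
        ["yt-dlp", "--write-sub", *common],
        ["yt-dlp", "--write-auto-sub", *common],
    ]


def fetch_subtitles(url, pattern="/tmp/transcript_*.vtt"):
    """Run each command in order; return the first new .vtt path, or None."""
    before = set(glob.glob(pattern))
    for cmd in subtitle_commands(url):
        subprocess.run(cmd, check=False)
        # Assumption: a successful download leaves a new .vtt file behind
        new_files = sorted(set(glob.glob(pattern)) - before)
        if new_files:
            return new_files[0]
    return None
```

A `None` result corresponds to the case where neither subtitle type is available, which is the point at which the Whisper fallback would be offered.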
ONLY use this if both manual and auto-generated subtitles are unavailable.
Install Whisper (pip3 install openai-whisper)
Extract the audio:
yt-dlp -x --audio-format mp3 --output "audio_%(id)s.%(ext)s" "YOUTUBE_URL"
Transcribe it:
whisper audio_VIDEO_ID.mp3 --model base --output_format vtt
Get the video title for the output filename:
yt-dlp --print "%(title)s" "YOUTUBE_URL"
YouTube's auto-generated VTT files contain duplicate lines. Always deduplicate and save to ./yt-transcripts/:
mkdir -p ./yt-transcripts
VIDEO_TITLE=$(yt-dlp --print "%(title)s" "YOUTUBE_URL" | tr '/' '_' | tr ':' '-' | tr -d '?"' | tr ' ' '_')
VTT_FILE=$(ls /tmp/transcript_*.vtt | head -n 1)
python3 -c "
import re
seen = set()
with open('$VTT_FILE', 'r') as f:
    for line in f:
        line = line.strip()
        # Skip the VTT header, metadata, and timestamp lines
        if line and not line.startswith('WEBVTT') and not line.startswith('Kind:') and not line.startswith('Language:') and '-->' not in line:
            # Strip inline tags, then decode entities (ampersand last)
            clean = re.sub('<[^>]*>', '', line)
            clean = clean.replace('&gt;', '>').replace('&lt;', '<').replace('&amp;', '&')
            if clean and clean not in seen:
                print(clean)
                seen.add(clean)
" > "./yt-transcripts/${VIDEO_TITLE}.txt"
rm "$VTT_FILE"
echo "Saved to: ./yt-transcripts/${VIDEO_TITLE}.txt"
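The same post-processing can be written as plain Python functions, which are easier to test than an inline `python3 -c` string. This is a sketch, not the skill's own implementation: `sanitize_title` and `dedupe_vtt` are hypothetical names, and `html.unescape` is used here in place of the three explicit entity replacements, so it decodes a wider set of entities than the original script.

```python
import html
import re


def sanitize_title(title):
    """Mirror the tr pipeline: '/' -> '_', ':' -> '-', ' ' -> '_', drop '?' and '"'."""
    table = str.maketrans({"/": "_", ":": "-", " ": "_", "?": None, '"': None})
    return title.translate(table)


def dedupe_vtt(text):
    """Return unique caption lines from VTT text, in first-seen order."""
    seen = set()
    unique = []
    for raw in text.splitlines():
        line = raw.strip()
        # Skip blanks, timestamp lines, and the VTT header/metadata
        if not line or "-->" in line:
            continue
        if line.startswith(("WEBVTT", "Kind:", "Language:")):
            continue
        # Remove inline tags such as <c>...</c>, then decode HTML entities
        clean = html.unescape(re.sub(r"<[^>]*>", "", line)).strip()
        if clean and clean not in seen:
            seen.add(clean)
            unique.append(clean)
    return unique
```

For example, `sanitize_title('What is AI? A/B: "test"')` returns `'What_is_AI_A_B-_test'`, and writing `"\n".join(dedupe_vtt(vtt_text))` to `./yt-transcripts/<title>.txt` reproduces the shell step above.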
VTT (.vtt): Includes timestamps, good for video players
Plain text (.txt): Just text content, good for reading/analysis

| Issue | Solution |
|---|---|
| yt-dlp not installed | brew install yt-dlp or pip3 install yt-dlp |
| No subtitles available | Offer Whisper transcription (with user confirmation) |
| Invalid/private video | Check URL format, inform user of error |
| Multiple languages | Use --sub-langs en for English only |
Check available subtitles first (--list-subs)