From yt-is
YouTube transcript extraction via yt-dlp Python API with Chrome TLS impersonation
npx claudepluginhub enduser123/yt-is
Slash command: /yt-is:yt-dlp
Fast transcript extraction using yt-dlp's Python API with Chrome TLS impersonation.
Uses csf/transcript.py::fetch_transcript_chain() with _fetch_via_ytdlp() as the primary method.
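As a rough illustration of the primary method, the sketch below lists caption tracks through yt-dlp's Python API and fetches the chosen track with curl_cffi's Chrome impersonation. It is not the code in csf/transcript.py: the helper names `build_ydl_opts`, `pick_subtitle_url`, and `fetch_subtitle_text` are hypothetical, and mapping "WEB client" to the `player_client` extractor argument is an assumption.

```python
from typing import Optional


def build_ydl_opts(lang: str = "en") -> dict:
    """Options for a metadata-only yt-dlp run that lists caption tracks."""
    return {
        "skip_download": True,      # never fetch the media itself
        "writesubtitles": True,     # uploader-provided captions
        "writeautomaticsub": True,  # auto-generated captions
        "subtitleslangs": [lang],
        "quiet": True,
        # Assumption: this is how the WEB player client is selected.
        "extractor_args": {"youtube": {"player_client": ["web"]}},
    }


def pick_subtitle_url(info: dict, lang: str = "en",
                      preferred_exts=("json3", "vtt")) -> Optional[str]:
    """Return a caption URL from a yt-dlp info dict, manual captions first."""
    for key in ("subtitles", "automatic_captions"):
        tracks = (info.get(key) or {}).get(lang) or []
        for ext in preferred_exts:
            for track in tracks:
                if track.get("ext") == ext and track.get("url"):
                    return track["url"]
    return None


def fetch_subtitle_text(video_url: str, lang: str = "en") -> Optional[str]:
    """Network path; requires the yt-dlp and curl_cffi packages."""
    import yt_dlp                   # third-party, imported lazily
    from curl_cffi import requests  # third-party, TLS impersonation

    with yt_dlp.YoutubeDL(build_ydl_opts(lang)) as ydl:
        info = ydl.extract_info(video_url, download=False)
    url = pick_subtitle_url(info, lang)
    if url is None:
        return None
    return requests.get(url, impersonate="chrome", timeout=30).text
```

Keeping track selection separate from the network call makes the selection logic testable without contacting YouTube.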
# Download transcripts (recommended: use yt-is fetch instead)
yt-dlp --run
# Dry run: show missing counts
yt-dlp
# Process specific channel only
yt-dlp --channel "https://youtube.com/@channel"
# Parallel workers
yt-dlp --workers 2
When yt-dlp fails, the chain escalates automatically:
| Step | Method | Source | Sleep Interval |
|---|---|---|---|
| 1 | oEmbed reachability probe | oembed | N/A |
| 2 | yt-dlp (WEB client, curl_cffi TLS) | ytdlp | 15-60s |
| 3 | yt-dlp with cookies (age-restricted) | ytdlp_ejs | 20-90s |
| 4 | Direct API | direct_api | varies |
| 5 | NotebookLM | notebooklm | varies |
| 6 | Selenium Firefox | selenium | varies |
| 7 | faster-whisper (audio) | whisper | N/A |
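The escalation in the table can be pictured as an ordered list of fetchers tried one after another, with a randomized pause before the steps that list a sleep interval. The sketch below is a simplified model under that assumption, not the actual fetch_transcript_chain(): `run_chain` and the stub fetchers are hypothetical, and the oEmbed probe and circuit breaker are left out.

```python
import random
import time
from typing import Callable, List, Optional, Tuple

# Each step: (source tag, fetcher, (min_sleep, max_sleep) in seconds or None).
Step = Tuple[str, Callable[[str], Optional[str]], Optional[Tuple[float, float]]]


def run_chain(video_id: str, steps: List[Step],
              sleep: Callable[[float], None] = time.sleep):
    """Try each step in order and return (source, text) for the first success."""
    for source, fetch, interval in steps:
        if interval is not None:
            sleep(random.uniform(*interval))  # randomized pause before the request
        try:
            text = fetch(video_id)
        except Exception:
            text = None  # any failure means "escalate to the next step"
        if text:
            return source, text
    return None, None


# Stub fetchers standing in for the real ones.
def blocked(video_id: str) -> Optional[str]:
    raise RuntimeError("sign in to confirm you're not a bot")


def with_cookies(video_id: str) -> Optional[str]:
    return f"transcript for {video_id}"


pauses: List[float] = []
result = run_chain(
    "abc123",
    [("ytdlp", blocked, (15, 60)), ("ytdlp_ejs", with_cookies, (20, 90))],
    sleep=pauses.append,  # record the pauses instead of actually sleeping
)
print(result)  # ('ytdlp_ejs', 'transcript for abc123')
```

Passing the sleep function as a parameter keeps the pauses out of tests while leaving `time.sleep` as the default.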
Primary method: _fetch_via_ytdlp() (transcript.py:646)
- yt_dlp.YoutubeDL with client_name=WEB for public videos
- curl_cffi for subtitle URL fetch (bypasses bot detection)
- On bot-check, falls back to _fetch_via_ytdlp_with_cookies()

Second attempt: _fetch_via_ytdlp_with_cookies() (transcript.py:784)
- _get_cookie_file() with reference counting
- external_downloader: "ejs:github" to resolve YouTube's JS challenge for age-restricted videos
- Cookie file released when done (_release_cookie_file())

Full chain: fetch_transcript_chain()
1. _fetch_via_ytdlp(): WEB client, public videos
2. _fetch_via_ytdlp_with_cookies(): cookies + EJS, age-restricted
3. _fetch_via_direct_api(): cheap terminal/no-transcript discriminator
4. _fetch_via_notebooklm(): NotebookLM batch
5. _fetch_via_selenium_firefox(): full browser, bot-blocked
6. _fetch_via_whisper(): audio download + transcription

| Error | Reason | Next Step |
|---|---|---|
| "no subtitles available" | Video has no captions | Try next language or method |
| "rate limited (429)" | Quota exceeded | Circuit breaker, skip source |
| "sign in to confirm you're not a bot" | Bot detection | Recursive fallback to cookies |
| "no firefox cookie file" | Firefox not running | Skip to Selenium |
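One way to act on the table above is to classify an error message by substring and map it to the next step. The sketch below assumes the messages arrive as plain strings; `NextStep`, `_RULES`, and `classify_error` are hypothetical names, and the matching rules are illustrative rather than taken from the project.

```python
from enum import Enum


class NextStep(Enum):
    TRY_NEXT = "try next language or method"
    SKIP_SOURCE = "circuit breaker, skip source"
    RETRY_WITH_COOKIES = "fall back to cookies"
    SKIP_TO_SELENIUM = "skip to Selenium"
    UNKNOWN = "unclassified"


# Ordered (substring, action) pairs; the first match wins.
_RULES = [
    ("no subtitles available", NextStep.TRY_NEXT),
    ("rate limited", NextStep.SKIP_SOURCE),
    ("http error 429", NextStep.SKIP_SOURCE),
    ("sign in to confirm", NextStep.RETRY_WITH_COOKIES),
    ("no firefox cookie file", NextStep.SKIP_TO_SELENIUM),
]


def classify_error(message: str) -> NextStep:
    """Map an error message to the next step, matching case-insensitively."""
    lowered = message.lower()
    for needle, step in _RULES:
        if needle in lowered:
            return step
    return NextStep.UNKNOWN
```

For example, `classify_error("ERROR: Sign in to confirm you're not a bot")` yields `NextStep.RETRY_WITH_COOKIES`, while an unrecognized message yields `NextStep.UNKNOWN` so the caller can decide whether to escalate anyway.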
fetch_transcript_chain()
│
├─► _fetch_via_ytdlp() ──► yt_dlp.YoutubeDL (WEB) ──► curl_cffi (chrome) ──► transcripts.sqlite
│ (15-60s sleep, curl_cffi TLS)
│
└─ On bot-check ─► _fetch_via_ytdlp_with_cookies() ──► yt_dlp + Firefox cookies + EJS
(20-90s sleep, cookie reference counting)
│
└─► On failure ─► Selenium ─► NLM ─► Whisper
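The "cookie reference counting" in the diagram suggests that parallel workers share one cookie file and that it is cleaned up only when the last user finishes. The source names only _get_cookie_file() and _release_cookie_file(); the `CookieFilePool` class below is a hypothetical sketch of that idea, assuming a temporary copy of the cookie file is what gets shared.

```python
import os
import tempfile
import threading


class CookieFilePool:
    """Share one temporary cookie copy among concurrent workers."""

    def __init__(self, source_path: str):
        self._source = source_path
        self._lock = threading.Lock()
        self._refs = 0
        self._copy = None

    def acquire(self) -> str:
        """Return the shared copy's path, creating it for the first user."""
        with self._lock:
            if self._refs == 0:
                with open(self._source, "rb") as src:
                    data = src.read()  # read first so a failure leaves no temp file
                fd, self._copy = tempfile.mkstemp(suffix=".txt")
                with os.fdopen(fd, "wb") as dst:
                    dst.write(data)
            self._refs += 1
            return self._copy

    def release(self) -> None:
        """Drop one reference; delete the copy when nobody uses it."""
        with self._lock:
            if self._refs == 0:
                raise RuntimeError("release() without matching acquire()")
            self._refs -= 1
            if self._refs == 0:
                os.remove(self._copy)
                self._copy = None
```

Callers would typically pair `acquire()` with `release()` in a `try`/`finally` block so the count stays correct when a fetch raises.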
Key functions:
- _fetch_via_ytdlp() implementation
- _fetch_via_ytdlp_with_cookies() implementation
- fetch_transcript_chain() orchestration
- _get_firefox_cookie_file()
- set_cached_transcript()

Storage:
- transcripts.sqlite (cached, keyed by video_id + lang)
- batch_status.sqlite (last_stage: 'ytdlp', 'ytdlp_ejs', 'selenium', 'notebooklm')

Dependencies:
- yt-dlp>=2024.0.0
- curl_cffi (for TLS impersonation)

Related skills:
- /yt-is: Channel management + full fetch workflow
- /yt-nlm: NotebookLM batch transcript extraction
- /yt-selenium: Selenium-based fallback extraction

# 1. Discover new videos (RSS + API gap fill)
/yt-is sync
# 2. Download transcripts (full fallback chain)
# Recommended: use yt-is fetch instead of yt-dlp directly
/yt-is fetch
# 3. Or use yt-dlp directly (bypasses full escalation chain)
/yt-dlp --run
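Whichever command fetches a transcript, the result ends up in transcripts.sqlite, keyed by video_id + lang. The sketch below shows what such a cache could look like with the standard sqlite3 module. The table layout, the `open_cache` and `get_cached_transcript` helpers, and the signature given to set_cached_transcript() are all assumptions for illustration, not the project's schema.

```python
import sqlite3


def open_cache(path: str = ":memory:") -> sqlite3.Connection:
    """Open (or create) a transcript cache with a composite primary key."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS transcripts ("
        " video_id TEXT NOT NULL,"
        " lang TEXT NOT NULL,"
        " source TEXT NOT NULL,"
        " text TEXT NOT NULL,"
        " PRIMARY KEY (video_id, lang))"
    )
    return conn


def set_cached_transcript(conn, video_id, lang, source, text) -> None:
    """Insert or overwrite the transcript for one (video_id, lang) pair."""
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "INSERT OR REPLACE INTO transcripts (video_id, lang, source, text)"
            " VALUES (?, ?, ?, ?)",
            (video_id, lang, source, text),
        )


def get_cached_transcript(conn, video_id, lang):
    """Return (source, text) for a cached transcript, or None on a miss."""
    return conn.execute(
        "SELECT source, text FROM transcripts WHERE video_id = ? AND lang = ?",
        (video_id, lang),
    ).fetchone()
```

Checking `get_cached_transcript()` before starting the chain would let repeated runs skip videos that already have a transcript in the requested language.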