By devinilabs
Generate section-by-section Markdown study notes from any video URL or local file, embedding screenshots and timestamped transcripts, with optional focus on specific topics or questions via AI analysis. Initialize sessions by running custom bash scripts.
npx claudepluginhub devinilabs/claude-watch --plugin claude-watch

Turn any tutorial or lecture video into structured study notes. Paste a URL, walk away, come back to a markdown file with embedded screenshots, a timestamped transcript, and Claude's synthesis — saved to a persistent library.
/claude-watch https://youtu.be/<lecture> backprop intuition
| Surface | Command |
|---|---|
| Claude Code | `/plugin marketplace add devinilabs/claude-watch` then `/plugin install claude-watch@claude-watch` |
| claude.ai (web) | Download `claude-watch.skill` from the latest release → Settings → Capabilities → Skills → + |
| Codex | `git clone https://github.com/devinilabs/claude-watch ~/.codex/skills/claude-watch` |
Downloads with yt-dlp (or accepts a local file). Extracts scene-change frames with ffmpeg, and inserts coverage-floor frames every 45s across long static gaps, so a lecture that holds one slide for 5 minutes still gets ~7 frames, not 1. Reads every frame as an image and writes notes.md to a strict template:

- `## TLDR` — 3–4 sentence synthesis
- `## Key Concepts` — bulleted with timestamps
- `## Notes` — one section per scene with embedded screenshot, on-screen text, what was said, Claude's synthesis
- `## Code & Commands` — every code-on-screen frame transcribed into a runnable fenced block
- `## Diagrams Referenced`, `## Open Questions`

Everything lands in `~/claude-watch/library/<slug>/` — re-running the same URL is a cache hit.

claude-video's uniform frame sampling spends its budget poorly on long lectures with slow-changing slides, and its answers live in chat, so you can't go back to "the notes from that video." claude-watch is opinionated for the tutorial workflow: scene-aware frames, a persistent library, and a structured notes file.
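The coverage-floor rule (one extra frame at least every 45s across a static gap) can be sketched as follows. This is a hypothetical illustration, not the plugin's actual code; the variable names and the exact spacing logic are assumptions:

```shell
# Hypothetical sketch of the coverage-floor rule for a 5-minute static gap.
gap_start=60    # a scene-change frame is already captured here
gap_end=360     # next scene change, 5 minutes later
floor=45        # coverage-floor interval in seconds

count=1         # count the scene-change frame at gap_start
t=$((gap_start + floor))
while [ "$t" -lt "$gap_end" ]; do
  count=$((count + 1))   # one coverage-floor frame per 45s step
  t=$((t + floor))
done
echo "$count"   # 7 frames across the otherwise static gap
```

With uniform sampling over a long lecture, this gap might receive only one frame; the floor guarantees the slide is still legible in the notes.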
/claude-watch <url-or-path> [topic]
/claude-watch ~/Lectures/cs231n.mp4 backpropagation derivation
/claude-watch https://youtu.be/<long> --start 5:00 --end 25:00
/claude-watch <url> --resolution 1024 # for slides with tiny code text
Flags: `--start`/`--end`, `--max-frames`, `--resolution`, `--scene-threshold`, `--max-gap`, `--whisper groq|openai`, `--no-whisper`, `--out-dir`.
Captions cover the majority of public videos for free. Whisper only kicks in when a video has no caption track.
| Need | Cost |
|---|---|
| Download + native captions | free (yt-dlp + ffmpeg) |
| Whisper fallback (preferred) | Groq whisper-large-v3 — cheap, fast |
| Whisper fallback (alt) | OpenAI whisper-1 |
| Disable Whisper | --no-whisper (frames-only when no captions) |
Keys go in `~/.config/claude-watch/.env` (mode 0600).
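A minimal sketch of setting up the key file. The variable names `GROQ_API_KEY` and `OPENAI_API_KEY` are assumptions (check the repo for the exact names), and the key values are placeholders:

```shell
# Hypothetical setup: create the key file with restrictive permissions.
mkdir -p ~/.config/claude-watch
cat > ~/.config/claude-watch/.env <<'EOF'
GROQ_API_KEY=gsk_...
OPENAI_API_KEY=sk-...
EOF
chmod 0600 ~/.config/claude-watch/.env   # readable/writable by owner only
```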
The library is keyed on `slug = YYYY-MM-DD-<title>-<short-hash>`, where the short hash is `sha1(source + focus_range)[:4]`. Re-running the same URL with the same focus range hits the cache — no re-download, no re-transcribe; only frames + notes regenerate. A different focus range means a different slug and a separate notes file.
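The short-hash half of the slug can be sketched like this. The exact concatenation of source and focus range is an assumption from the formula above:

```shell
# Hypothetical sketch of the slug's 4-char short hash.
source_url="https://youtu.be/abc123"
focus_range="5:00-25:00"
short_hash=$(printf '%s' "${source_url}${focus_range}" | sha1sum | cut -c1-4)
echo "$short_hash"   # same inputs always yield the same 4 hex chars
```

Because the hash is deterministic, re-running with identical inputs resolves to the same library directory, which is what makes the cache hit possible.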
To force a fresh run, delete the `meta.json` in the library dir.
To keep token cost down, use `--start`/`--end` to focus on the relevant span, and cap `--max-frames` (token cost grows linearly with frame count).

To develop locally:

git clone https://github.com/devinilabs/claude-watch
cd claude-watch
python3 -m pytest # full suite
bash scripts/build-skill.sh # → dist/claude-watch.skill (claude.ai bundle)
Releasing: tag vX.Y.Z, push the tag — CI builds and attaches claude-watch.skill.
MIT. Built on yt-dlp, ffmpeg, and Claude's multimodal Read tool. Whisper transcription via Groq or OpenAI.