/watch — give Claude eyes for video
Let Claude watch a video from almost anywhere — and transcribe it 100% on your machine, no API key.
Paste a URL (YouTube, Instagram, X/Twitter, Vimeo, TikTok, Loom, and ~1800 more) or a local file, ask a question, and Claude downloads the video, extracts frames, pulls a timestamped transcript, and reads every frame as an image. By the time it answers, it has seen the video and heard the audio.
/watch https://youtu.be/dQw4w9WgXcQ what happens at the 30 second mark?
This is a fork of bradautomates/claude-video. The big change: transcription runs locally via mlx-whisper (Apple Silicon) or openai-whisper (CPU) instead of a paid Whisper API. No keys, no config file, no audio ever leaves your machine. Also adds browser-cookie support so login-gated sources (Instagram, X) work.
Install
Claude Code (plugin):
/plugin marketplace add mathiaschu/claude-video
/plugin install watch@claude-video
Generic skill folder (Codex, etc.):
git clone https://github.com/mathiaschu/claude-video.git ~/.claude/skills/watch
Then run once to install dependencies (or let the first /watch do it):
python3 ~/.claude/skills/watch/scripts/setup.py
Windows: use python instead of python3 (the python3 alias is a Microsoft Store stub). ffmpeg + yt-dlp install via the winget commands the setup script prints; for transcription use pip install openai-whisper (mlx-whisper is Apple Silicon only).
Requirements
- Python 3.9+
- ffmpeg + yt-dlp — auto-installed via Homebrew on macOS; on Linux/Windows the installer prints the exact command.
- A local Whisper engine (only needed for videos without captions):
  - mlx-whisper (pip3 install mlx-whisper): preferred on Apple Silicon, runs on the GPU.
  - openai-whisper (pip3 install openai-whisper): CPU fallback, cross-platform.
There is no API key. Captions cover most public videos for free; when a video has no captions, the audio is transcribed locally.
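As a rough illustration of the engine preference described above, here is a minimal sketch. It is not the actual setup script: the function names are hypothetical, and it assumes the two packages are importable as mlx_whisper and whisper.

```python
import importlib.util
import platform
import sys

# Assumed import names: mlx-whisper -> mlx_whisper, openai-whisper -> whisper.
ENGINE_MODULES = {"mlx-whisper": "mlx_whisper", "openai-whisper": "whisper"}


def installed_engines():
    """Return the set of local Whisper packages that can be imported right now."""
    return {
        name
        for name, module in ENGINE_MODULES.items()
        if importlib.util.find_spec(module) is not None
    }


def pick_engine(available, system=sys.platform, machine=platform.machine()):
    """Prefer mlx-whisper on Apple Silicon, otherwise fall back to openai-whisper."""
    apple_silicon = system == "darwin" and machine == "arm64"
    if apple_silicon and "mlx-whisper" in available:
        return "mlx-whisper"
    if "openai-whisper" in available:
        return "openai-whisper"
    # Nothing usable is installed; a caller would print the pip install hint.
    return None
```

Calling pick_engine(installed_engines()) returns the engine name to use on the current machine, or None if neither package is installed.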
How it works
- You paste a video and a question. A URL (anything yt-dlp supports) or a local path (.mp4, .mov, .mkv, .webm, …).
- yt-dlp downloads it into a temp working directory. Local files are probed in place, no download.
- ffmpeg extracts frames at an auto-scaled rate. Duration-aware budget: ≤30s gets ~30 frames, 1-3min ~60, 3-10min ~80, longer 100 sparsely. Hard ceilings: 2 fps, 100 frames. JPEGs at 512px wide (bump with --resolution 1024 to read on-screen text).
- The transcript comes from one of two places. First: yt-dlp pulls native captions (free, instant). Fallback: a mono 16 kHz audio clip is transcribed locally with mlx-whisper / openai-whisper.
- Frames + transcript are handed to Claude. It reads each frame as an image and aligns them to the timestamped transcript.
- Claude answers grounded in what's on screen and in the audio — not the title, not a guess.
- Cleanup. The script prints its working directory; Claude removes it when you're done.
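The frame budget and the audio step above can be sketched roughly as follows. This is an approximation under stated assumptions, not the script's real code: the tier boundaries follow the numbers in the list, the 30s-1min range (not specified above) is assumed to share the 1-3min tier, and the function names and the frame_%04d.jpg output pattern are hypothetical. The ffmpeg options themselves (-vf fps/scale, -vn, -ac, -ar) are standard.

```python
MAX_FPS = 2.0      # hard ceiling from the list above
MAX_FRAMES = 100   # hard ceiling from the list above
FRAME_WIDTH = 512  # default JPEG width; --resolution 1024 raises it


def target_frames(duration_s):
    """Map a duration in seconds to the approximate frame budget."""
    if duration_s <= 30:
        return 30
    if duration_s <= 180:  # assumption: 30s-1min is treated like 1-3min
        return 60
    if duration_s <= 600:
        return 80
    return MAX_FRAMES


def frame_plan(duration_s):
    """Return (fps, frame_count) after applying both ceilings."""
    if duration_s <= 0:
        raise ValueError("duration must be positive")
    frames = target_frames(duration_s)
    fps = frames / duration_s
    if fps > MAX_FPS:
        # Very short clip: the fps ceiling wins, so fewer frames come out.
        fps = MAX_FPS
        frames = max(1, int(MAX_FPS * duration_s))
    return fps, frames


def frame_command(src, out_dir, fps, width=FRAME_WIDTH):
    """Build an ffmpeg argument list that samples frames as JPEGs."""
    # scale=W:-2 keeps the aspect ratio and rounds the height to an even number.
    video_filter = f"fps={fps:.4f},scale={width}:-2"
    return ["ffmpeg", "-i", src, "-vf", video_filter, f"{out_dir}/frame_%04d.jpg"]


def audio_command(src, out_wav):
    """Build an ffmpeg argument list for a mono 16 kHz audio clip."""
    return ["ffmpeg", "-i", src, "-vn", "-ac", "1", "-ar", "16000", out_wav]
```

For a 10-second clip the ~30-frame budget would need 3 fps, so the 2 fps ceiling applies and frame_plan(10) yields 20 frames; a one-hour video gets the full 100 frames at a much lower rate.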
Login-gated sources (Instagram, X, private videos)
Public videos download with no auth. For sources that require a login (Instagram, X/Twitter, age-restricted or private YouTube), /watch needs your own browser cookies so yt-dlp can authenticate as you. You don't have to set anything up in advance: just run /watch <url> normally, and if it fails with a login/private error, follow the steps below.
The easy way — borrow cookies from your browser
- Make sure you're logged into the site (e.g. Instagram) in a normal browser.
- Tell Claude which browser it is, or run it directly:
/watch https://www.instagram.com/reel/XXXX/ --cookies-from-browser chrome
Supported: chrome, firefox, safari, edge, brave, chromium, opera, vivaldi.
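Under the hood this presumably maps onto yt-dlp's own --cookies-from-browser option. A minimal sketch of how the flag might be validated and passed through, assuming a hypothetical helper named ytdlp_args (not taken from the actual script):

```python
SUPPORTED_BROWSERS = (
    "chrome", "firefox", "safari", "edge",
    "brave", "chromium", "opera", "vivaldi",
)


def ytdlp_args(url, browser=None):
    """Build a yt-dlp argument list, adding browser cookies only when asked."""
    args = ["yt-dlp"]
    if browser is not None:
        name = browser.lower()
        if name not in SUPPORTED_BROWSERS:
            raise ValueError(f"unsupported browser: {browser!r}")
        # yt-dlp reads and decrypts the cookie store of the named browser.
        args += ["--cookies-from-browser", name]
    return args + [url]
```

The resulting list can be handed to subprocess.run; leaving browser as None keeps the public-video path free of any cookie access.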
macOS gotchas (the usual culprits):
- Chrome must be fully quit (Cmd-Q, not just the window) — it locks its cookie database while running.
- A Keychain prompt may appear ("… wants to use confidential information stored in Chrome Safe Storage"). Click Always Allow — that's macOS letting the tool decrypt Chrome's cookies on your own machine.
- Safari requires giving your terminal / Claude Code Full Disk Access (System Settings → Privacy & Security → Full Disk Access).
- When in doubt, Firefox tends to "just work" without closing it.
The reliable way — export a cookies.txt
If browser extraction keeps fighting you, export the cookies manually (works everywhere):