Give Claude eyes for video
npx claudepluginhub binyamineden/letmewatchGive Claude eyes for video. Extracts key frames using scene detection and lets Claude analyze screen recordings, bug reports, tutorials, and any video content.
Share bugs, ideas, or general feedback.
Extract key frames with scene detection.
Claude sees every meaningful moment — and skips the rest.
| 1 Skill | 3 Commands | Scene Detection via ffmpeg | Audio via whisper |
/plugin marketplace add BinyaminEden/letmewatch
You send Claude a video. Claude watches it.
You: /letmewatch:video ~/Desktop/bug-recording.mov
Claude: Extracting frames... 15 scene changes detected.
At 00m00s: The app loads with the dashboard view.
At 00m12s: User clicks "Submit".
At 00m13s: Loading spinner appears.
At 00m18s: Error toast — "Failed to save" (500).
At 00m23s: Form resets, user input is lost.
The bug: API returns 500 on submit, and the form
clears input instead of preserving it.
Instead of processing video natively, letmewatch uses ffmpeg scene detection — it only captures frames where the visuals actually change. No idle screens, no wasted frames.
frame_01m23s.jpg)# macOS
brew install ffmpeg
# Ubuntu/Debian
sudo apt install ffmpeg
# Windows
winget install ffmpeg
# or: choco install ffmpeg
Via marketplace (recommended):
/plugin marketplace add BinyaminEden/letmewatch
/plugin install letmewatch@letmewatch
Via local clone:
git clone https://github.com/BinyaminEden/letmewatch.git ~/letmewatch
# Then in Claude Code:
/plugin install --path ~/letmewatch
# Standard
pip install openai-whisper
# Apple Silicon (faster)
pip install mlx-whisper
| Command | What it does |
|---|---|
/letmewatch:video <path> | Analyze a specific video file |
/letmewatch:video-last | Analyze the most recent video in your configured folder |
/letmewatch:video-dir <path> | Set/change where video-last looks for videos |
The plugin uses ffmpeg's scene detection filter to find frames where visuals change:
ffmpeg -vf "select='gt(scene,0.1)'" ...
This captures page navigations, modal dialogs, form submissions, error states, tab switches — any meaningful UI change. Static moments produce zero frames.
If scene detection finds fewer than 3 changes (e.g., a very static video or gradual scrolling), it automatically falls back to extracting frames at regular intervals:
| Video length | Interval |
|---|---|
| < 30 seconds | Every 2s |
| 30s – 3 min | Every 5s |
| > 3 min | Every 10s |
If whisper or mlx-whisper is installed, the plugin extracts and transcribes audio automatically. Claude sees both the visual frames and the spoken content.