watch-cli

Watch any social video → get an architecture diagram, working component, runnable notebook, or step-by-step cheat sheet — automatically.

Eyes and ears for your AI agent. watch-cli composes yt-dlp + ffmpeg + a Whisper-class ASR into a single command that hands an agent the raw materials to "watch" any video: VIDEO + FRAMES + TRANSCRIPT, ready for an LLM to read frames as images and transcript as text.

watch https://twitter.com/anyone/status/12345

Works on YouTube, X, LinkedIn, TikTok, Reddit, Vimeo, and Facebook. Login-walled posts (LinkedIn, private X, FB) fall back to your browser cookies automatically.

What you can build

Hand the watch output to your agent with one of five prompts in prompts/:

Drop in a video of…	Get back
A coding walkthrough	Working project files
A system architecture talk	Interactive architecture diagram
A UI / motion demo	Working React component
A paper or research talk	Runnable notebook
A long tutorial	Step-by-step cheat sheet

The prompt library is what turns "video → frames + transcript" into "video → working artifact". The full Prompt library section below has copy-paste templates.

Why this exists

Large language models can't watch video natively — they read text and look at still images. You can hand a video to a multimodal API and get back a chat-style summary, but for an agent workflow that's the wrong artifact: the agent wants the raw frames and the full transcript so it can reason for itself, not someone else's pre-digested recap.

A video is just frames + audio, and each piece already has a fast, near-free primitive:

yt-dlp downloads from any social platform
ffmpeg extracts evenly-spaced frames
An ASR model transcribes the audio
A multimodal LLM hears tone, music, SFX, language, mood

Compose them and your agent has the materials to watch any social video.

What it looks like

$ watch https://www.linkedin.com/posts/some-talk_activity-12345

VIDEO: /tmp/dl-video/abc123.mp4
DURATION: 218
FRAMES:
  /tmp/frames_abc123/frame_01.jpg
  /tmp/frames_abc123/frame_02.jpg
  …
TRANSCRIPT:
  Today I want to talk about how decomposition unlocks 10× cost reduction in
  multimodal pipelines …

Your agent reads the JPGs and the transcript. That's the whole watch.

Why pay-per-use, not subscription

Most subscription summary tools start around $15/month and deliver a polished, human-readable summary. If you're feeding an AI agent, that's the wrong artifact — agents need raw frames and the full transcript to reason for themselves, not someone else's pre-digested recap.

A typical research session is 1–3 videos, not 100. Through Kyma — the default backend — a 1-hour video costs ~$0.05 (transcribe is the only paid step; frame extraction is local ffmpeg).

This month you watch	You pay
0 videos	$0
1 one-hour video	~$0.05
100 one-hour videos	~$5

No monthly minimum, no seat license, no lock-in. The free credit at Kyma signup is enough to run the full pipeline end-to-end before you spend a cent.

Install

# macOS — Homebrew (recommended)
brew tap sonpiaz/tap
brew install watch-cli

# Any OS — curl
curl -fsSL https://github.com/sonpiaz/watch-cli/releases/latest/download/install.sh | bash

The curl one-liner auto-falls back to git clone of main if no published release tarball is reachable.

Claude Code (skill marketplace)

If you use Claude Code, install watch-cli as a skill:

/plugin marketplace add sonpiaz/watch-cli
/plugin install watch-cli@watch-cli

The agent then picks up watch <url> as a first-class command.

Pin a specific version:

WATCH_CLI_VERSION=0.3.0 curl -fsSL \
  https://github.com/sonpiaz/watch-cli/releases/download/v0.3.0/install.sh | bash

Or from a clone:

git clone https://github.com/sonpiaz/watch-cli ~/.watch-cli
cd ~/.watch-cli && ./install.sh

The installer checks for yt-dlp, ffmpeg, jq, curl, python3 and symlinks the commands into ~/.local/bin. On macOS:

brew install yt-dlp ffmpeg jq

On Debian/Ubuntu:

sudo apt install yt-dlp ffmpeg jq python3 curl

Optional install flags

./install.sh --with-skill   # also drop SKILL.md into ~/.claude/skills/watch-cli/
./install.sh --with-mcp     # print the npm install hint for the MCP stdio server

watch-cli

Popularity

What's Inside

README

watch-cli

What you can build

Why this exists

What it looks like

Why pay-per-use, not subscription

Install

Claude Code (skill marketplace)

Optional install flags

Confidence

Similar Plugins

watch

claude-watch

claude-video-vision

peepshow

youtube-transcribe-skill

videodb

Popularity

Health & Quality

Similar Plugins

watch

claude-watch

claude-video-vision

peepshow

youtube-transcribe-skill

videodb