By varunr89
Extract text from images, PDFs, and videos using Apple Vision, tesseract, and MLX OCR
npx claudepluginhub varunr89/claude-marketplace --plugin ocr-toolkit
Extract text from a directory of JPG/JPEG images into a single Markdown file using Apple Vision, tesseract, or MLX OCR
Extract text from PDF files using Apple Vision OCR, optimized for Apple Silicon
Extract text from video files by sampling frames and running Apple Vision OCR, with optional perceptual deduplication
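The "perceptual deduplication" mentioned for video frames can be pictured with a small sketch. This is not the plugin's actual implementation: it assumes an average-hash approach, assumes each sampled frame has already been downscaled to a tiny grayscale grid (for example 8x8), and the names `average_hash`, `hamming`, `dedupe`, and `max_distance` are all hypothetical.

```python
from typing import List, Optional, Sequence

# Assumption: a frame is a small grayscale grid with pixel values 0-255.
Frame = Sequence[Sequence[int]]


def average_hash(frame: Frame) -> int:
    """Pack one bit per pixel: 1 if the pixel is brighter than the frame mean."""
    pixels = [p for row in frame for p in row]
    mean = sum(pixels) / len(pixels)
    bits = 0
    for p in pixels:
        bits = (bits << 1) | (1 if p > mean else 0)
    return bits


def hamming(a: int, b: int) -> int:
    """Count the bit positions in which two hashes differ."""
    return bin(a ^ b).count("1")


def dedupe(frames: List[Frame], max_distance: int = 2) -> List[int]:
    """Return indices of frames whose hash differs enough from the last kept frame."""
    kept: List[int] = []
    last_hash: Optional[int] = None
    for i, frame in enumerate(frames):
        h = average_hash(frame)
        if last_hash is None or hamming(h, last_hash) > max_distance:
            kept.append(i)
            last_hash = h
    return kept
```

Only the kept frames would then be passed to OCR, so a slide that stays on screen for many sampled frames is read once rather than repeatedly.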
Claude Code plugins for OCR, scheduling, flight search, transcription, and developer workflows.
claude plugin marketplace add varunr89/claude-marketplace
Then install individual plugins:
claude plugin install <plugin-name>
| Plugin | Description |
|---|---|
| codex-collab | Automated Codex CLI reviews during design and implementation |
| ocr-toolkit | Extract text from images, PDFs, and videos |
| when2meet | Create When2Meet events and pre-fill availability |
| safari-archiver | Archive Safari pages to Obsidian as clean markdown |
| transcription | Fast audio transcription using MLX Whisper |
| flight-optimizer | Multi-leg flight search with scoring |
| scenario-test | Azure infrastructure scenario testing |
| config-sync | Sync ~/.claude/ config via git |
| phone-a-friend | Consult other AI models for second opinions |
| resume-tailoring | Tailored resumes with branching experience discovery |
| devlog | Auto-generate dev log blog posts |
CI (every PR): Static validation + fresh install E2E on macos-14 ($0 API cost)
# Run locally
bash ci/test-fresh-install.sh "$PWD"
- plugins/.claude-plugin/plugin.json with name, version, description
- skills/*/SKILL.md with YAML frontmatter
- tests/platform.json if plugin has OS/arch requirements
- .claude-plugin/marketplace.json to register the plugin
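The layout requirements above lend themselves to a quick pre-submission check. The following is a minimal sketch, not the repository's CI script: `validate_plugin` is a hypothetical helper, and it assumes the manifest lives at `.claude-plugin/plugin.json` inside the plugin directory you pass in and that frontmatter is delimited by `---` lines.

```python
import json
from pathlib import Path
from typing import List

# Fields the listing says plugin.json must carry.
REQUIRED_KEYS = ("name", "version", "description")


def validate_plugin(plugin_dir: Path) -> List[str]:
    """Return a list of problems; an empty list means the checked files look fine."""
    problems: List[str] = []

    manifest = plugin_dir / ".claude-plugin" / "plugin.json"
    if not manifest.is_file():
        problems.append(f"missing {manifest}")
    else:
        try:
            data = json.loads(manifest.read_text(encoding="utf-8"))
        except json.JSONDecodeError as exc:
            problems.append(f"{manifest}: invalid JSON ({exc})")
        else:
            if not isinstance(data, dict):
                problems.append(f"{manifest}: expected a JSON object")
            else:
                for key in REQUIRED_KEYS:
                    if not data.get(key):
                        problems.append(f"{manifest}: missing '{key}'")

    # Each skill file that exists must open with a YAML frontmatter block.
    for skill in sorted(plugin_dir.glob("skills/*/SKILL.md")):
        lines = [line.strip() for line in skill.read_text(encoding="utf-8").splitlines()]
        if not lines or lines[0] != "---" or "---" not in lines[1:]:
            problems.append(f"{skill}: missing YAML frontmatter")

    return problems
```

Calling `validate_plugin(Path("path/to/your-plugin"))` and printing the returned list gives a rough local signal before opening a PR. It does not check `tests/platform.json` or the marketplace registration.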
Agent Skills for visual AI tasks including image understanding, video processing, document extraction, and multi-modal generation using VLM Run's Orion agent
Turn videos into a sequence of relevant still frames + transcript + a self-contained HTML report so Claude can view them as images, hear the audio, and write its analysis back into the report. Pass a local path, an http(s) URL, or pipe video bytes on stdin.
Image and visual analysis with screenshot interpretation and text extraction
Let Claude watch a video from YouTube, Instagram, X, Vimeo, TikTok or any yt-dlp site. Downloads with yt-dlp, extracts auto-scaled frames with ffmpeg, transcribes locally with mlx-whisper (no API key, nothing leaves your machine), and hands frames + transcript to Claude.
Compose yt-dlp + ffmpeg + Whisper into a single command that hands an AI agent the raw materials to watch any social video — VIDEO + FRAMES + TRANSCRIPT, ready for an LLM to read frames as images and transcript as text.
Give Claude the ability to watch and understand videos — extracts frames and audio for full video perception
Tailored resumes with company research, branching experience discovery, and multi-format output
Archive Safari pages and PDFs to Obsidian as clean markdown with images and frontmatter
Validate code changes against Azure infrastructure with isolated scenario tests
Fast audio transcription using MLX Whisper on Apple Silicon with GPU/Neural Engine acceleration
Automatically sync ~/.claude/ configuration to a git repo on session start and end