VLM Run Skills are definitions for visual AI tasks like image understanding, video processing, and document extraction. They are interoperable with Anthropic's Claude Code.
The Skills in this repository follow the standardized Agent Skill format.
## How do Skills work?
In practice, skills are self-contained folders that package instructions, scripts, and resources for an AI agent to use on a specific task. Each folder includes a SKILL.md file with YAML frontmatter (name and description) followed by the guidance the agent follows while the skill is active.
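As a sketch, a minimal SKILL.md might look like the following (the name, description, and body text here are illustrative, not taken from an actual skill in this repository):

```markdown
---
name: image-captioning
description: Describe and caption images using VLM Run's visual AI capabilities.
---

# Image Captioning

When this skill is active, use the bundled instructions and scripts to
generate captions for the images the user provides.
```

The frontmatter identifies the skill to the agent; everything after it is the guidance the agent follows.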
## Features
### Image Intelligence
- Understanding & Captioning: Describe, analyze, and interpret images with state-of-the-art visual intelligence
- Detection & Localization: Detect and locate objects, people, faces, and custom entities with bounding boxes
- Segmentation: Segment objects, scenes, and regions with pixel-level precision
- Generation & Editing: Generate images from text, edit existing images, apply super-resolution, colorize B&W photos
- Tools: Crop, rotate, enhance resolution (4x-8x upscaling), de-oldify (colorization)
- Visual Grounding: Point to and extract specific elements using natural language queries
- UI Parsing: Extract UI elements, layouts, and hierarchies from screenshots
### Video Intelligence
- Understanding & Captioning: Describe video content, generate summaries and detailed scene analysis
- Transcription: Extract audio transcripts with timestamps
- Tools: Trim videos, extract keyframes, sample frames at intervals, detect highlights
- Segmentation: Identify and segment objects across video frames
- Generation & Editing: Generate videos from text prompts, edit existing videos
### Document Intelligence
- Layout Understanding: Detect headers, paragraphs, tables, figures, lists, and structural elements
- Multi-Page Analysis: Process and analyze PDFs with intelligent page-aware extraction
- Markdown Extraction: Convert documents to clean, structured markdown with preserved formatting
- Visual Grounding: Locate and extract specific fields, sections, or data points
- Data Extraction: Extract key information from invoices, receipts, contracts, forms into structured JSON
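As an illustration of the structured-JSON output above, an invoice extraction might return something along these lines (the exact schema depends on the task; every field name below is hypothetical):

```json
{
  "invoice_number": "INV-0042",
  "issue_date": "2024-03-15",
  "vendor": { "name": "Acme Corp" },
  "line_items": [
    { "description": "Widget", "quantity": 2, "unit_price": 9.99 }
  ],
  "total": 19.98
}
```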
### Multi-modal Agents
- Multi-Modal Reasoning: Execute complex multi-step workflows across images, documents, and videos
- Structured Outputs: Get results in validated JSON schemas with automatic retry logic
See docs and technical whitepaper for more information.
## Installation
### Prerequisites
- Get your VLM Run API key from app.vlm.run
- Have uv installed for Python environment management
### Claude Code
- Register the repository as a plugin marketplace:

  ```
  /plugin marketplace add vlm-run/skills
  ```

- To install a skill, run:

  ```
  /plugin install <skill-name>@vlm-run/skills
  ```

  For example:

  ```
  /plugin install vlmrun-cli-skill@vlm-run/skills
  ```
### Configure your API key
Once the skill is installed, configure your API key using the CLI:

```sh
vlmrun config init
vlmrun config set --api-key <your-api-key>
vlmrun config show
```
### Verify Installation
After restarting Claude Code, verify the skill is loaded by asking:

```
What skills are available in the /vlmrun-cli-skill?
```
### Installing in Claude for Desktop