Audit web UIs from URLs, screenshots, or images for usability heuristics, WCAG accessibility compliance, visual consistency, component reusability, and design system maturity. Generate prioritized reports with remediation roadmaps using AI vision CLI commands and Playwright-powered agents without writing code.
Use this command to verify WCAG 2.1/3.0 compliance with detailed remediation guidance and assistive technology assessment
Use this command to analyze component reusability patterns and identify consolidation opportunities
Use this command to calculate design debt ratio and assess design system maturity level
Use this command to run a comprehensive design audit across heuristics, accessibility, visual consistency, and design system governance
Use this command to validate design token compliance (single image mode) or detect visual regressions (regression mode)
Use this agent when you need deep WCAG 2.1/3.0 compliance verification, assistive technology assessment, and accessibility remediation guidance
Use this agent when you need comprehensive design evaluation across heuristics, accessibility, visual consistency, and design system governance
Use this agent when you need design system maturity assessment, component reusability analysis, and design debt evaluation
Use this agent when you need design token compliance validation or visual regression detection
Use this skill when aggregating design audit findings across multiple dimensions, deduplicating related issues, or synthesizing results into executive summaries with remediation roadmaps
Use this skill when assessing design system maturity, evaluating adoption health, analyzing component duplication, or calculating design debt ratios
Use playwright-cli to capture full-page screenshots, navigate complex flows, run Lighthouse audits, and test responsive design without writing code
Use this skill when validating design system token compliance, detecting visual regressions, or analyzing responsive design consistency across breakpoints and modes
Use this skill when verifying WCAG 2.1/3.0 compliance, mapping accessibility violations to specific criteria, or providing remediation guidance with code examples
Uses power tools
Uses Bash, Write, or Edit tools
A powerful Model Context Protocol (MCP) server that provides AI-powered image and video analysis using Google Gemini and Vertex AI models.
You can use either the google provider or the vertex_ai provider. For simplicity, the google provider is recommended.
Set the environment variables below according to your selected provider. (Note: setting your MCP client's timeout to more than 5 minutes is recommended.)
(i) Using Google AI Studio Provider
export IMAGE_PROVIDER="google" # or vertex_ai
export VIDEO_PROVIDER="google" # or vertex_ai
export GEMINI_API_KEY="your-gemini-api-key"
Get your API key from Google AI Studio.
(ii) Using Vertex AI Provider
export IMAGE_PROVIDER="vertex_ai"
export VIDEO_PROVIDER="vertex_ai"
export VERTEX_CLIENT_EMAIL="your-service-account@project.iam.gserviceaccount.com"
export VERTEX_PRIVATE_KEY="[REDACTED:Private Key]\n"
export VERTEX_PROJECT_ID="your-gcp-project-id"
export GCS_BUCKET_NAME="your-gcs-bucket"
Refer to the guideline here on how to set this up.
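Before launching the server, it can help to confirm that the variables for the chosen provider are actually set. The following is a minimal sketch and not part of ai-vision-mcp itself: the function name `check_vision_env` is hypothetical, it assumes `IMAGE_PROVIDER` is the variable that selects the provider, and it only checks the variable names listed above for presence, not for validity.

```shell
# Hypothetical helper (not shipped with ai-vision-mcp): report which of the
# variables listed above are missing for the provider in IMAGE_PROVIDER.
check_vision_env() {
  case "${IMAGE_PROVIDER:-}" in
    google)    _required="GEMINI_API_KEY" ;;
    vertex_ai) _required="VERTEX_CLIENT_EMAIL VERTEX_PRIVATE_KEY VERTEX_PROJECT_ID GCS_BUCKET_NAME" ;;
    *) echo "unknown IMAGE_PROVIDER: ${IMAGE_PROVIDER:-<unset>}" >&2; return 2 ;;
  esac
  _missing=""
  for _name in $_required; do
    # Look up the value of the variable whose name is stored in $_name.
    eval "_value=\${$_name:-}"
    [ -n "$_value" ] || _missing="$_missing $_name"
  done
  if [ -n "$_missing" ]; then
    echo "missing:$_missing" >&2
    return 1
  fi
  echo "ok: $IMAGE_PROVIDER"
}
```

Running `check_vision_env` after the `export` lines returns 0 when every variable for the selected provider is non-empty, 1 when some are missing, and 2 when the provider name is not one of the two described here.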
Below is the installation guide for this MCP server on different MCP clients, such as Claude Desktop, Claude Code, Cursor, and Cline.
Add to your Claude Desktop configuration:
(i) Using Google AI Studio Provider
{
  "mcpServers": {
    "ai-vision-mcp": {
      "command": "npx",
      "args": ["ai-vision-mcp"],
      "env": {
        "IMAGE_PROVIDER": "google",
        "VIDEO_PROVIDER": "google",
        "GEMINI_API_KEY": "your-gemini-api-key"
      }
    }
  }
}
(ii) Using Vertex AI Provider
{
  "mcpServers": {
    "ai-vision-mcp": {
      "command": "npx",
      "args": ["ai-vision-mcp"],
      "env": {
        "IMAGE_PROVIDER": "vertex_ai",
        "VIDEO_PROVIDER": "vertex_ai",
        "VERTEX_CLIENT_EMAIL": "your-service-account@project.iam.gserviceaccount.com",
        "VERTEX_PRIVATE_KEY": "[REDACTED:Private Key]\n",
        "VERTEX_PROJECT_ID": "your-gcp-project-id",
        "GCS_BUCKET_NAME": "ai-vision-mcp-{VERTEX_PROJECT_ID}"
      }
    }
  }
}
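The `{VERTEX_PROJECT_ID}` inside the `GCS_BUCKET_NAME` value reads like a placeholder for your project ID rather than literal text, although the source does not say so explicitly. Under that assumption, the bucket name can be sketched as follows; `bucket_name_for` is a hypothetical helper introduced only for illustration.

```shell
# Hypothetical helper: build the bucket name from a project ID, assuming
# "{VERTEX_PROJECT_ID}" in the config above is a placeholder to fill in.
bucket_name_for() {
  printf 'ai-vision-mcp-%s\n' "$1"
}

VERTEX_PROJECT_ID="your-gcp-project-id"
GCS_BUCKET_NAME="$(bucket_name_for "$VERTEX_PROJECT_ID")"
echo "$GCS_BUCKET_NAME"   # prints ai-vision-mcp-your-gcp-project-id
```

A JSON config file is not processed by a shell, so you would typically paste the resulting literal string into the `GCS_BUCKET_NAME` entry rather than a shell expression.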
(i) Using Google AI Studio Provider
claude mcp add ai-vision-mcp \
  -e IMAGE_PROVIDER=google \
  -e VIDEO_PROVIDER=google \
  -e GEMINI_API_KEY=your-gemini-api-key \
  -- npx ai-vision-mcp
(ii) Using Vertex AI Provider
claude mcp add ai-vision-mcp \
  -e IMAGE_PROVIDER=vertex_ai \
  -e VIDEO_PROVIDER=vertex_ai \
  -e VERTEX_CLIENT_EMAIL=your-service-account@project.iam.gserviceaccount.com \
  -e VERTEX_PRIVATE_KEY="[REDACTED:Private Key]\n" \
  -e VERTEX_PROJECT_ID=your-gcp-project-id \
  -e GCS_BUCKET_NAME=ai-vision-mcp-{VERTEX_PROJECT_ID} \
  -- npx ai-vision-mcp
Note: Increase the MCP startup timeout to 1 minute and the MCP tool execution timeout to about 5 minutes by updating ~/.claude/settings.json as follows:
{
  "env": {
    "MCP_TIMEOUT": "60000",
    "MCP_TOOL_TIMEOUT": "300000"
  }
}
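Both values are in milliseconds: 60000 ms is 1 minute and 300000 ms is 5 minutes. If you want different timeouts, the conversion can be sketched as below; `minutes_to_ms` is a hypothetical helper, not something provided by Claude Code or ai-vision-mcp.

```shell
# Hypothetical helper: convert whole minutes to the millisecond values
# used for MCP_TIMEOUT and MCP_TOOL_TIMEOUT.
minutes_to_ms() {
  echo $(( $1 * 60 * 1000 ))
}

minutes_to_ms 1   # prints 60000  (startup timeout above)
minutes_to_ms 5   # prints 300000 (tool execution timeout above)
```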
Go to: Settings -> Cursor Settings -> MCP -> Add new global MCP server
Pasting the following configuration into your Cursor ~/.cursor/mcp.json file is the recommended approach. You may also install in a specific project by creating .cursor/mcp.json in your project folder. See Cursor MCP docs for more info.
(i) Using Google AI Studio Provider
{
  "mcpServers": {
    "ai-vision-mcp": {
      "command": "npx",
      "args": ["ai-vision-mcp"],
      "env": {
        "IMAGE_PROVIDER": "google",
        "VIDEO_PROVIDER": "google",
        "GEMINI_API_KEY": "your-gemini-api-key"
      }
    }
  }
}
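For the project-level option mentioned above, the same configuration can be written to .cursor/mcp.json inside a project folder. The following is a rough sketch under stated assumptions: `write_cursor_mcp_config` is a hypothetical helper name, the JSON is the Google AI Studio configuration shown above, and the function overwrites any existing .cursor/mcp.json, so merge by hand if the project already has one.

```shell
# Hypothetical helper: create <project>/.cursor/mcp.json containing the
# Google AI Studio configuration shown above.
# WARNING: overwrites an existing .cursor/mcp.json in that project.
write_cursor_mcp_config() {
  _project_dir="$1"
  mkdir -p "$_project_dir/.cursor"
  cat > "$_project_dir/.cursor/mcp.json" <<'EOF'
{
  "mcpServers": {
    "ai-vision-mcp": {
      "command": "npx",
      "args": ["ai-vision-mcp"],
      "env": {
        "IMAGE_PROVIDER": "google",
        "VIDEO_PROVIDER": "google",
        "GEMINI_API_KEY": "your-gemini-api-key"
      }
    }
  }
}
EOF
}

# Usage (from the project root): write_cursor_mcp_config .
```

Replace `your-gemini-api-key` in the generated file with your real key, and avoid committing that file to version control once it contains one.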
Comprehensive design audit, accessibility, and consistency evaluation using Gemini CLI
Claude Code plugin with ntfy hooks, serper web search/skills, and ai-vision-mcp integration
npx claudepluginhub tan-yong-sheng/ai-vision-mcp --plugin design-eval
Image and visual analysis with screenshot interpretation and text extraction
Research-backed visual design principles for websites, presentations, documents, and any visual medium — grounded in VisAWI, Gestalt psychology, and empirical aesthetics research
AI-powered accessibility analysis - Interactive accessibility audit skill and automated agent with comprehensive WCAG compliance detection and reporting
UX co-pilot — conversational UX designer with live preview. 376 rules, 161 palettes, 57 font pairings, 67 styles. Phases: Discovery (ABCD questions), Audit (code scan + scored report), Preview (local server + named versions), Export (spec + React/Svelte/Vue components).
Systematic visual UI/UX audits producing phased, implementation-ready design plans. Purely visual refinement without touching functionality.