By vercel-labs
Automate browser tasks via Bash or CLI commands: navigate pages, interact with elements via clicks/fills/types, capture screenshots, extract text/HTML for web testing, form submission, data scraping, Electron app control, Slack workflows. Deploy on Vercel or AWS Bedrock for AI agents.
npx claudepluginhub vercel-labs/agent-browser --plugin agent-browserBrowser automation CLI for AI agents. Use when the user needs to interact with websites, including navigating pages, filling forms, clicking buttons, taking screenshots, extracting data, testing web apps, or automating any browser task. Triggers include requests to "open a website", "fill out a form", "click a button", "take a screenshot", "scrape data from a page", "test this web app", "login to a site", "automate browser actions", or any task requiring programmatic web interaction.
Run agent-browser on AWS Bedrock AgentCore cloud browsers. Use when the user wants to use AgentCore, run browser automation on AWS, use a cloud browser with AWS credentials, or needs a managed browser session backed by AWS infrastructure. Triggers include "use agentcore", "run on AWS", "cloud browser with AWS", "bedrock browser", "agentcore session", or any task requiring AWS-hosted browser automation.
Systematically explore and test a web application to find bugs, UX issues, and other problems. Use when asked to "dogfood", "QA", "exploratory test", "find issues", "bug hunt", "test this app/site/platform", or review the quality of a web application. Produces a structured report with full reproduction evidence -- step-by-step screenshots, repro videos, and detailed repro steps for every issue -- so findings can be handed directly to the responsible teams.
Automate Electron desktop apps (VS Code, Slack, Discord, Figma, Notion, Spotify, etc.) using agent-browser via Chrome DevTools Protocol. Use when the user needs to interact with an Electron app, automate a desktop app, connect to a running app, control a native app, or test an Electron application. Triggers include "automate Slack app", "control VS Code", "interact with Discord app", "test this Electron app", "connect to desktop app", or any task requiring automation of a native Electron application.
Interact with Slack workspaces using browser automation. Use when the user needs to check unread channels, navigate Slack, send messages, extract data, find information, search conversations, or automate any Slack task. Triggers include "check my Slack", "what channels have unreads", "send a message to", "search Slack for", "extract from Slack", "find who said", or any task requiring programmatic Slack interaction.
Run agent-browser + Chrome inside Vercel Sandbox microVMs for browser automation from any Vercel-deployed app. Use when the user needs browser automation in a Vercel app (Next.js, SvelteKit, Nuxt, Remix, Astro, etc.), wants to run headless Chrome without binary size limits, needs persistent browser sessions across commands, or wants ephemeral isolated browser environments. Triggers include "Vercel Sandbox browser", "microVM Chrome", "agent-browser in sandbox", "browser automation on Vercel", or any task requiring Chrome in a Vercel Sandbox.
Browser automation for web testing, form filling, screenshots, and data extraction via CLI
AI-powered browser automation -- lets Claude control real web browsers to navigate, click, type, extract content, and automate workflows
Development tool providing browser-based environment for testing and debugging web applications and scripts.
Browser automation with persistent page state. Use when users ask to navigate websites, fill forms, take screenshots, extract web data, test web apps, or automate browser workflows.
Comprehensive skill pack with 66 specialized skills for full-stack developers: 12 language experts (Python, TypeScript, Go, Rust, C++, Swift, Kotlin, C#, PHP, Java, SQL, JavaScript), 10 backend frameworks, 6 frontend/mobile, plus infrastructure, DevOps, security, and testing. Features progressive disclosure architecture for 50% faster loading.
Complete creative writing suite with 10 specialized agents covering the full writing process: research gathering, character development, story architecture, world-building, dialogue coaching, editing/review, outlining, content strategy, believability auditing, and prose style/voice analysis. Includes genre-specific guides, templates, and quality checklists.
Share bugs, ideas, or general feedback.
Browser automation CLI for AI agents. Fast native Rust CLI.
Installs the native Rust binary:
npm install -g agent-browser
agent-browser install # Download Chrome from Chrome for Testing (first time only)
For projects that want to pin the version in package.json:
npm install agent-browser
agent-browser install
Then use via package.json scripts or by invoking agent-browser directly.
brew install agent-browser
agent-browser install # Download Chrome from Chrome for Testing (first time only)
cargo install agent-browser
agent-browser install # Download Chrome from Chrome for Testing (first time only)
git clone https://github.com/vercel-labs/agent-browser
cd agent-browser
pnpm install
pnpm build
pnpm build:native # Requires Rust (https://rustup.rs)
pnpm link --global # Makes agent-browser available globally
agent-browser install
On Linux, install system dependencies:
agent-browser install --with-deps
Upgrade to the latest version:
agent-browser upgrade
Detects your installation method (npm, Homebrew, or Cargo) and runs the appropriate update command automatically.
agent-browser install to download Chrome from Chrome for Testing (Google's official automation channel). Existing Chrome, Brave, Playwright, and Puppeteer installations are detected automatically. No Playwright or Node.js required for the daemon.agent-browser open example.com
agent-browser snapshot # Get accessibility tree with refs
agent-browser click @e2 # Click by ref from snapshot
agent-browser fill @e3 "test@example.com" # Fill by ref
agent-browser get text @e1 # Get text by ref
agent-browser screenshot page.png
agent-browser close
agent-browser click "#submit"
agent-browser fill "#email" "test@example.com"
agent-browser find role button click --name "Submit"
agent-browser open <url> # Navigate to URL (aliases: goto, navigate)
agent-browser click <sel> # Click element (--new-tab to open in new tab)
agent-browser dblclick <sel> # Double-click element
agent-browser focus <sel> # Focus element
agent-browser type <sel> <text> # Type into element
agent-browser fill <sel> <text> # Clear and fill
agent-browser press <key> # Press key (Enter, Tab, Control+a) (alias: key)
agent-browser keyboard type <text> # Type with real keystrokes (no selector, current focus)
agent-browser keyboard inserttext <text> # Insert text without key events (no selector)
agent-browser keydown <key> # Hold key down
agent-browser keyup <key> # Release key
agent-browser hover <sel> # Hover element
agent-browser select <sel> <val> # Select dropdown option
agent-browser check <sel> # Check checkbox
agent-browser uncheck <sel> # Uncheck checkbox
agent-browser scroll <dir> [px] # Scroll (up/down/left/right, --selector <sel>)
agent-browser scrollintoview <sel> # Scroll element into view (alias: scrollinto)
agent-browser drag <src> <tgt> # Drag and drop
agent-browser upload <sel> <files> # Upload files
agent-browser screenshot [path] # Take screenshot (--full for full page, saves to a temporary directory if no path)
agent-browser screenshot --annotate # Annotated screenshot with numbered element labels
agent-browser screenshot --screenshot-dir ./shots # Save to custom directory
agent-browser screenshot --screenshot-format jpeg --screenshot-quality 80
agent-browser pdf <path> # Save as PDF
agent-browser snapshot # Accessibility tree with refs (best for AI)
agent-browser eval <js> # Run JavaScript (-b for base64, --stdin for piped input)
agent-browser connect <port> # Connect to browser via CDP
agent-browser stream enable [--port <port>] # Start runtime WebSocket streaming
agent-browser stream status # Show runtime streaming state and bound port
agent-browser stream disable # Stop runtime WebSocket streaming
agent-browser close # Close browser (aliases: quit, exit)
agent-browser close --all # Close all active sessions
agent-browser chat "<instruction>" # AI chat: natural language browser control (single-shot)
agent-browser chat # AI chat: interactive REPL mode