Automates headless browser tasks with Vercel's agent-browser CLI: navigate URLs, snapshot interactive elements with refs (@e1), click/fill/type, scroll, test web pages.
npx claudepluginhub tmchow/tmc-marketplace --plugin iterative-engineering
This skill uses the workspace's default tool permissions.
Vercel's headless browser CLI designed for AI agents. Uses ref-based selection (@e1, @e2) from accessibility snapshots.
command -v agent-browser >/dev/null 2>&1 && echo "Installed" || echo "NOT INSTALLED"
npm install -g agent-browser
agent-browser install # Downloads Chromium
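The check and the two install commands above can be folded into one helper that is safe to run repeatedly. This is a minimal sketch: `ensure_agent_browser` is a hypothetical function name, not part of the CLI, and it assumes `npm` is available on PATH.

```shell
# Hypothetical helper: install agent-browser only when it is missing.
ensure_agent_browser() {
  if command -v agent-browser >/dev/null 2>&1; then
    echo "agent-browser already installed"
    return 0
  fi
  # Same two steps as the manual install above.
  npm install -g agent-browser || return 1
  agent-browser install   # Downloads Chromium
}
```

Call `ensure_agent_browser` once at the start of a script, before any `agent-browser open`.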
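The snapshot + ref pattern suits LLMs well: a snapshot labels each element with a short ref (@e1, @e2), and later commands target those refs directly: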
# Step 1: Open URL
agent-browser open https://example.com
# Step 2: Get interactive elements with refs
agent-browser snapshot -i
# Step 3: Interact using refs
agent-browser click @e1
agent-browser fill @e2 "search query"
# Step 4: Re-snapshot after changes
agent-browser snapshot -i
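Because refs come from the most recent snapshot, it can help to pair every navigation or click with a fresh `snapshot -i`. The sketch below wraps steps 1-2 and steps 3-4 into two functions. The function names are hypothetical, and the `|| return 1` guards assume the CLI exits non-zero on failure, which is not confirmed here.

```shell
# Hypothetical wrappers around the four-step workflow above.

# Open a URL, then list its interactive elements with refs.
open_and_snapshot() {
  agent-browser open "$1" || return 1
  agent-browser snapshot -i
}

# Click a ref, then re-snapshot so later commands use current refs.
click_and_snapshot() {
  agent-browser click "$1" || return 1
  agent-browser snapshot -i
}
```

For example, `open_and_snapshot https://example.com` followed by `click_and_snapshot @e1`.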
agent-browser open <url> # Navigate to URL
agent-browser back # Go back
agent-browser forward # Go forward
agent-browser reload # Reload page
agent-browser close # Close browser
agent-browser snapshot # Full accessibility tree
agent-browser snapshot -i # Interactive elements only (recommended)
agent-browser snapshot -i --json # JSON output for parsing
agent-browser snapshot -c # Compact (remove empty elements)
agent-browser snapshot -d 3 # Limit depth
agent-browser snapshot -s @e5 # Scope to element subtree
agent-browser click @e1 # Click element
agent-browser dblclick @e1 # Double-click
agent-browser fill @e1 "text" # Clear and fill input
agent-browser type @e1 "text" # Type without clearing
agent-browser press Enter # Press key
agent-browser hover @e1 # Hover element
agent-browser check @e1 # Check checkbox
agent-browser uncheck @e1 # Uncheck checkbox
agent-browser select @e1 "option" # Select dropdown option
agent-browser scroll down 500 # Scroll (up/down/left/right)
agent-browser scrollintoview @e1 # Scroll element into view
agent-browser get text @e1 # Get element text
agent-browser get html @e1 # Get element HTML
agent-browser get value @e1 # Get input value
agent-browser get attr href @e1 # Get attribute
agent-browser get title # Get page title
agent-browser get url # Get current URL
agent-browser get count "button" # Count matching elements
agent-browser screenshot # Viewport screenshot
agent-browser screenshot --full # Full page
agent-browser screenshot output.png # Save to file
agent-browser pdf output.pdf # Save as PDF
agent-browser wait @e1 # Wait for element
agent-browser wait 2000 # Wait milliseconds
agent-browser wait "text" # Wait for text to appear
agent-browser wait --url "pattern" # Wait for URL match
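A condition-based wait is usually more robust than a fixed delay such as `wait 2000`. As a rough sketch, the hypothetical helper below clicks a ref and then blocks until the URL matches a pattern. The exact pattern syntax accepted by `--url` is not specified in this document, so the second argument is passed through unchanged.

```shell
# Hypothetical helper: click a ref, then wait for a URL match
# instead of sleeping for a fixed number of milliseconds.
click_then_wait_url() {
  # $1 = ref to click (e.g. @e3), $2 = URL pattern to wait for
  agent-browser click "$1" || return 1
  agent-browser wait --url "$2"
}
```

Usage: `click_then_wait_url @e3 "pattern"`, where the ref comes from the latest snapshot.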
agent-browser find role button click --name "Submit"
agent-browser find text "Sign up" click
agent-browser find label "Email" fill "user@example.com"
agent-browser find placeholder "Search..." fill "query"
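When labels are known in advance, semantic locators can drive a whole form without a snapshot. The sketch below chains three `find` commands into a login. The function name is hypothetical, and the labels "Email", "Password", and "Sign in" are assumptions about the target page, so adjust them to match the real form.

```shell
# Hypothetical helper: log in using semantic locators instead of refs.
# The labels below are assumed, not guaranteed to exist on every page.
login_with_locators() {
  # $1 = email, $2 = password
  agent-browser find label "Email" fill "$1" || return 1
  agent-browser find label "Password" fill "$2" || return 1
  agent-browser find role button click --name "Sign in"
}
```

Passing credentials from environment variables, for example `login_with_locators "$APP_EMAIL" "$APP_PASSWORD"` (both variable names are placeholders), keeps them out of the script itself.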
agent-browser --session browser1 open https://site1.com
agent-browser --session browser2 open https://site2.com
agent-browser session list
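Since each `--session` name refers to a separate browser, two pages can be opened side by side. The sketch below starts both in the background and waits for each to finish before listing sessions. `open_in_sessions` is a hypothetical name, and running the two `open` commands concurrently is an assumption about the CLI, not documented behavior.

```shell
# Hypothetical helper: open two URLs in separate named sessions.
open_in_sessions() {
  # $1 = URL for browser1, $2 = URL for browser2
  agent-browser --session browser1 open "$1" &
  pid1=$!
  agent-browser --session browser2 open "$2" &
  pid2=$!

  # Collect both exit statuses before deciding whether to continue.
  status=0
  wait "$pid1" || status=1
  wait "$pid2" || status=1
  [ "$status" -eq 0 ] || return 1

  agent-browser session list
}
```

Later commands still need the matching flag, for example `agent-browser --session browser1 snapshot -i`.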
# Profiles preserve cookies, localStorage, login sessions
agent-browser --profile ~/.myapp-profile open https://app.example.com
agent-browser open https://api.example.com --headers '{"Authorization": "Bearer <token>"}'
# After logging in via UI
agent-browser state save auth-state.json
# Reuse in future sessions
agent-browser state load auth-state.json
agent-browser open https://app.example.com # Already logged in
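The save/load pair above lends itself to a small guard: load the state file if it exists, otherwise remind the user to log in and save one. This is a sketch under that assumption, and `open_with_saved_state` is a hypothetical name.

```shell
# Hypothetical helper: reuse saved auth state when the file is present.
open_with_saved_state() {
  # $1 = path to state file, $2 = URL to open
  if [ -f "$1" ]; then
    agent-browser state load "$1" || return 1
  else
    echo "No saved state at $1; log in via the UI, then run: agent-browser state save $1" >&2
  fi
  agent-browser open "$2"
}
```

Usage: `open_with_saved_state auth-state.json https://app.example.com`. State files hold login data, so they are best kept out of version control.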
# Run with visible browser window
agent-browser --headed open https://example.com
agent-browser --headed snapshot -i
agent-browser --headed click @e1
agent-browser open https://app.example.com/login
agent-browser snapshot -i
# Output: textbox "Email" [ref=e1], textbox "Password" [ref=e2], button "Sign in" [ref=e3]
agent-browser fill @e1 "user@example.com"
agent-browser fill @e2 "password123"
agent-browser click @e3
agent-browser wait 2000
agent-browser snapshot -i # Verify logged in
agent-browser open https://news.ycombinator.com
agent-browser snapshot -i
agent-browser get text @e12 # Get headline text
agent-browser click @e12 # Click to open story
agent-browser open https://forms.example.com
agent-browser snapshot -i
agent-browser fill @e1 "John Doe"
agent-browser fill @e2 "john@example.com"
agent-browser select @e3 "United States"
agent-browser check @e4 # Agree to terms
agent-browser click @e5 # Submit
agent-browser screenshot confirmation.png
agent-browser snapshot -i --json
Returns structured data with refs for programmatic parsing.
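For scripted use, the JSON can be written to a file and handed to whatever parser the caller prefers. The sketch below only captures the output and checks that something was written. It makes no assumption about the JSON schema, which is not described here, and `save_snapshot_json` is a hypothetical name.

```shell
# Hypothetical helper: save the interactive snapshot as JSON.
save_snapshot_json() {
  # $1 = output file path
  agent-browser snapshot -i --json > "$1" || return 1
  # Succeed only if the snapshot produced non-empty output.
  [ -s "$1" ]
}
```

Usage: `save_snapshot_json snapshot.json`, then inspect the file with a JSON tool of your choice.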
| Feature | agent-browser (CLI) | Playwright MCP |
|---|---|---|
| Interface | Bash commands | MCP tools |
| Selection | Refs (@e1) | Refs (e1) |
| Output | Text/JSON | Tool responses |
| Parallel | Sessions | Tabs |
| Best for | Quick automation | Tool integration |
Use agent-browser when:
Use Playwright MCP when: