Browser automation and testing using Vercel's agent-browser CLI with ref-based element targeting
From casper
npx claudepluginhub casper-studios/casper-marketplace --plugin casper
This skill uses the workspace's default tool permissions.
references/authentication.md
references/commands.md
references/snapshot-workflow.md
references/testing-patterns.md
scripts/browser_test.py
Browser automation and end-to-end testing using Vercel's agent-browser CLI. Uses ref-based element targeting for reliable, AI-friendly browser interaction.
What do you need?
│
├─ Take a screenshot of a page?
│ └─ agent-browser open [url] && agent-browser screenshot
│
├─ Fill out a form?
│ └─ open → snapshot -i → fill @ref → click @submit → snapshot
│
├─ Test a login flow?
│ └─ See references/authentication.md
│
├─ Run an E2E test?
│ └─ See references/testing-patterns.md
│
├─ Scrape page content?
│ └─ agent-browser open [url] && agent-browser snapshot -i
│
└─ Debug element targeting?
  └─ agent-browser snapshot -i --format json
# Install agent-browser globally
npm install -g agent-browser
# Install browser dependencies (Chromium)
agent-browser install
# Verify installation
agent-browser --version
Agent-browser uses refs (like @e1, @e2, @e3) to identify interactive elements on the page. These refs are assigned when you take a snapshot.
# Take a snapshot with interactive elements labeled
agent-browser snapshot -i
# Output shows refs:
# @e1: [button] "Sign In"
# @e2: [input] Email field
# @e3: [input] Password field
# @e4: [button] "Submit"
# Use refs to interact
agent-browser click @e1
agent-browser fill @e2 "user@example.com"
Important: Refs are session-specific and invalidate when the page changes. Always re-snapshot after navigation or DOM updates.
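To act on refs from a script, the snapshot listing can be parsed into a lookup table. The sketch below assumes each interactive element appears on its own line in the `@eN: [role] label` shape shown in the example output above; the real CLI output may differ, and `parse_refs` is a hypothetical helper, not part of agent-browser.

```python
import re

# Assumed line shape, taken from the example output in this document:
#   @e1: [button] "Sign In"
REF_LINE = re.compile(r'^\s*(@e\d+):\s*\[(\w+)\]\s*(.*)$')


def parse_refs(snapshot_text):
    """Map each ref (e.g. '@e1') to a (role, label) tuple."""
    refs = {}
    for line in snapshot_text.splitlines():
        match = REF_LINE.match(line)
        if match:
            ref, role, label = match.groups()
            # Drop surrounding whitespace and quotes from the label.
            refs[ref] = (role, label.strip().strip('"'))
    return refs


sample = '''@e1: [button] "Sign In"
@e2: [input] Email field
@e3: [input] Password field
@e4: [button] "Submit"'''

refs = parse_refs(sample)
```

Because refs invalidate when the page changes, a table like this would need to be rebuilt from a fresh snapshot rather than cached across navigations.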
# 1. Open the target URL
agent-browser open https://example.com
# 2. Take a snapshot to see the page and get refs
agent-browser snapshot -i
# 3. Interact with elements using refs
agent-browser click @e1
agent-browser fill @e2 "test value"
# 4. Take another snapshot to verify changes
agent-browser snapshot -i
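The four steps above can also be driven from Python. This is a minimal sketch that uses only the subcommands shown in this document; `workflow_commands` and `run_workflow` are hypothetical names, and the refs are hard-coded for illustration, whereas in practice they would be read from the step 2 snapshot.

```python
import subprocess


def workflow_commands(url, actions):
    """Build the argv lists for open -> snapshot -> actions -> snapshot."""
    commands = [
        ["agent-browser", "open", url],
        ["agent-browser", "snapshot", "-i"],
    ]
    commands += [["agent-browser", *action] for action in actions]
    # Re-snapshot at the end, since earlier refs may no longer be valid.
    commands.append(["agent-browser", "snapshot", "-i"])
    return commands


def run_workflow(url, actions, runner=subprocess.run):
    """Run each command in order; check=True stops at the first failure."""
    for argv in workflow_commands(url, actions):
        runner(argv, check=True)


plan = workflow_commands(
    "https://example.com",
    [("click", "@e1"), ("fill", "@e2", "test value")],
)
```

Passing the runner as a parameter keeps the command plan inspectable without a browser: a test can supply a stub that records the argv lists instead of executing them.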
agent-browser open <url> # Navigate to URL
agent-browser back # Go back
agent-browser forward # Go forward
agent-browser refresh # Reload page
agent-browser snapshot # Text snapshot
agent-browser snapshot -i # With interactive refs
agent-browser snapshot --format json # JSON output
agent-browser screenshot [path] # Save screenshot
agent-browser click @ref # Click element
agent-browser fill @ref "value" # Fill input field
agent-browser select @ref "option" # Select dropdown option
agent-browser hover @ref # Hover over element
agent-browser press Enter # Press keyboard key
agent-browser find role button "Submit" # Find by ARIA role
agent-browser find text "Welcome" # Find by visible text
agent-browser find label "Email" # Find by label
agent-browser wait visible @ref # Wait for element visible
agent-browser wait hidden @ref # Wait for element hidden
agent-browser wait network # Wait for network idle
agent-browser wait time 2000 # Wait milliseconds
agent-browser session save mystate # Save browser state
agent-browser session load mystate # Load saved state
agent-browser session list # List saved sessions
agent-browser close # Close browser
Never commit these files:
*.state - Browser session state files contain cookies
agent-browser-profile/ - Profile directories with credentials
Add to .gitignore:
*.state
agent-browser-profile/
.agent-browser/
screenshots/
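A quick way to confirm these entries are in place is to check the `.gitignore` text for them. The sketch below is a hypothetical helper (`missing_ignore_patterns` is not part of agent-browser) and matches exact lines only, so it will not recognize an equivalent pattern that is written differently.

```python
# The four entries listed in this document.
REQUIRED_PATTERNS = [
    "*.state",
    "agent-browser-profile/",
    ".agent-browser/",
    "screenshots/",
]


def missing_ignore_patterns(gitignore_text):
    """Return the required patterns that do not appear as a line of their own."""
    present = {line.strip() for line in gitignore_text.splitlines()}
    return [pattern for pattern in REQUIRED_PATTERNS if pattern not in present]


example = "node_modules/\n*.state\nscreenshots/\n"
missing = missing_ignore_patterns(example)
```

In a real project the text would come from reading the repository's `.gitignore` file, and a non-empty result could fail a pre-commit check.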
# Research a topic, then verify claims on websites
parallel_research.py chat "Find pricing for Acme Corp"
# Then use agent-browser to verify on their actual pricing page
agent-browser open https://acme.com/pricing
agent-browser snapshot -i
# Take baseline screenshots for visual regression
agent-browser open https://myapp.com
agent-browser screenshot baseline.png
# After changes, compare
agent-browser screenshot current.png
# Use image comparison tool
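The comparison step is left to an external tool. As a stand-in, the sketch below checks whether two screenshot files are byte-for-byte identical by comparing SHA-256 digests. This is an assumption-laden simplification, not a perceptual diff: any rendering difference, however small, counts as a change. The function names are hypothetical.

```python
import hashlib
from pathlib import Path


def file_digest(path):
    """Return the SHA-256 hex digest of a file's contents."""
    return hashlib.sha256(Path(path).read_bytes()).hexdigest()


def screenshots_identical(baseline, current):
    """True only when both files contain exactly the same bytes."""
    return file_digest(baseline) == file_digest(current)
```

For example, `screenshots_identical("baseline.png", "current.png")` returns `True` only when nothing at all changed between the two captures. Pages with animations or timestamps would likely need a tolerance-based image comparison instead.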
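# Load test data from Google Sheets, run form tests
import subprocess

# get_sheet_data is assumed to come from a separate Google Sheets helper; it is not defined here
test_data = get_sheet_data("Form Test Cases")
for row in test_data:
    # @email, @password, and @submit are placeholders; substitute the refs (e.g. @e2) from a fresh snapshot -i
    subprocess.run(["agent-browser", "fill", "@email", row["email"]], check=True)
    subprocess.run(["agent-browser", "fill", "@password", row["password"]], check=True)
    subprocess.run(["agent-browser", "click", "@submit"], check=True)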
references/commands.md - Full command reference
references/authentication.md - Login flow patterns
references/testing-patterns.md - E2E test workflows
references/snapshot-workflow.md - Ref system deep dive
scripts/browser_test.py - Python automation wrapper
# Open the registration page
agent-browser open https://example.com/register
# Get element refs
agent-browser snapshot -i
# Fill the form (refs from snapshot output)
agent-browser fill @e1 "John Doe"
agent-browser fill @e2 "john@example.com"
agent-browser fill @e3 "SecurePass123!"
agent-browser select @e4 "United States"
agent-browser click @e5 # Terms checkbox
agent-browser click @e6 # Submit button
# Wait for navigation and verify
agent-browser wait network
agent-browser snapshot -i
# Take confirmation screenshot
agent-browser screenshot registration-success.png
Element not found:
Run agent-browser snapshot -i to get fresh refs
agent-browser find text "Submit"
Page not loading:
agent-browser open <url> --timeout 30000
agent-browser wait network
Session expired:
agent-browser session save backup
agent-browser session load backup