Skill

agent-browser

Automates browser tasks like E2E testing, form filling, screenshots, and scraping using Vercel's agent-browser CLI with ref-based element targeting.

Vercel

npm

Bash

testing

automation

npx claudepluginhub casper-studios/casper-marketplace --plugin casper

Popularity

Parent stars

Parent forks

Invocation

How this skill is triggered — by the user, by Claude, or both

Slash command

/casper:agent-browser

User invocable

Model invocable

Inline context

Default effort

Context Preview

The summary Claude sees in its skill listing — used to decide when to auto-load this skill

Browser automation and end-to-end testing using Vercel's agent-browser CLI. Uses ref-based element targeting for reliable, AI-friendly browser interaction.

Supporting Files

references/authentication.mdreferences/commands.mdreferences/snapshot-workflow.mdreferences/testing-patterns.mdscripts/browser_test.py

SKILL.md

240 lines · ~1.6k tokens

Similar Skills

agent-browser

Automates headless browser tasks with Vercel's agent-browser CLI: navigate URLs, snapshot interactive elements with refs (@e1), click/fill/type, scroll, test web pages.

iterative-engineering

agent-browser

Controls a headless browser via Vercel's agent-browser CLI for navigation, form filling, screenshots, and scraping using accessibility refs.

lavra

agent-browser

Automates headless browser tasks via Vercel's agent-browser CLI: navigate URLs, snapshot interactive elements with refs, click/fill forms, scrape data using Bash commands.

compound-engineering

Stats

LanguagePython

Parent stars11

Parent forks2

MaintenanceExcellent

Last CommitMar 2, 2026

Actions

View Source View Plugin View on GitHub View README

Help us improve

Share bugs, ideas, or general feedback.

Stats

Actions

Help us improve

Share bugs, ideas, or general feedback.

Agent Browser Testing Skill

Browser automation and end-to-end testing using Vercel's agent-browser CLI. Uses ref-based element targeting for reliable, AI-friendly browser interaction.

Quick Decision Tree

What do you need?
│
├─ Take a screenshot of a page?
│  └─ agent-browser open [url] && agent-browser screenshot
│
├─ Fill out a form?
│  └─ open → snapshot -i → fill @ref → click @submit → snapshot
│
├─ Test a login flow?
│  └─ See references/authentication.md
│
├─ Run an E2E test?
│  └─ See references/testing-patterns.md
│
├─ Scrape page content?
│  └─ agent-browser open [url] && agent-browser snapshot -i
│
└─ Debug element targeting?
   └─ agent-browser snapshot -i --format json

Installation

# Install agent-browser globally
npm install -g agent-browser

# Install browser dependencies (Chromium)
agent-browser install

# Verify installation
agent-browser --version

Core Concept: Ref-Based Targeting

Agent-browser uses refs (like @e1, @e2, @e3) to identify interactive elements on the page. These refs are assigned when you take a snapshot.

# Take a snapshot with interactive elements labeled
agent-browser snapshot -i

# Output shows refs:
# @e1: [button] "Sign In"
# @e2: [input] Email field
# @e3: [input] Password field
# @e4: [button] "Submit"

# Use refs to interact
agent-browser click @e1
agent-browser fill @e2 "user@example.com"

Important: Refs are session-specific and invalidate when the page changes. Always re-snapshot after navigation or DOM updates.

Essential Workflow

# 1. Open the target URL
agent-browser open https://example.com

# 2. Take a snapshot to see the page and get refs
agent-browser snapshot -i

# 3. Interact with elements using refs
agent-browser click @e1
agent-browser fill @e2 "test value"

# 4. Take another snapshot to verify changes
agent-browser snapshot -i

Common Commands Quick Reference

Navigation

agent-browser open <url>              # Navigate to URL
agent-browser back                    # Go back
agent-browser forward                 # Go forward
agent-browser refresh                 # Reload page

Snapshots

agent-browser snapshot                # Text snapshot
agent-browser snapshot -i             # With interactive refs
agent-browser snapshot --format json  # JSON output
agent-browser screenshot [path]       # Save screenshot

Interaction

agent-browser click @ref              # Click element
agent-browser fill @ref "value"       # Fill input field
agent-browser select @ref "option"    # Select dropdown option
agent-browser hover @ref              # Hover over element
agent-browser press Enter             # Press keyboard key

Semantic Locators

agent-browser find role button "Submit"    # Find by ARIA role
agent-browser find text "Welcome"          # Find by visible text
agent-browser find label "Email"           # Find by label

Waiting

agent-browser wait visible @ref            # Wait for element visible
agent-browser wait hidden @ref             # Wait for element hidden
agent-browser wait network                 # Wait for network idle
agent-browser wait time 2000               # Wait milliseconds

Session Management

agent-browser session save mystate         # Save browser state
agent-browser session load mystate         # Load saved state
agent-browser session list                 # List saved sessions
agent-browser close                        # Close browser

Security Notes

Never commit these files:

*.state - Browser session state files contain cookies
agent-browser-profile/ - Profile directories with credentials
Screenshots that may contain sensitive data

Add to .gitignore:

*.state
agent-browser-profile/
.agent-browser/
screenshots/

Integration with Other Skills

With Parallel Research

# Research a topic, then verify claims on websites
parallel_research.py chat "Find pricing for Acme Corp"
# Then use agent-browser to verify on their actual pricing page
agent-browser open https://acme.com/pricing
agent-browser snapshot -i

With Screenshot Comparison

# Take baseline screenshots for visual regression
agent-browser open https://myapp.com
agent-browser screenshot baseline.png

# After changes, compare
agent-browser screenshot current.png
# Use image comparison tool

With Form Data from Sheets

# Load test data from Google Sheets, run form tests
import subprocess
test_data = get_sheet_data("Form Test Cases")
for row in test_data:
    subprocess.run(["agent-browser", "fill", "@email", row["email"]])
    subprocess.run(["agent-browser", "fill", "@password", row["password"]])
    subprocess.run(["agent-browser", "click", "@submit"])

Files in This Skill

references/commands.md - Full command reference
references/authentication.md - Login flow patterns
references/testing-patterns.md - E2E test workflows
references/snapshot-workflow.md - Ref system deep dive
scripts/browser_test.py - Python automation wrapper

Example: Complete Form Test

# Open the registration page
agent-browser open https://example.com/register

# Get element refs
agent-browser snapshot -i

# Fill the form (refs from snapshot output)
agent-browser fill @e1 "John Doe"
agent-browser fill @e2 "john@example.com"
agent-browser fill @e3 "SecurePass123!"
agent-browser select @e4 "United States"
agent-browser click @e5  # Terms checkbox
agent-browser click @e6  # Submit button

# Wait for navigation and verify
agent-browser wait network
agent-browser snapshot -i

# Take confirmation screenshot
agent-browser screenshot registration-success.png

Troubleshooting

Element not found:

Re-run snapshot -i to get fresh refs
Use semantic locators: agent-browser find text "Submit"
Check if element is in an iframe

Page not loading:

Increase timeout: agent-browser open <url> --timeout 30000
Wait for network: agent-browser wait network

Session expired:

Save state before tests: agent-browser session save backup
Load state to restore: agent-browser session load backup

agent-browser

Popularity

Invocation

Context Preview

Supporting Files

SKILL.md

Similar Skills

Help us improve

Help us improve

Find plugins for your project

agent-browser

Popularity

Invocation

Context Preview

Supporting Files

SKILL.md

Agent Browser Testing Skill

Quick Decision Tree

Installation

Core Concept: Ref-Based Targeting

Essential Workflow

Common Commands Quick Reference

Navigation

Snapshots

Interaction

Semantic Locators

Waiting

Session Management

Security Notes

Integration with Other Skills

With Parallel Research

With Screenshot Comparison

With Form Data from Sheets

Files in This Skill

Example: Complete Form Test

Troubleshooting

Similar Skills

Help us improve

Agent Browser Testing Skill

Quick Decision Tree

Installation

Core Concept: Ref-Based Targeting

Essential Workflow

Common Commands Quick Reference

Navigation

Snapshots

Interaction

Semantic Locators

Waiting

Session Management

Security Notes

Integration with Other Skills

With Parallel Research

With Screenshot Comparison

With Form Data from Sheets

Files in This Skill

Example: Complete Form Test

Troubleshooting