Skill

agent-browser

Browser automation for web testing, form filling, screenshots, and data extraction. Ref-based workflow with best practices for reliable automation.

npx claudepluginhub jagreehal/jagreehal-claude-skills --plugin code-review

Slash command

/jagreehal-claude-skills:agent-browser

Stats

Language: Shell
Stars: 2
Forks: 1
Maintenance: Poor
Last Commit: Jan 15, 2026

Browser Automation with agent-browser

Automate browser interactions using a ref-based workflow. Navigate, snapshot, interact, repeat.

Installation

npm install -g agent-browser
agent-browser install              # Download Chromium
agent-browser install --with-deps  # Linux: install system deps

Quick Start

agent-browser open <url>          # Navigate to page
agent-browser snapshot -i         # Get interactive elements with refs
agent-browser click @e1           # Click element by ref
agent-browser fill @e2 "text"     # Fill input by ref
agent-browser close               # Close browser

Core Workflow

1. Navigate: agent-browser open <url>
2. Snapshot: agent-browser snapshot -i (returns elements with refs like @e1, @e2)
3. Interact: Use refs from the snapshot
4. Re-snapshot: After navigation or significant DOM changes
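
Because refs change between snapshots, a script can look a ref up by role and name instead of hardcoding it. The following is a minimal sketch, not part of agent-browser: `ref_for` is a hypothetical helper, and it assumes snapshot lines contain the form `role "name" [ref=eN]`, as in the sample output shown under Examples.

```shell
# Hypothetical helper (not an agent-browser command).
# Reads snapshot text on stdin and prints the first ref (as @eN) whose
# entry matches: <role> "<name>" [ref=eN]
# Assumption: the snapshot uses that exact layout for each element.
ref_for() {
  role=$1
  name=$2
  while IFS= read -r line; do
    case $line in
      *"$role \"$name\" [ref="*)
        rest=${line#*"$role \"$name\" [ref="}   # text after the marker
        printf '@%s\n' "${rest%%]*}"            # keep everything before "]"
        return 0
        ;;
    esac
  done
  return 1   # no matching element in this snapshot
}
```

Possible usage: `ref=$(agent-browser snapshot -i | ref_for button "Submit") && agent-browser click "$ref"`. The lookup runs against a fresh snapshot each time, so it never reuses a ref from an earlier page.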

Command Reference

Navigation

agent-browser open <url>          # Navigate to URL
agent-browser back                # Go back
agent-browser forward             # Go forward
agent-browser reload              # Reload page
agent-browser close               # Close browser

Snapshot (Page Analysis)

agent-browser snapshot            # Full accessibility tree
agent-browser snapshot -i         # Interactive elements only (recommended)
agent-browser snapshot -c         # Compact output
agent-browser snapshot -d 3       # Limit depth to 3
agent-browser snapshot -s "#main" # Scope to selector (large pages)

Interactions (Use @refs)

agent-browser click @e1           # Click
agent-browser dblclick @e1        # Double-click
agent-browser fill @e2 "text"     # Clear and type
agent-browser type @e2 "text"     # Type without clearing
agent-browser press Enter         # Press key
agent-browser press Control+a     # Key combination
agent-browser hover @e1           # Hover
agent-browser check @e1           # Check checkbox
agent-browser uncheck @e1         # Uncheck checkbox
agent-browser select @e1 "value"  # Select dropdown option
agent-browser scroll down 500     # Scroll page
agent-browser scrollintoview @e1  # Scroll element into view
agent-browser drag @e1 @e2        # Drag from source to target
agent-browser upload @e1 file.pdf # Upload file to input

Get Information

agent-browser get text @e1        # Get element text
agent-browser get value @e1       # Get input value
agent-browser get html @e1        # Get element HTML
agent-browser get attr @e1 href   # Get attribute value
agent-browser get title           # Get page title
agent-browser get url             # Get current URL

State Checking (Assertions)

agent-browser is visible @e1      # Check if visible
agent-browser is enabled @e1      # Check if enabled
agent-browser is checked @e1      # Check if checked

Screenshots

agent-browser screenshot          # Screenshot to stdout
agent-browser screenshot path.png # Save to file
agent-browser screenshot --full   # Full page screenshot

Wait

agent-browser wait @e1                 # Wait for element
agent-browser wait 2000                # Wait milliseconds (avoid)
agent-browser wait --text "Done"       # Wait for text to appear
agent-browser wait --url "**/dashboard"  # Wait for URL to match pattern
agent-browser wait --load networkidle  # Wait for network idle

Semantic Locators (Alternative to Refs)

agent-browser find role button click --name "Submit"
agent-browser find text "Sign In" click
agent-browser find label "Email" fill "user@test.com"

Sessions (Parallel Browsers)

agent-browser --session test1 open site-a.com
agent-browser --session test2 open site-b.com
agent-browser session list

State Management

agent-browser state save auth.json   # Save auth/cookies
agent-browser state load auth.json   # Restore state

Debugging

agent-browser open example.com --headed  # Show browser window
agent-browser console                    # View console messages
agent-browser errors                     # View page errors

Viewport & Emulation

agent-browser set viewport 1920 1080     # Set viewport size
agent-browser set device "iPhone 14"     # Emulate device
agent-browser set media dark             # Emulate dark mode
agent-browser set geo 37.7749 -122.4194  # Set geolocation
agent-browser set offline on             # Enable offline mode
agent-browser set offline off            # Disable offline mode

Storage & Cookies

agent-browser cookies                    # List all cookies
agent-browser cookies set name value     # Set cookie
agent-browser cookies clear              # Clear all cookies
agent-browser storage local              # List localStorage
agent-browser storage local get key      # Get localStorage item
agent-browser storage local set key val  # Set localStorage item
agent-browser storage session            # List sessionStorage

Network Interception

agent-browser network requests           # List network requests
agent-browser network route "**/api/*"   # Intercept matching URLs
agent-browser network route "**/api/*" --abort        # Block requests
agent-browser network route "**/api/*" --body '{"mock":true}'  # Mock response

Headers & Auth

agent-browser open api.example.com --headers '{"Authorization": "Bearer token"}'
agent-browser set headers '{"X-Custom": "value"}'

JSON Output

agent-browser snapshot -i --json    # Machine-readable output
agent-browser get text @e1 --json   # For programmatic parsing

Best Practices

Snapshots

MUST: Re-snapshot after any navigation or DOM mutation
MUST: Use snapshot -i (interactive only) to reduce noise
MUST: Re-snapshot after clicks that trigger page changes
NEVER: Cache refs across page navigations—refs are invalidated
NEVER: Assume refs persist after form submissions or route changes
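
The snapshot rules above can be folded into one step so that a click is always followed by a fresh snapshot. This is a sketch only: `click_and_resnapshot` is a hypothetical wrapper around the `click`, `wait --load networkidle`, and `snapshot -i` commands listed in this document.

```shell
# Hypothetical helper (not an agent-browser command).
# Clicks a ref, waits for the page to settle, then prints a fresh
# interactive snapshot. Refs read before the click should be discarded.
click_and_resnapshot() {
  ref=$1
  agent-browser click "$ref" || return 1
  agent-browser wait --load networkidle || return 1
  agent-browser snapshot -i
}
```

Possible usage: `click_and_resnapshot @e3`, then pick the next refs from the output it prints.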

Selectors

SHOULD: Prefer @refs from snapshot over semantic locators
MUST: Fall back to semantic locators when refs are unstable (dynamic content)
NEVER: Use brittle CSS selectors or XPath directly
SHOULD: Use find role + --name for buttons/links when refs fail
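
One way to apply the ref-first, locator-second rule is a wrapper that tries the ref and falls back to `find role` with `--name` only if the click fails. This is a sketch under the assumption that a failed click exits with a non-zero status; `click_with_fallback` is a hypothetical name.

```shell
# Hypothetical helper (not an agent-browser command).
# Tries the ref first; if that click fails, falls back to a semantic
# locator built from the element's role and accessible name.
# Assumption: agent-browser exits non-zero when a click fails.
click_with_fallback() {
  ref=$1
  role=$2
  name=$3
  if agent-browser click "$ref"; then
    return 0
  fi
  echo "Ref $ref failed; falling back to role=$role name=$name" >&2
  agent-browser find role "$role" click --name "$name"
}
```

Possible usage: `click_with_fallback @e3 button "Submit"`.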

Waits

MUST: Wait for networkidle after form submissions
MUST: Use wait --text "..." or wait @ref for dynamic content
NEVER: Use fixed wait 2000—flaky and slow
SHOULD: Set reasonable timeouts; fail fast on missing elements
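
To fail fast without a fixed sleep, a condition-based wait can be bounded from the outside. The sketch below assumes GNU coreutils `timeout` is installed, which this document does not cover; `wait_text_or_fail` is a hypothetical helper, and agent-browser may also apply its own default timeout.

```shell
# Hypothetical helper (not an agent-browser command).
# Waits for text to appear, but gives up after a bounded number of seconds.
# Assumption: GNU coreutils `timeout` is available on PATH.
wait_text_or_fail() {
  text=$1
  limit=${2:-10}   # seconds; 10 is an arbitrary default for this sketch
  if ! timeout "$limit" agent-browser wait --text "$text"; then
    echo "Timed out or failed waiting for text: $text" >&2
    return 1
  fi
}
```

Possible usage: `wait_text_or_fail "Upload complete" 15 || exit 1`. The wrapper returns as soon as the text appears, so the limit only costs time when something is actually wrong.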

Forms

MUST: Use fill (clears first) for inputs, not type
MUST: Verify submission with wait --text or wait --url
SHOULD: Snapshot after each step in multi-page flows
MUST: Handle validation errors—check for error text after submit
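
Checking for validation errors after a submit can be scripted by searching the page snapshot for known error text. This is a rough sketch: `check_submission` is a hypothetical helper, and it assumes the full `snapshot` output includes the visible text of the page.

```shell
# Hypothetical helper (not an agent-browser command).
# After a submit: return 2 if the error text is on the page, 0 if the
# success text is on the page, and non-zero otherwise.
# Assumption: `agent-browser snapshot` output contains visible page text.
check_submission() {
  success_text=$1
  error_text=$2
  agent-browser wait --load networkidle || return 1
  page=$(agent-browser snapshot) || return 1
  if printf '%s\n' "$page" | grep -qF -- "$error_text"; then
    echo "Validation error found: $error_text" >&2
    return 2
  fi
  printf '%s\n' "$page" | grep -qF -- "$success_text"
}
```

Possible usage, right after `agent-browser click @e3`: `check_submission "Thank you" "is required" || exit 1`.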

State & Auth

MUST: Save auth state after login for reuse (state save)
MUST: Load state before navigating to authenticated pages
NEVER: Commit auth state files to version control
SHOULD: Use separate state files per environment (dev/staging/prod)
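
A per-environment state file kept outside the repository covers most of the rules above. In this sketch, `APP_ENV`, `STATE_DIR`, `state_file`, and `load_state_if_present` are all illustrative names, not agent-browser settings; only `state load` comes from this document.

```shell
# Hypothetical helpers (not agent-browser commands).
# APP_ENV and STATE_DIR are illustrative variables for this sketch.

# Print the state file path for the current environment.
state_file() {
  env_name=${APP_ENV:-dev}
  printf '%s/auth-%s.json\n' "${STATE_DIR:-$HOME/.agent-browser-state}" "$env_name"
}

# Load saved state if the file exists; return 1 so the caller knows
# it has to log in first.
load_state_if_present() {
  file=$(state_file)
  [ -f "$file" ] || return 1
  agent-browser state load "$file"
}
```

Possible usage: call `load_state_if_present` before opening an authenticated page; if it returns non-zero, run the login flow and finish with `agent-browser state save "$(state_file)"`. If `STATE_DIR` has to live inside the repository, ignore `auth-*.json` in version control.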

Screenshots

MUST: Use screenshot --full for pages with scroll
SHOULD: Take screenshots before and after critical actions
MUST: Use explicit file paths in CI/CD pipelines
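
Before-and-after screenshots with explicit paths can be wrapped around any critical action. This is a sketch: `shot_around` and `SHOT_DIR` are hypothetical names, and the argument order `screenshot <path> --full` follows the Screenshot Comparison example in this document.

```shell
# Hypothetical helper (not an agent-browser command).
# Takes a full-page screenshot, runs the given command, takes a second
# screenshot, and returns the command's exit status.
# SHOT_DIR is an illustrative variable; point it at your CI artifact folder.
shot_around() {
  label=$1
  shift
  dir=${SHOT_DIR:-./screenshots}
  mkdir -p "$dir" || return 1
  agent-browser screenshot "$dir/$label-before.png" --full
  "$@"
  status=$?
  agent-browser screenshot "$dir/$label-after.png" --full
  return "$status"
}
```

Possible usage: `shot_around checkout agent-browser click @e3`. The second screenshot is taken even when the action fails, which tends to be when it is most useful.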

Sessions

SHOULD: Use named sessions for parallel browser testing
MUST: Close sessions explicitly when done
NEVER: Mix refs across different sessions
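
Named sessions are easier to keep apart, and to clean up, when every call goes through a small wrapper. The sketch below assumes that `--session <name>` can be combined with `close` in the same way this document combines it with `open`; `ab_session` and `close_sessions` are hypothetical names.

```shell
# Hypothetical helpers (not agent-browser commands).

# Run one agent-browser command inside a named session.
ab_session() {
  session=$1
  shift
  agent-browser --session "$session" "$@"
}

# Close every session named on the command line.
# Assumption: `--session <name> close` closes only that session.
close_sessions() {
  for session in "$@"; do
    agent-browser --session "$session" close
  done
}

# Example wiring in a test script (commented out so this block only
# defines functions):
#   trap 'close_sessions test1 test2' EXIT
#   ab_session test1 open site-a.com
#   ab_session test2 open site-b.com
```

Routing every command through `ab_session` keeps each ref tied to the session whose snapshot produced it.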

Error Handling

MUST: Check errors output when interactions fail silently
SHOULD: Use console to debug JavaScript issues
MUST: Re-snapshot and retry once before failing
SHOULD: Use --headed mode when debugging complex flows
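
The retry-once rule can be sketched as a wrapper that re-snapshots between attempts and prints page errors if the second attempt also fails. `retry_once` is a hypothetical helper, and it assumes failed commands exit non-zero. It repeats the same arguments, so it suits semantic locators best; if a ref itself has changed, read the new ref from the printed snapshot instead.

```shell
# Hypothetical helper (not an agent-browser command).
# Runs an agent-browser command; on failure, prints a fresh snapshot to
# stderr and retries once. If the retry fails too, prints page errors.
# Assumption: agent-browser exits non-zero when a command fails.
retry_once() {
  if agent-browser "$@"; then
    return 0
  fi
  echo "First attempt failed; re-snapshotting and retrying once" >&2
  agent-browser snapshot -i >&2
  if agent-browser "$@"; then
    return 0
  fi
  agent-browser errors >&2   # surface page errors for the failure report
  return 1
}
```

Possible usage: `retry_once find role button click --name "Submit" || exit 1`.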

Performance

MUST: Close browser when done (close)
SHOULD: Reuse sessions for multiple tests on same domain
NEVER: Open new browser for each small interaction
SHOULD: Use snapshot -c (compact) for large pages
SHOULD: Use snapshot -s "#scope" to limit snapshot to relevant DOM

Assertions

MUST: Use is visible before interacting with dynamic elements
MUST: Use is enabled before clicking buttons that may be disabled
SHOULD: Combine is visible + wait for elements that appear async
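
The checks above can be chained in front of a click. This document does not say how `is visible` and `is enabled` report their result, so the sketch below treats either a non-zero exit status or output containing `false` as a negative answer; that behavior is an assumption, and `ab_is` and `click_when_ready` are hypothetical names.

```shell
# Hypothetical helpers (not agent-browser commands).

# Succeed only if `agent-browser is <state> <ref>` reports a positive result.
# Assumption: a negative result is a non-zero exit status or output
# containing "false"; verify this against your agent-browser version.
ab_is() {
  out=$(agent-browser is "$@") || return 1
  case $out in
    *false*) return 1 ;;
  esac
  return 0
}

# Wait for the element, confirm it is visible and enabled, then click it.
click_when_ready() {
  ref=$1
  agent-browser wait "$ref" || return 1
  ab_is visible "$ref" || { echo "$ref is not visible" >&2; return 1; }
  ab_is enabled "$ref" || { echo "$ref is disabled" >&2; return 1; }
  agent-browser click "$ref"
}
```

Possible usage: `click_when_ready @e3 || exit 1`.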

Responsive Testing

MUST: Set viewport before navigation, not after
SHOULD: Use device presets for consistent mobile testing
MUST: Re-snapshot after viewport changes—layout affects refs
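
Setting the viewport before each navigation is simple to enforce in a loop over sizes. This is a sketch: `shoot_viewports` is a hypothetical helper, and the `WIDTHxHEIGHT` argument format is an invention of this example, not an agent-browser convention.

```shell
# Hypothetical helper (not an agent-browser command).
# Usage: shoot_viewports <url> <output-dir> <WIDTHxHEIGHT>...
# For each size: set the viewport, then navigate, then save a full-page
# screenshot named after the size.
shoot_viewports() {
  url=$1
  out_dir=$2
  shift 2
  mkdir -p "$out_dir" || return 1
  for size in "$@"; do
    width=${size%x*}    # text before the "x"
    height=${size#*x}   # text after the "x"
    agent-browser set viewport "$width" "$height" || return 1
    agent-browser open "$url" || return 1
    agent-browser screenshot "$out_dir/$size.png" --full || return 1
  done
}
```

Possible usage: `shoot_viewports https://example.com ./shots 375x812 1920x1080`. Take a new `snapshot -i` after each size change before interacting, since layout changes can alter which elements are exposed.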

Network & Mocking

SHOULD: Use network route --abort to test how the page handles failed requests; use set offline on to simulate being fully offline
SHOULD: Use network route --body to mock API responses in tests
MUST: Set up routes before navigating to the page
NEVER: Leave network intercepts active across unrelated tests

Storage

SHOULD: Use cookies clear between test scenarios
MUST: Check storage local when debugging auth issues
SHOULD: Use storage commands to set up test preconditions

Examples

Form Submission

agent-browser open https://example.com/form
agent-browser snapshot -i
# Output: textbox "Email" [ref=e1], textbox "Password" [ref=e2], button "Submit" [ref=e3]

agent-browser fill @e1 "user@example.com"
agent-browser fill @e2 "password123"
agent-browser click @e3
agent-browser wait --load networkidle
agent-browser snapshot -i  # Verify result

Login with State Persistence

# First time: login and save
agent-browser open https://app.example.com/login
agent-browser snapshot -i
agent-browser fill @e1 "username"
agent-browser fill @e2 "password"
agent-browser click @e3
agent-browser wait --url "**/dashboard"
agent-browser state save auth.json

# Later: load state and skip login
agent-browser state load auth.json
agent-browser open https://app.example.com/dashboard
# Already authenticated

Multi-Step Wizard

agent-browser open https://example.com/wizard

# Step 1
agent-browser snapshot -i
agent-browser fill @e1 "John Doe"
agent-browser click @e2  # Next button
agent-browser wait --text "Step 2"

# Step 2 - MUST re-snapshot after navigation
agent-browser snapshot -i
agent-browser select @e1 "Option B"
agent-browser click @e2  # Next button
agent-browser wait --text "Step 3"

# Step 3
agent-browser snapshot -i
agent-browser check @e1  # Terms checkbox
agent-browser click @e2  # Submit
agent-browser wait --text "Success"

Screenshot Comparison

agent-browser open https://example.com
agent-browser screenshot before.png --full
# ... make changes ...
agent-browser screenshot after.png --full

Responsive Testing

# Test mobile viewport
agent-browser set device "iPhone 14"
agent-browser open https://example.com
agent-browser snapshot -i
agent-browser screenshot mobile.png --full

# Test desktop
agent-browser set viewport 1920 1080
agent-browser reload
agent-browser snapshot -i  # Re-snapshot after viewport change
agent-browser screenshot desktop.png --full

API Mocking

# Set up mock before navigation
agent-browser network route "**/api/users" --body '[{"id":1,"name":"Test User"}]'
agent-browser open https://example.com/dashboard
agent-browser snapshot -i
# Page shows mocked data

# Test error handling
agent-browser network route "**/api/users" --abort
agent-browser reload
agent-browser wait --text "Failed to load"

File Upload

agent-browser open https://example.com/upload
agent-browser snapshot -i
# Output: file input [ref=e1], button "Upload" [ref=e2]

agent-browser upload @e1 /path/to/document.pdf
agent-browser click @e2
agent-browser wait --text "Upload complete"

Integration

Skill	Relationship
`testing-strategy`	E2E test patterns
`react-development`	Testing React UIs
`storybook-journeys`	Visual testing workflows
