Skill

agent-browser

From claudecode-research-harness-workflow

Automates browser interactions via the agent-browser CLI: navigation, form filling, clicking, screenshotting, and UI state checking. Use the AI snapshot workflow to interact with elements by reference.

testing

automation

Invocation

How this skill is triggered — by the user, by Claude, or both

Slash command

/claudecode-research-harness-workflow:agent-browser [url] [--headless]

Not user invocable

Model invocation disabled

Forked subagent

Default effort

Argument hint: [url] [--headless]

Tool Access

This skill is limited to the following tools:

Bash, Read

Context Preview

The summary Claude sees in its skill listing — used to decide when to auto-load this skill

A skill for browser automation. Uses the agent-browser CLI to perform UI debugging, verification, and automated interaction.

Supporting Files

references/ai-snapshot-workflow.md
references/browser-automation.md

SKILL.md

187 lines · ~1.2k tokens

Stats

Language: Shell

Stars: 14

Forks: 17

Maintenance: Excellent

Last Commit: Jun 3, 2026

Agent Browser Skill

A skill for browser automation. Uses the agent-browser CLI to perform UI debugging, verification, and automated interaction.

Trigger Phrases

This skill is automatically invoked by the following phrases:

"Open this page", "Check the URL"
"Click on", "Type into", "Fill the form"
"Take a screenshot"
"Check the UI", "Test the screen"
"open this page", "click on", "fill the form", "screenshot"

Features

| Feature | Details |
| --- | --- |
| Browser Automation | See references/browser-automation.md |
| AI Snapshot Workflow | See references/ai-snapshot-workflow.md |

Execution Steps

Step 0: Verify agent-browser

# Check installation
which agent-browser

# If not installed
npm install -g agent-browser
agent-browser install
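
The check above can also be written so that it reports the result without installing anything automatically. This is a minimal sketch: `have_cmd` is a hypothetical helper name, and it uses `command -v`, which is specified by POSIX and tends to be more portable than `which`.

```shell
# have_cmd NAME: succeed if NAME resolves to a command on PATH.
# (have_cmd is a hypothetical helper, not part of agent-browser.)
have_cmd() {
  command -v "$1" >/dev/null 2>&1
}

if have_cmd agent-browser; then
  echo "agent-browser found: $(command -v agent-browser)"
else
  # Only print the install hint; running it is left to the operator.
  echo "agent-browser not found; install with: npm install -g agent-browser && agent-browser install"
fi
```

Printing the install hint instead of running it keeps the check free of side effects, which is useful when the skill runs unattended.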

Step 1: Classify the User's Request

| Request Type | Action |
| --- | --- |
| Open a URL | `agent-browser open <url>` |
| Click an element | Snapshot → `agent-browser click @ref` |
| Fill a form | Snapshot → `agent-browser fill @ref "text"` |
| Check state | `agent-browser snapshot -i -c` |
| Take screenshot | `agent-browser screenshot <path>` |
| Debug | `agent-browser --headed open <url>` |

Step 2: AI Snapshot Workflow (Recommended)

For most operations, first take a snapshot and then interact using element references:

# 1. Open the page
agent-browser open https://example.com

# 2. Take a snapshot (AI-optimized, interactive elements only)
agent-browser snapshot -i -c

# Sample output:
# - link "Home" [ref=e1]
# - button "Login" [ref=e2]
# - input "Email" [ref=e3]
# - input "Password" [ref=e4]
# - button "Submit" [ref=e5]

# 3. Interact via element references
agent-browser click @e2           # Click Login button
agent-browser fill @e3 "user@example.com"
agent-browser fill @e4 "password123"
agent-browser click @e5           # Submit
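
Element references can change between snapshots, so it may help to look a reference up by role and label instead of hard-coding it. The sketch below assumes the snapshot lines have the same shape as the sample output above (`- button "Login" [ref=e2]`); `find_ref` is a hypothetical helper, and the real output format may differ.

```shell
# find_ref ROLE NAME: read snapshot text on stdin and print the first
# matching reference as @eN. Exits non-zero when nothing matches.
# (find_ref is a hypothetical helper; it assumes the sample line format.)
find_ref() {
  role=$1
  name=$2
  awk -v role="$role" -v name="$name" '
    {
      # Literal prefix expected on a matching line: - ROLE "NAME" [ref=
      want = "- " role " \"" name "\" [ref="
      pos = index($0, want)
      if (pos > 0) {
        rest = substr($0, pos + length(want))
        end = index(rest, "]")
        if (end > 0) {
          print "@" substr(rest, 1, end - 1)
          found = 1
          exit
        }
      }
    }
    END { exit found ? 0 : 1 }
  '
}

# Demo on fixed sample text, so no browser is needed.
sample='- link "Home" [ref=e1]
- button "Login" [ref=e2]
- input "Email" [ref=e3]'
printf '%s\n' "$sample" | find_ref button "Login"   # prints @e2

# In a live run the input would come from the CLI, for example:
#   ref=$(agent-browser snapshot -i -c | find_ref button "Login") &&
#     agent-browser click "$ref"
```

Matching on a literal prefix with `index` avoids regex escaping problems when a label contains characters such as `.` or `(`.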

Step 3: Verify the Result

# Check current state via snapshot
agent-browser snapshot -i -c

# Or check the URL
agent-browser get url

# Take a screenshot
agent-browser screenshot result.png
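
The URL check can be turned into a pass/fail test. This is a sketch under assumptions: `url_matches` is a hypothetical helper, and the `/dashboard` path is an invented example of a post-login page, not something the skill defines.

```shell
# url_matches ACTUAL PATTERN: succeed when ACTUAL matches the shell
# glob PATTERN. (Hypothetical helper, not part of agent-browser.)
url_matches() {
  case $1 in
    $2) return 0 ;;
    *)  return 1 ;;
  esac
}

# Demo with a fixed string; in a live run the value would come from:
#   actual=$(agent-browser get url)
actual="https://example.com/dashboard"
if url_matches "$actual" "https://example.com/dashboard*"; then
  echo "OK: landed on $actual"
else
  echo "FAIL: unexpected URL $actual" >&2
fi
```

A glob pattern is enough to tolerate trailing query strings while still catching a redirect back to a login page.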

Quick Reference

Basic Operations

| Command | Description |
| --- | --- |
| `open <url>` | Open a URL |
| `snapshot -i -c` | AI-optimized snapshot |
| `click @e1` | Click an element |
| `fill @e1 "text"` | Clear a field and fill it with text |
| `type @e1 "text"` | Type text into an element |
| `press Enter` | Press a key |
| `screenshot [path]` | Take a screenshot |
| `close` | Close the browser |

Navigation

| Command | Description |
| --- | --- |
| `back` | Go back |
| `forward` | Go forward |
| `reload` | Reload |

Retrieving Information

| Command | Description |
| --- | --- |
| `get text @e1` | Get text |
| `get html @e1` | Get HTML |
| `get url` | Current URL |
| `get title` | Page title |

Waiting

| Command | Description |
| --- | --- |
| `wait @e1` | Wait for an element |
| `wait 1000` | Wait 1000 ms (1 second) |
Debugging

| Command | Description |
| --- | --- |
| `--headed` | Show the browser window (flag, used with other commands) |
| `console` | Console logs |
| `errors` | Page errors |
| `highlight @e1` | Highlight element |

Session Management

Manage multiple tabs/sessions in parallel:

# Specify a session
agent-browser --session admin open https://admin.example.com
agent-browser --session user open https://example.com

# List sessions
agent-browser session list

# Operate in a specific session
agent-browser --session admin snapshot -i -c
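
When the same command needs to run in several sessions, a small loop avoids repeating the `--session` flag by hand. This is only a sketch: `each_session` and the `AGENT_BROWSER` override variable are hypothetical names introduced here so the helper can be dry-run with `echo`, and they are not part of the CLI.

```shell
# each_session "NAME1 NAME2 ..." ARGS...: run the same agent-browser
# command once per named session, stopping at the first failure.
# AGENT_BROWSER is a hypothetical override for the binary to run.
each_session() {
  sessions=$1
  shift
  for s in $sessions; do
    "${AGENT_BROWSER:-agent-browser}" --session "$s" "$@" || return 1
  done
}

# Dry run: substitute echo for the real binary to preview the commands.
( AGENT_BROWSER=echo; each_session "admin user" snapshot -i -c )
# prints:
#   --session admin snapshot -i -c
#   --session user snapshot -i -c

# Live usage would be, for example:
#   each_session "admin user" get url
```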

Choosing Between agent-browser and MCP Browser Tools

| Tool | Recommendation | Use Case |
| --- | --- | --- |
| agent-browser | ★★★ | First choice. Powerful AI-optimized snapshots |
| chrome-devtools MCP | ★★☆ | When Chrome is already open |
| playwright MCP | ★★☆ | Complex E2E testing |

Principle: Try agent-browser first; use MCP tools only if it doesn't work.

Notes

agent-browser runs in headless mode by default
Use --headed to show the browser
Sessions persist until explicitly closed
Use sessions for sites requiring authentication
