Browser automation CLI for AI agents. Use when testing web UIs, filling forms, taking screenshots, visual verification, or extracting page data. Triggers on "open browser", "click button", "screenshot", "visual diff", "test the page", "scrape", "fill form".
Automates browser interactions for visual verification, form testing, data extraction, and authenticated workflows.
npx claudepluginhub saadshahd/moo.mdThis skill inherits all available tools. When active, it can use any tool Claude has access to.
VERIFY AND INTERACT. Use agent-browser to prove UI changes work, test forms, extract data. Everything is ref-based: snapshot first, act by ref, verify by diff.
| Need | Approach |
|---|---|
| Prove UI change works | Visual verification flow |
| Test form behavior | Form testing flow |
| Site requires login | Auth flow first, then test |
| Extract structured data from page | Data extraction flow |
| Compare before/after visually | Diff-based evidence |
| Explore interactive elements | snapshot -i for interactive-only tree |
The core Claude Code use case — machine-verifiable proof that a UI change worked.
open URL → wait --load networkidle → screenshot before.png
→ [make code changes, reload app]
→ reload → wait --load networkidle → screenshot after.png
→ diff screenshot -b before.png → report mismatch %
Mismatch percentage is the evidence. Zero means identical. Non-zero means visible change — expected or not.
Use after: any UI change, hope:loop wave verification, hope:verify visual checks.
Validate all form paths — happy, empty, invalid.
open URL → wait --load networkidle → snapshot -i
→ read refs from interactive elements
→ fill @ref "value" for each field → click @submit-ref
→ wait --load networkidle → snapshot → verify outcome
Test all three paths:
Re-snapshot after each submission — refs invalidate on DOM change.
Sessions persist browser state across commands — login once, test many.
agent-browser --session-name myapp open login-url
→ wait --load networkidle → snapshot -i
→ fill @username "user" → fill @password "pass" → click @login
→ wait --url "/dashboard" → snapshot → verify logged in
All subsequent commands reuse the session:
agent-browser --session myapp open /settings → snapshot -i
For CI: state save ./auth.json after login, state load ./auth.json before test runs.
Pull structured data from rendered pages.
open URL → wait --load networkidle → snapshot
→ eval --stdin to extract structured data
Simple values: get text @ref returns the text content of a single element.
Multi-line extraction for complex structures:
agent-browser eval --stdin <<'EOF'
JSON.stringify(
[...document.querySelectorAll('.product')].map(el => ({
name: el.querySelector('h2').textContent,
price: el.querySelector('.price').textContent
}))
)
EOF
Alternative to refs when you know what you're looking for by label or role:
agent-browser find text "Submit" click
agent-browser find label "Email" fill "user@example.com"
agent-browser find role "button" click
agent-browser find placeholder "Search..." fill "query"
agent-browser find testid "login-btn" click
Locator type first, then value, then action. Defaults to click if no action specified.
myapp.localhost:1355 URLs — stable across restarts, no port guessingd3k agent-browser --cdp $(d3k cdp-port)snapshot -i filters to interactive elements onlyYou MUST use this before any creative work - creating features, building components, adding functionality, or modifying behavior. Explores user intent, requirements and design before implementation.