From ultrapowers-dev
Use when an AI agent needs to interact with websites — navigating pages, filling forms, clicking buttons, taking screenshots, extracting data, or automating any browser task
How this skill is triggered — by the user, by Claude, or both
Slash command
/ultrapowers-dev:browser-automationThe summary Claude sees in its skill listing — used to decide when to auto-load this skill
Give AI agents the ability to see and interact with the web. Choose the right tool based on your use case — quick tasks, testing, or programmatic automation.
Give AI agents the ability to see and interact with the web. Choose the right tool based on your use case — quick tasks, testing, or programmatic automation.
| Tool | Best for | Install | Language |
|---|---|---|---|
| agent-browser (Vercel) | Agent tasks, form filling, scraping, screenshots | npm i -g agent-browser | CLI (any) |
| browser-use | Python agent workflows, complex multi-step automation | pip install browser-use | Python |
| Playwright | E2E testing, CI/CD, deterministic test suites | npm i -D @playwright/test | JS/Python |
Need to test UI in CI? → Playwright (see e2e-testing skill)
Need agent to browse/interact live? → agent-browser or browser-use
Python project? → browser-use
Any language / CLI? → agent-browser
Every browser automation follows the same pattern:
agent-browser open https://example.com
agent-browser snapshot -i # Get element refs (@e1, @e2...)
agent-browser fill @e1 "text" # Fill input
agent-browser click @e3 # Click button
agent-browser screenshot page.png # Capture
Auth: Import from user's browser with agent-browser --auto-connect state save ./auth.json
Official skill: Install agent-browser from vercel-labs/agent-browser for full documentation.
browser-use open https://example.com
browser-use state # Get clickable elements with indices
browser-use click 5 # Click element by index
browser-use input 3 "text" # Fill input by index
browser-use screenshot page.png # Capture
Auth: Use --browser real --profile "Default" to access user's Chrome sessions.
Official skill: Install browser-use from browser-use/browser-use for full documentation.
| Use case | Recommended tool | Notes |
|---|---|---|
| Fill out a web form | agent-browser | Snapshot → fill → click → verify |
| Scrape data from pages | agent-browser | Navigate → snapshot → extract text |
| E2E UI test suite | Playwright | Deterministic, CI-friendly (see e2e-testing) |
| Login and perform task | agent-browser/browser-use | Import auth state, then automate |
| Monitor a dashboard | browser-use | Persistent sessions, screenshot on interval |
| Test across browsers | Playwright | Chromium, Firefox, WebKit |
| Mistake | Fix |
|---|---|
| Interacting before page loads | Always wait for network idle or specific element |
| Stale element references | Re-snapshot after any navigation or DOM change |
| Hardcoding selectors | Use tool's element indexing (refs/indices) over CSS selectors |
| No error recovery | Check state after each action, retry if needed |
| Running headed in CI | Use headless mode in pipelines |
Original — Datatide, MIT licensed. References vercel-labs/agent-browser (Apache-2.0) and browser-use/browser-use (MIT).
npx claudepluginhub ennio-datatide/ultrapowers-devAutomates headless browser tasks with Vercel's agent-browser CLI: navigate URLs, snapshot interactive elements with refs (@e1), click/fill/type, scroll, test web pages.
Automates headless browser tasks via Vercel's agent-browser CLI: navigate URLs, snapshot interactive elements with refs, click/fill forms, scrape data using Bash commands.
Automates browser interactions via the agent-browser CLI: navigation, form filling, clicking, screenshotting, and UI state checking. Use AI snapshot workflow to interact with elements by reference.