Help us improve
Share bugs, ideas, or general feedback.
From browser-harness
Controls the user's already-running Chrome via CDP to automate, scrape, test, or interact with web pages. Uses screenshots and coordinate clicks by default.
How this skill is triggered — by the user, by Claude, or both
Slash command
/browser-harness:browser-harnessThe summary Claude sees in its skill listing — used to decide when to auto-load this skill
Direct browser control via CDP. You drive the user's real browser with Python helpers run through the `browser-harness` command.
Share bugs, ideas, or general feedback.
Direct browser control via CDP. You drive the user's real browser with Python helpers run through the browser-harness command.
This skill is instructions only. It assumes the browser-harness command is already on $PATH. If command -v browser-harness fails, do the one-time install in references/install.md first, then continue. Installation and browser-connection setup are a prerequisite; once browser-harness <<'PY' … PY prints page info, never run install/connection steps again as part of normal work.
browser-harness <<'PY'
new_tab("https://docs.browser-use.com")
wait_for_load()
print(page_info())
PY
browser-harness — it's on $PATH after install. No cd, no uv run.new_tab(url), not goto_url(url) — goto runs in the user's active tab and clobbers their work.capture_screenshot() to understand the page, find visible targets, and decide whether you need a click, a selector, or more navigation.capture_screenshot() → read the pixel off the image → click_at_xy(x, y) → capture_screenshot() to verify. Suppress the Playwright-habit reflex of "locate first, then click" — no getBoundingClientRect, no selector hunt. Drop to DOM only when the target has no visible geometry. Hit-testing happens in Chrome's browser process, so clicks pass through iframes / shadow DOM / cross-origin without extra work.http_get(url) + ThreadPoolExecutor. No browser needed for static pages.wait_for_load().ensure_real_tab().print(page_info()) is the simplest "is this alive?" check; screenshots are the default way to verify whether a visible action worked.js(...) for inspection/extraction when a screenshot shows coordinates are the wrong tool.cdp("Domain.method", params).After every meaningful action, re-screenshot before assuming it worked.
Use remote for parallel sub-agents (each gets an isolated browser via a distinct BU_NAME) or on a headless server. BROWSER_USE_API_KEY must be set.
browser-harness <<'PY'
start_remote_daemon("work") # clean cloud browser; profileName=/profileId= to reuse a logged-in profile
PY
BU_NAME=work browser-harness <<'PY'
new_tab("https://example.com")
print(page_info())
PY
start_remote_daemon prints a liveUrl so the user can watch. Running remote daemons bill until timeout.
If you struggle with a specific UI mechanic, read the matching file under ${CLAUDE_PLUGIN_ROOT}/interaction-skills/ before inventing an approach. Available: browser-wall, connection, cookies, cross-origin-iframes, dialogs, downloads, drag-and-drop, dropdowns, iframes, network-requests, print-as-pdf, profile-sync, screenshots, scrolling, shadow-dom, tabs, uploads, viewport.
For task-specific helper additions, edit ${CLAUDE_PLUGIN_ROOT}/agent-workspace/agent_helpers.py. Keep core helpers short.
Community per-site playbooks live in ${CLAUDE_PLUGIN_ROOT}/agent-workspace/domain-skills/<host>/ and are off by default. Set BH_DOMAIN_SKILLS=1 to enable them; when enabled and the task is site-specific, read every file in the matching <site>/ directory before inventing an approach.
Input.dispatchMouseEvent goes through iframes/shadow/cross-origin at the compositor level.interaction-skills/ only when those are the wrong tool.npx claudepluginhub browser-use/browser-harness --plugin browser-harnessControls Chrome via DevTools Protocol for navigating, clicking, typing, multi-tab management, and content extraction with auto-screenshots.
Controls a live Chrome browser via puppeteer-core for automation, testing, and performance auditing. Use for clicking, typing, screenshots, DOM/AX tree, network interception, HAR export, Lighthouse audits, and device emulation.
Interacts with a live Chrome browser session via the Chrome DevTools Protocol. Reads page content, clicks elements, executes JS, navigates, and takes screenshots. Requires explicit user approval.