From superpowers-chrome
Controls Chrome via DevTools Protocol for navigating, clicking, typing, multi-tab management, and content extraction with auto-screenshots.
How this skill is triggered — by the user, by Claude, or both
Slash command
/superpowers-chrome:browsingThis skill is limited to the following tools:
The summary Claude sees in its skill listing — used to decide when to auto-load this skill
Control Chrome via DevTools Protocol using the `use_browser` MCP tool. Single unified interface with auto-starting Chrome.
COMMANDLINE-USAGE.mdEXAMPLES.mdREADME.mdchrome-wschrome-ws-lib.jshost-override.jslib/browser-bridge.jslib/browser-session.jslib/capture.jslib/cdp-router.jslib/cdp-utils.jslib/chrome-launcher-helpers.jslib/chrome-process.jslib/console-logging.jslib/cookies.jslib/dialogs-render.jslib/dialogs-router.jslib/dialogs.jslib/element-selector.jslib/evaluation.jsControl Chrome via DevTools Protocol using the use_browser MCP tool. Single unified interface with auto-starting Chrome.
Announce: "I'm using the browsing skill to control Chrome."
Use this when:
Use Playwright MCP when:
Every DOM action (navigate, click, type, select, eval, keyboard_press, hover, drag_drop, double_click, right_click, file_upload) automatically saves:
{prefix}.png — viewport screenshot{prefix}.md — page content as structured markdown{prefix}.html — full rendered DOM{prefix}-console.txt — browser console messagesFiles are saved to the session directory with sequential prefixes (001-navigate, 002-click, etc.). You must check these before using extract or screenshot actions.
Single MCP tool with action-based interface. Chrome auto-starts on first use.
Parameters:
action (required): Operation to performselector (optional): CSS or XPath selector for element operationspayload (optional): Action-specific data (string or object)timeout (optional): Timeout in ms for await operations (default: 5000)Active tab: Every action operates on the current activeTab. Use switch_tab to change it.
navigate: Navigate to URL
payload: URL string{action: "navigate", payload: "https://example.com"}await_element: Wait for element to appear
selector: CSS selectortimeout: Max wait time in ms{action: "await_element", selector: ".loaded", timeout: 10000}await_text: Wait for text to appear
payload: Text to wait for{action: "await_text", payload: "Welcome"}click: Click element
selector: CSS selector{action: "click", selector: "button.submit"}type: Text input
selector: Optional — clicks to focus firstpayload: Text to type (\t=Tab, \n=Enter){action: "type", selector: "#email", payload: "user@example.com"}double_click: Double-click element (fires dblclick event)
selector: CSS selector{action: "double_click", selector: ".item"}right_click: Right-click element (fires contextmenu event)
selector: CSS selector{action: "right_click", selector: ".row"}select: Select dropdown option
selector: CSS selectorpayload: Option value(s){action: "select", selector: "select[name=state]", payload: "CA"}keyboard_press: Press special keys (Tab, Enter, Escape, Arrow keys, F1-F12)
payload: Key name (string) or {"key": "Tab", "modifiers": {"shift": true, "ctrl": false, "alt": false, "meta": false}}{action: "keyboard_press", payload: "Tab"}{action: "keyboard_press", payload: {"key": "Tab", "modifiers": {"shift": true}}}These use CDP Input.dispatchMouseEvent, bypassing synthetic event restrictions.
hover: Move mouse over element (CSS :hover, tooltips, menus)
selector: CSS selector{action: "hover", selector: ".menu-trigger"}drag_drop: Drag element to target (native drag-and-drop via CDP)
selector: Source elementpayload: Target selector or JSON coordinates {"x":N,"y":N}{action: "drag_drop", selector: ".card", payload: ".column-2"}mouse_move: Move mouse to coordinates
payload: JSON {"x":N,"y":N} (optional: steps, fromX, fromY for smooth movement){action: "mouse_move", payload: "{\"x\":100,\"y\":200}"}scroll: Scroll via mouse wheel events
payload: Direction (up/down/left/right) or JSON {"deltaX":N,"deltaY":N}selector: Optional — scroll within element{action: "scroll", payload: "down"}selector: File input elementpayload: File path or JSON {"files":["/path/a.pdf","/path/b.jpg"]}{action: "file_upload", selector: "#upload", payload: "/tmp/doc.pdf"}extract: Get page content
payload: Format ('markdown'|'text'|'html')selector: Optional - limit to element{action: "extract", payload: "markdown"}{action: "extract", payload: "text", selector: "h1"}attr: Get element attribute
selector: CSS selectorpayload: Attribute name{action: "attr", selector: "a.download", payload: "href"}eval: Execute JavaScript
payload: JavaScript code{action: "eval", payload: "document.title"}payload: Filenameselector: Optional - screenshot specific element{action: "screenshot", payload: "/tmp/chart.png", selector: ".chart"}list_tabs: List all open tabs
{action: "list_tabs"}new_tab: Create new tab
{action: "new_tab"}close_tab: Close the active tab
{action: "close_tab"}switch_tab: Switch the active tab (sticky — stays until changed)
payload: Tab index (number), URL substring, or title substring{action: "switch_tab", payload: 1} (by index){action: "switch_tab", payload: "example.com"} (by URL substring){action: "switch_tab", payload: "GitHub"} (by title substring)show_browser: Make browser window visible (headed mode)
{action: "show_browser"}hide_browser: Switch to headless mode (invisible browser)
{action: "hide_browser"}browser_mode: Check current browser mode, port, and profile
{action: "browser_mode"}{"headless": true|false, "mode": "headless"|"headed", "running": true|false, "port": 9222, "profile": "name", "profileDir": "/path"}set_profile: Change Chrome profile (must kill Chrome first)
{action: "set_profile", "payload": "browser-user"}get_profile: Get current profile name and directory
{action: "get_profile"}{"profile": "name", "profileDir": "/path"}Default behavior: Chrome starts in headless mode with "superpowers-chrome" profile on a dynamically allocated port (range 9222-12111). Override the port with CHROME_WS_PORT; override the profile with CHROME_WS_PROFILE.
Auto-disambiguation across parallel MCPs:
When two MCP servers start on the same host with the default profile, the first claims superpowers-chrome (port 9222) and later ones silently fall through to superpowers-chrome-2 (port 9223), superpowers-chrome-3, etc. Each MCP drives its own Chrome with its own profile dir; they don't fight over activeTab. The bridge tracks ownership via a lock file at ~/.cache/superpowers/browser-profiles/<profile>.mcp.lock; stale locks (dead PIDs) are reclaimed automatically.
To opt out of disambiguation — e.g., to intentionally share Chrome between a long-lived chrome-ws CLI session and your MCP — set the profile name explicitly:
CHROME_WS_PROFILE=my-profile{action: "set_profile", payload: "my-profile"} at runtimeAn explicit profile name still acquires the lock, but on conflict the bridge shares rather than disambiguates — the second process reconnects to the first's Chrome (the original reconnect-on-restart behavior).
kill_chrome: Kill the Chrome process this MCP is driving
{action: "kill_chrome"}restart_chrome: kill_chrome + immediate spawn
{action: "restart_chrome"}Auto-restart banner: when the bridge has to spawn a fresh Chrome (because the previous one died or was killed externally — e.g., kill -9 <pid> from the shell), the first response after the restart prepends:
[Chrome auto-restarted; URL reset to about:blank. Re-navigate to continue.]
Treat this as a signal that your prior URL / tab state is gone — re-navigate before assuming anything about the current page.
Capture browser console output for the active tab. Buffer is keyed by the page session's sessionId, so it survives close_tab/new_tab ordering quirks. Levels: log, info, warn, error.
enable_console_logging: Start capturing
{action: "enable_console_logging"}get_console_messages: Read captured messages
{action: "get_console_messages"}{action: "get_console_messages", payload: {since: 1716000000000}}{timestamp, level, text} entriesclear_console_messages: Reset the buffer
{action: "clear_console_messages"}Native dialogs (JS alert/confirm/prompt, beforeunload, HTTP basic-auth, permission prompts, device choosers) pause the page. While a dialog is open, page-targeted actions (extract, click, eval, etc.) return a refusal whose text contains Page is behind a dialog and lists the available dialog::* selectors.
When a dialog fires during a navigate (typical for HTTP basic-auth), navigate itself throws with the dialog grammar in the message — you don't have to issue a separate page-targeted call to discover the dialog.
Handle dialogs by clicking/typing a dialog::* selector:
{action: "click", selector: "dialog::accept"} — accept JS alert/confirm/prompt, beforeunload, permission grant{action: "click", selector: "dialog::dismiss"} — dismiss / cancel / deny{action: "type", selector: "dialog::prompt", payload: "text"} then accept — respond to JS prompt{action: "type", selector: "dialog::username", payload: "alice"} + {action: "type", selector: "dialog::password", payload: "secret"} + {action: "click", selector: "dialog::accept"} — HTTP basic-auth{action: "click", selector: "dialog::device[id=\"<deviceId>\"]"} — pick a WebUSB/Bluetooth/Serial/HID deviceCritical caveats when toggling modes:
When to use headed mode:
When to stay in headless mode (default):
Profile management: Profiles store persistent browser data (cookies, localStorage, extensions, auth sessions).
Profile locations:
~/Library/Caches/superpowers/browser-profiles/{name}/~/.cache/superpowers/browser-profiles/{name}/%LOCALAPPDATA%/superpowers/browser-profiles/{name}/When to use separate profiles:
Profile data persists across:
To use a different profile:
await chromeLib.killChrome(){action: "set_profile", "payload": "my-profile"}Navigate and extract:
{action: "navigate", payload: "https://example.com"}
{action: "await_element", selector: "h1"}
{action: "extract", payload: "text", selector: "h1"}
{action: "navigate", payload: "https://example.com/login"}
{action: "await_element", selector: "input[name=email]"}
{action: "type", selector: "input[name=email]", payload: "user@example.com"}
{action: "type", selector: "input[name=password]", payload: "pass123"}
{action: "keyboard_press", payload: "Enter"}
{action: "await_text", payload: "Welcome"}
Uses keyboard_press to submit the form.
{action: "list_tabs"}
{action: "switch_tab", payload: 2}
{action: "click", selector: "a.email"}
{action: "await_element", selector: ".content"}
{action: "extract", payload: "text", selector: ".amount"}
{action: "navigate", payload: "https://example.com"}
{action: "type", selector: "input[name=q]", payload: "query"}
{action: "click", selector: "button.search"}
{action: "await_element", selector: ".results"}
{action: "extract", payload: "text", selector: ".result-title"}
{action: "navigate", payload: "https://example.com"}
{action: "await_element", selector: "a.download"}
{action: "attr", selector: "a.download", payload: "href"}
{action: "eval", payload: "document.querySelectorAll('a').length"}
{action: "eval", payload: "Array.from(document.querySelectorAll('a')).map(a => a.href)"}
Use eval to resize the browser window for testing responsive layouts:
{action: "eval", payload: "window.resizeTo(375, 812); 'Resized to mobile'"}
{action: "eval", payload: "window.resizeTo(768, 1024); 'Resized to tablet'"}
{action: "eval", payload: "window.resizeTo(1920, 1080); 'Resized to desktop'"}
Note: This resizes the window, not device emulation. It won't change:
For most responsive testing, window resize is sufficient.
Use eval to clear cookies accessible to JavaScript:
{action: "eval", payload: "document.cookie.split(';').forEach(c => { document.cookie = c.trim().split('=')[0] + '=;expires=Thu, 01 Jan 1970 00:00:00 GMT;path=/'; }); 'Cookies cleared'"}
Note: This clears cookies accessible to JavaScript. It won't clear:
For most logout/reset scenarios, this is sufficient.
{action: "scroll", payload: "down"}
{action: "scroll", payload: "up"}
{action: "scroll", selector: ".container", payload: "{\"deltaX\":0,\"deltaY\":500}"}
Uses real mouse wheel events (vs eval + scrollTo which bot detectors flag).
Always wait before interaction: Don't click or fill immediately after navigate - pages need time to load.
// BAD - might fail if page slow
{action: "navigate", payload: "https://example.com"}
{action: "click", selector: "button"} // May fail!
// GOOD - wait first
{action: "navigate", payload: "https://example.com"}
{action: "await_element", selector: "button"}
{action: "click", selector: "button"}
Use specific selectors: Avoid generic selectors that match multiple elements.
// BAD - matches first button
{action: "click", selector: "button"}
// GOOD - specific
{action: "click", selector: "button[type=submit]"}
{action: "click", selector: "#login-button"}
Submit forms:
Use keyboard_press with Enter after type, or append \n to the payload.
{action: "type", selector: "#search", payload: "query"}
{action: "keyboard_press", payload: "Enter"}
Check content first: Extract page content to verify selectors before building workflow.
{action: "extract", payload: "html"}
Element not found:
await_element before interactionextract action using 'html' formatTimeout errors:
{timeout: 30000} for slow pagesWrong tab active:
list_tabs to see all open tabsswitch_tab with a URL or title substring to reliably switch tabseval returns [object Object]:
JSON.stringify() for complex objects: {action: "eval", payload: "JSON.stringify({name: 'test'})"}{action: "eval", payload: "JSON.stringify(await yourAsyncFunction())"}When building test automation, you have two approaches:
Best for: Single-step tests, direct Claude control during conversation
{"action": "navigate", "payload": "https://app.com"}
{"action": "click", "selector": "#test-button"}
{"action": "eval", "payload": "JSON.stringify({passed: document.querySelector('.success') !== null})"}
Best for: Multi-step test suites, standalone automation scripts
Key insight: chrome-ws is the reference implementation showing proper Chrome DevTools Protocol usage. When use_browser doesn't work as expected, examine how chrome-ws handles the same operation.
# Example: Automated form testing
./chrome-ws navigate 0 "https://app.com/form"
./chrome-ws fill 0 "#email" "test@example.com"
./chrome-ws click 0 "button[type=submit]"
./chrome-ws wait-text 0 "Success"
For command-line usage outside Claude Code, see COMMANDLINE-USAGE.md.
For detailed examples, see EXAMPLES.md.
Full CDP documentation: https://chromedevtools.github.io/devtools-protocol/
npx claudepluginhub obra/superpowers-chrome --plugin superpowers-chromeControls local Chrome/Chromium via CDP for signed-in profiles, anonymous sessions, screenshots, console logs, network capture, form filling, uploads, downloads, and PDF export.
Automates browser interactions via Chrome DevTools Protocol. Screenshots, clicks, types, navigates, reads page accessibility trees, extracts text, and executes JavaScript in web pages. Use when the user asks to interact with a website, test a web app, fill web forms, scrape web content, or automate browser tasks.
Interacts with a live Chrome browser session via the Chrome DevTools Protocol. Reads page content, clicks elements, executes JS, navigates, and takes screenshots. Requires explicit user approval.