Skill

dev-browser

Browser automation and scraping with persistent page state. Use when users ask to navigate websites, fill forms, take screenshots, extract web data, test web apps, or automate browser workflows. Trigger phrases include "go to [url]", "click on", "fill out the form", "take a screenshot", "scrape", "automate", "test the website", "log into", or any browser interaction request.

From dev-browser

Install

Run in your terminal

npx claudepluginhub grailautomation/claude-plugins --plugin dev-browser

Tool Access

This skill uses the workspace's default tool permissions.

Supporting Assets

View in Repository

bun.lock

package-lock.json

package.json

references/aria-snapshots.md

references/client-api.md

references/scraping.md

references/server-management.md

scripts/start-relay.ts

scripts/start-server.ts

server.sh

src/client.ts

src/index.ts

src/relay.ts

src/snapshot/__tests__/snapshot.test.ts

src/snapshot/browser-script.ts

src/snapshot/index.ts

src/snapshot/inject.ts

src/types.ts

tsconfig.json

vitest.config.ts

Skill Content

Dev Browser

Browser automation that maintains page state across script executions. Write small, focused scripts to accomplish tasks incrementally.

Setup

Two modes available. Ask the user if unclear which to use.

Standalone Mode (Default)

Launches a new Chromium browser. Start the server and verify readiness:

cd skills/dev-browser && ./server.sh &

# Verify server is ready (poll HTTP endpoint — do NOT watch stdout)
for i in $(seq 1 30); do
  curl -s http://localhost:9222 > /dev/null 2>&1 && echo "Server ready" && break
  sleep 1
done

Add --headless to ./server.sh --headless & for headless mode.

If the server is already running from a previous session, the curl check succeeds immediately — no restart needed.

Extension Mode

Connects to the user's existing Chrome browser (preserving logged-in sessions):

cd skills/dev-browser && npm i && npm run start-extension &

Wait for Extension connected in the output. If the extension hasn't connected, tell the user to install it from: https://github.com/SawyerHood/dev-browser/releases

For detailed server lifecycle management, see Server Management.

Choosing Your Approach

Local/source-available sites: Read the source code first to write selectors directly
Unknown page layouts: Use getAISnapshot() to discover elements — see ARIA Snapshots
Visual feedback: Take screenshots to see what the user sees

Writing Scripts

Run all scripts from skills/dev-browser/ directory. The @/ import alias requires this.

Execute scripts inline using heredocs:

cd skills/dev-browser && npx tsx <<'EOF'
import { connect, waitForPageLoad } from "@/client.js";

const client = await connect();
const page = await client.page("example", { viewport: { width: 1920, height: 1080 } });

await page.goto("https://example.com");
await waitForPageLoad(page);

console.log({ title: await page.title(), url: page.url() });
await client.disconnect();
EOF

Write to tmp/ files only when the script needs reuse or is complex.

Key Principles

Small scripts: Each script does ONE thing (navigate, click, fill, check)
Evaluate state: Log/return state at the end to decide next steps
Descriptive page names: Use "checkout", "login", not "main"
Always disconnect: await client.disconnect() — pages persist on server
Plain JS in evaluate: page.evaluate() runs in the browser — no TypeScript syntax

// Correct: plain JavaScript in browser context
const text = await page.evaluate(() => {
  return document.body.innerText;
});

// Wrong: TypeScript syntax breaks at runtime
const text = await page.evaluate(() => {
  const el: HTMLElement = document.body; // Type annotation fails in browser!
  return el.innerText;
});

Workflow Loop

Follow this pattern for complex tasks:

Write a script to perform one action
Run it and observe the output
Evaluate — did it work? What's the current state?
Decide — is the task complete or do we need another script?
Repeat until done

Inspecting Page State

Screenshots

await page.screenshot({ path: "tmp/screenshot.png" });
await page.screenshot({ path: "tmp/full.png", fullPage: true });

ARIA Snapshots

Use getAISnapshot() to discover page elements without knowing selectors. Returns a YAML accessibility tree with [ref=eN] markers for interactive elements:

const snapshot = await client.getAISnapshot("mypage");
console.log(snapshot);

// Interact with a discovered element
const element = await client.selectSnapshotRef("mypage", "e5");
await element.click();

For the full guide on interpreting snapshots, refs, and the discover-then-interact pattern, see ARIA Snapshots.

Waiting

import { waitForPageLoad } from "@/client.js";

await waitForPageLoad(page);                   // After navigation
await page.waitForSelector(".results");        // For specific elements
await page.waitForURL("**/success");           // For specific URL

Error Recovery

Page state persists after failures. Debug with:

cd skills/dev-browser && npx tsx <<'EOF'
import { connect } from "@/client.js";

const client = await connect();
const page = await client.page("mypage");

await page.screenshot({ path: "tmp/debug.png" });
console.log({
  url: page.url(),
  title: await page.title(),
  bodyText: await page.textContent("body").then((t) => t?.slice(0, 200)),
});

await client.disconnect();
EOF

Client API Quick Reference

const client = await connect();

const page = await client.page("name");                    // Get or create page
const page = await client.page("name", { viewport: {} }); // With viewport
const pages = await client.list();                         // List all pages
await client.close("name");                                // Close a page
await client.disconnect();                                 // Disconnect (pages persist)

const snapshot = await client.getAISnapshot("name");       // Accessibility tree
const el = await client.selectSnapshotRef("name", "e5");   // Element by ref
const info = await client.getServerInfo();                 // Server mode info

The page object is a standard Playwright Page. For the full API reference with all parameters, return types, and options, see Client API.

Scraping

For large datasets (followers, posts, search results), intercept and replay network requests rather than scrolling the DOM. See Scraping Guide.

Reference Documentation

Server Management — Starting, stopping, verifying readiness, troubleshooting
Client API — Full API reference with all methods, parameters, and types
ARIA Snapshots — Element discovery, ref system, interaction patterns
Scraping Guide — Network interception, API replay, pagination

Similar Skills

skill-lookup

Searches, retrieves, and installs Agent Skills from prompts.chat registry using MCP tools like search_skills and get_skill. Activates for finding skills, browsing catalogs, or extending Claude.

156.4k

prompt-lookup

Searches prompts.chat for AI prompt templates by keyword or category, retrieves by ID with variable handling, and improves prompts via AI. Use for discovering or enhancing prompts.