web-ctl
Browser automation for AI agents - navigate, authenticate, and interact with web pages.
Overview
web-ctl gives agents persistent, session-based browser control through a single CLI. Agents navigate pages, fill forms, click buttons, and read content, all headlessly. When a login or CAPTCHA needs a human, web-ctl opens a headed browser for the user to complete it, then returns to headless mode.
Architecture
/web-ctl
│
├─→ /web-ctl:web-browse → Headless actions (goto, click, type, read, snapshot)
└─→ /web-ctl:web-auth → Human-in-the-loop auth (headed browser, polls for success)
Agent
  └─ Skill("web-ctl", "run github click 'role=button[name=Post]'")
       └─ SKILL.md executes: node scripts/web-ctl.js run github click "..."
            └─ web-ctl.js:
                 1. Loads session from state dir
                 2. Opens Playwright launchPersistentContext(userDataDir)
                 3. Executes action
                 4. Closes context (cookies flush to disk)
                 5. Returns JSON result to agent
Each invocation is a single Node.js process. No daemon, no MCP server, no IPC. Session state persists via Chrome's userDataDir with AES-256-GCM encrypted storage.
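Because each invocation is a short-lived process that returns a JSON result, a caller can wrap it in a few lines of Node.js. The sketch below is only an illustration and is not part of web-ctl: `buildRunArgs` and `runAction` are hypothetical helper names, and it assumes the script is reachable at `scripts/web-ctl.js` relative to the working directory and prints a single JSON document on stdout.

```javascript
const { execFileSync } = require("node:child_process");

// Hypothetical helper: builds the argv for
// `node scripts/web-ctl.js run <session> <action> [args...]`.
function buildRunArgs(session, action, ...actionArgs) {
  return ["scripts/web-ctl.js", "run", session, action, ...actionArgs];
}

// Hypothetical helper: spawns one process per action and parses its stdout.
// Assumes stdout holds exactly one JSON document (an assumption, not confirmed
// by the README). execFileSync throws if the process exits with a non-zero code.
function runAction(session, action, ...actionArgs) {
  const stdout = execFileSync("node", buildRunArgs(session, action, ...actionArgs), {
    encoding: "utf8",
  });
  return JSON.parse(stdout);
}
```

A call such as `runAction("github", "click", "role=button[name=Post]")` would then return the parsed result object. `execFileSync` passes arguments without a shell, so selectors containing quotes or brackets need no extra escaping.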
Installation
# Claude Code
agentsys install web-ctl
# Dependencies are NOT auto-installed by default. Either:
# (a) run: cd $(npm root)/@agentsys/web-ctl && npm install && npx playwright install chromium
# (b) set: export WEB_CTL_AUTO_INSTALL=1 (opt in to auto-install on first use)
Commands
/web-ctl
Describe what you want to do; the web-session agent orchestrates multi-step browsing.
/web-ctl # Agent-driven browsing session
/web-ctl goto <url> # Navigate directly
/web-ctl auth <name> # Authenticate to a site
/web-ctl:web-auth
Human-in-the-loop authentication. Opens a headed browser for the user to complete login (including 2FA), then captures and encrypts the session.
/web-ctl:web-auth github --url "https://github.com/login"
/web-ctl:web-auth twitter --url "https://x.com/i/flow/login" --success-url "https://x.com/home"
/web-ctl:web-browse
Headless browser actions for navigation and interaction.
/web-ctl:web-browse github goto "https://github.com"
/web-ctl:web-browse github click "role=link[name='Settings']"
/web-ctl:web-browse github click-wait "role=button[name='Save']"
/web-ctl:web-browse github snapshot
Session Lifecycle
# 1. Create session
web-ctl session start github
# 2. Authenticate (opens headed browser, user logs in)
web-ctl session auth github --url "https://github.com/login" --success-url "https://github.com"
# 3. Browse headlessly (session cookies persist across invocations)
web-ctl run github goto "https://github.com/settings"
web-ctl run github snapshot
web-ctl run github click "role=link[name='Profile']"
# 4. End session
web-ctl session end github
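The start, run, end ordering above can be sketched as a small wrapper that always ends the session, even when an action fails. This is an illustrative pattern, not something web-ctl ships: `withSession` is a hypothetical name, and `exec` stands for any caller-supplied function that runs web-ctl with the given argument list. The interactive `session auth` step is left out because it needs a human at the browser. Whether a session should be ended after a failure is a design choice; drop the `finally` if you prefer to keep it for inspection.

```javascript
// Hypothetical helper: runs `work` between `session start` and `session end`.
// `exec(argv)` is supplied by the caller and is assumed to invoke web-ctl
// with the given arguments and return its result.
function withSession(exec, name, work) {
  exec(["session", "start", name]);
  try {
    // `run` forwards to `web-ctl run <name> <action> [args...]`.
    const run = (action, ...args) => exec(["run", name, action, ...args]);
    return work(run);
  } finally {
    // Always end the session, even if an action throws.
    exec(["session", "end", name]);
  }
}
```

For example, `withSession(exec, "github", (run) => run("snapshot"))` issues `session start github`, `run github snapshot`, and `session end github`, in that order.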
Action Reference
| Action | Usage | Returns |
|---|---|---|
| goto | run <s> goto <url> [--no-auth-wall-detect] [--no-content-block-detect] [--no-auto-recover] [--ensure-auth] [--wait-loaded] | { url, status, authWallDetected, checkpointCompleted, ensureAuthCompleted, waitLoaded, contentBlocked, headedFallback, warning, snapshot } |
| snapshot | run <s> snapshot | { url, snapshot } |
| click | run <s> click <sel> [--wait-stable] | { url, clicked, snapshot } |
| click-wait | run <s> click-wait <sel> [--timeout] | { url, clicked, settled, snapshot } |
| type | run <s> type <sel> <text> | { url, typed, selector, snapshot } |
| read | run <s> read <sel> | { url, selector, content } |
| fill | run <s> fill <sel> <value> | { url, filled, snapshot } |
| wait | run <s> wait <sel> [--timeout] | { url, found, snapshot } |
| evaluate | run <s> evaluate <js> | { url, result } |
| screenshot | run <s> screenshot [--path] | { url, path } |
| network | run <s> network [--filter] | { url, requests } |
| checkpoint | run <s> checkpoint [--timeout] | { url, message } |
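A caller will usually branch on the fields of a `goto` result. The sketch below shows one possible way to do that. The field names `authWallDetected` and `contentBlocked` come from the Returns column above, but treating them as truthy flags is an assumption, and `nextStepAfterGoto` and its return labels are hypothetical names used only for illustration.

```javascript
// Hypothetical helper: picks a follow-up step from a parsed `goto` result.
// Assumes `authWallDetected` and `contentBlocked` are truthy when the
// condition applies; the README does not specify their exact types.
function nextStepAfterGoto(result) {
  if (result.authWallDetected) return "auth"; // e.g. hand off to the auth flow
  if (result.contentBlocked) return "blocked"; // e.g. report the block to the agent
  return "continue"; // the page loaded; keep browsing
}
```

With this helper, a result such as `{ url: "https://github.com/settings", authWallDetected: true }` maps to `"auth"`, which a caller could use to trigger `/web-ctl:web-auth` before retrying.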
Macros