Captures HTTP traffic from web apps using Playwright CLI with site fingerprinting for frameworks, protections, iframes, auth, APIs, plus tracing and HAR export. For API discovery and site analysis.
From cli-anything-web

npx claudepluginhub itamarzand88/cli-anything-web --plugin cli-anything-web

This skill uses the workspace's default tool permissions.
Assess the site, then capture comprehensive HTTP traffic. This skill combines site assessment with full traffic recording in a single browser session.
NEVER use `run_in_background: true` for ANY playwright-cli command. All playwright-cli commands must run in the foreground with appropriate timeouts. Background execution causes task ID tracking failures: the command completes before you can read the output. See `references/playwright-cli-commands.md` for the timeout table.
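For anyone scripting these commands from Python, the foreground-with-timeout rule could be sketched as below. The helper name `run_foreground` and the 30-second value are hypothetical; the actual per-command timeouts live in the timeout table in `references/playwright-cli-commands.md`.

```python
import subprocess
import sys


def run_foreground(cmd: list[str], timeout_s: float) -> str:
    """Run a command in the foreground and return its stdout.

    Blocks until the command exits. subprocess.run raises
    subprocess.TimeoutExpired if timeout_s elapses first.
    """
    result = subprocess.run(
        cmd,
        capture_output=True,  # collect stdout/stderr so the output can be read
        text=True,
        timeout=timeout_s,
        check=False,
    )
    if result.returncode != 0:
        raise RuntimeError(
            f"command failed ({result.returncode}): {result.stderr.strip()}"
        )
    return result.stdout


# Stand-in command for illustration; a real call would pass the
# playwright-cli argument list instead.
print(run_foreground([sys.executable, "-c", "print('ok')"], timeout_s=30))
```

Because the call blocks, the output is always available when the function returns, which is the property the rule above is protecting.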
NEVER use `eval` for complex expressions. `eval` fails silently on ternaries, comma operators, and multi-branch logic with "not well-serializable" errors. Use `run-code` instead. See `references/framework-detection.md` for details.
ESM context: no `require()`. `run-code` uses ESM. Use `await import('fs')` instead of `require('fs')`. See `references/playwright-cli-commands.md`.
Do NOT start unless:
- playwright-cli is available (`npx @playwright/cli@latest --version`)

Default capture method: playwright-cli tracing (standard workflow below).
Optional `--mitmproxy` mode: If the user passed the `--mitmproxy` flag to `/cli-anything-web`, use `mitmproxy-capture.py` instead. It provides no body truncation, real-time noise filtering, deduplication, and enhanced metadata (timestamps, cookies, body sizes). Requires `pip install mitmproxy` (Python 3.12+):
python ${CLAUDE_PLUGIN_ROOT}/scripts/mitmproxy-capture.py start-proxy --port 8080
npx @playwright/cli@latest open <url> --config=.playwright/cli.proxy.config.json --headed
# ... browse the site as normal (snapshot, click, fill, goto) ...
npx @playwright/cli@latest -s=<app> close
python ${CLAUDE_PLUGIN_ROOT}/scripts/mitmproxy-capture.py stop-proxy --port 8080 -o <app>/traffic-capture/raw-traffic.json
If playwright-cli fails, fall back to chrome-devtools-mcp (see HARNESS.md Tool Hierarchy).
If the target site has a documented public REST/JSON API (e.g., Hacker News Firebase API, Dev.to API, Reddit API, Wikipedia API), browser capture is optional:
- Probe the documented endpoints directly with `httpx` or `curl`.
- Save the responses to `<app>/traffic-capture/raw-traffic.json`.

This applies when:
`/api/` prefix)

If unsure whether a public API exists, proceed with browser capture as normal.
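As a rough sketch of the public-API shortcut, the snippet below fetches documented endpoints and writes them to the capture file. It uses the standard library's `urllib` in place of `httpx` or `curl` to stay dependency-free. The entry fields (`method`, `url`, `status`, `response_body`) and all function names are assumptions for illustration, not the schema that `parse-trace.py` produces; check an existing `raw-traffic.json` before relying on this shape.

```python
import json
from pathlib import Path
from urllib.request import Request, urlopen


def make_entry(method: str, url: str, status: int, body: str) -> dict:
    """Build one capture record (field names are assumed, not the real schema)."""
    return {"method": method, "url": url, "status": status, "response_body": body}


def fetch_entry(url: str, timeout_s: float = 30.0) -> dict:
    """GET a documented endpoint and wrap the response as a capture record."""
    request = Request(url, headers={"Accept": "application/json"})
    with urlopen(request, timeout=timeout_s) as response:
        body = response.read().decode("utf-8", errors="replace")
        return make_entry("GET", url, response.status, body)


def save_capture(entries: list[dict], out_path: str) -> None:
    """Write the records as JSON, creating the output directory if needed."""
    path = Path(out_path)
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(entries, indent=2), encoding="utf-8")
```

A typical use would call `fetch_entry` once per documented endpoint and pass the list to `save_capture` with the `<app>/traffic-capture/raw-traffic.json` path.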
Before starting, check if a previous capture session exists:
python ${CLAUDE_PLUGIN_ROOT}/scripts/capture-checkpoint.py restore <app>
If a checkpoint exists, read the guidance field and resume from the last
completed step instead of starting over. This prevents duplicate work when
sessions are interrupted.
# Create output directory
mkdir -p <app>/traffic-capture
# Clear any stale sessions
npx @playwright/cli@latest kill-all 2>/dev/null || true
npx @playwright/cli@latest -s=<app> open <url> --headed --persistent
# Note: heavy SPAs (Next.js, React) may show "TimeoutError: page._snapshotForAI" on open.
# This is non-fatal — verify with: npx @playwright/cli@latest list
#
# IMPORTANT — "Browser opened with pid..." in command output means the daemon
# RE-ATTACHED to the existing browser, NOT that a new session was created.
# Do NOT re-navigate or restart when you see this. The session is still open.
If `--mitmproxy` mode: Replace the `open` command above with:
python ${CLAUDE_PLUGIN_ROOT}/scripts/mitmproxy-capture.py start-proxy --port 8080
npx @playwright/cli@latest -s=<app> open <url> --config=.playwright/cli.proxy.config.json --headed
This starts the proxy first, then opens the browser routed through it. All subsequent `snapshot`, `click`, `fill`, `goto` commands work exactly the same.
Do NOT ask the user to log in yet — Step 2 will determine if auth is needed.
Run the all-in-one site fingerprint command instead of individual eval calls. This is faster, more reliable, and detects framework + protection + iframes + auth requirements in one shot.
Use the script file — multi-line JS with arrow functions and optional chaining fails in playwright-cli's single-line command parser. The script file approach has been tested and works reliably:
npx @playwright/cli@latest -s=<app> run-code "$(grep -v '^\s*//' ${CLAUDE_PLUGIN_ROOT}/scripts/site-fingerprint.js | tr '\n' ' ')"
IMPORTANT: The `site-fingerprint.js` script must be loaded via the command above. Do NOT copy-paste the JS inline; it will fail with a SyntaxError. The `grep -v` strips comments and `tr` joins lines for single-line execution.
### Interpret fingerprint results
**Framework:**
- `googleBatch: true` → Google batchexecute RPC protocol. Generate `rpc/` subpackage.
- `nextPages: true` → Next.js Pages Router. Extract `__NEXT_DATA__` + trace `/_next/data/` fetches.
- `nextApp: true` → Next.js App Router. Trace client navigations for RSC payloads.
- `nuxt: true` → Nuxt. Extract `__NUXT__` + trace API calls.
- No framework flags → likely SSR HTML or custom SPA. Check for REST API in probe.
**Protection:**
- `cloudflare: true` → Use `curl_cffi` with `impersonate='chrome'` in generated CLI.
- `awsWaf: true` → Need WAF token cookie via browser. Use curl_cffi for API calls.
- `captcha: true` → Add pause-and-prompt to auth flow.
- `serviceWorker: true` → Site has an active Service Worker that may intercept requests
and hide them from traces. Note in assessment.md. Generated CLI's auth.py should use
`service_workers="block"` in browser context. See `references/protection-detection.md`.
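The framework and protection rules above can be read as a lookup from flag to follow-up action. The sketch below encodes them that way. It assumes the fingerprint flags have been flattened into a single dict (the real `site-fingerprint.js` output may nest them), and `interpret_fingerprint` is a hypothetical helper name.

```python
FRAMEWORK_RULES = [
    ("googleBatch", "Google batchexecute RPC: generate rpc/ subpackage"),
    ("nextPages", "Next.js Pages Router: extract __NEXT_DATA__, trace /_next/data/"),
    ("nextApp", "Next.js App Router: trace client navigations for RSC payloads"),
    ("nuxt", "Nuxt: extract __NUXT__, trace API calls"),
]

PROTECTION_RULES = [
    ("cloudflare", "Cloudflare: use curl_cffi with impersonate='chrome'"),
    ("awsWaf", "AWS WAF: get WAF token cookie via browser, curl_cffi for API calls"),
    ("captcha", "CAPTCHA: add pause-and-prompt to auth flow"),
    ("serviceWorker", "Service Worker: note in assessment.md, block in auth.py"),
]

NO_FRAMEWORK = "No framework flags: likely SSR HTML or custom SPA, check probe for REST API"


def interpret_fingerprint(flags: dict) -> list[str]:
    """Turn truthy fingerprint flags into the follow-up notes listed above."""
    framework = [note for key, note in FRAMEWORK_RULES if flags.get(key)]
    protection = [note for key, note in PROTECTION_RULES if flags.get(key)]
    # With no framework flag set, fall back to the SSR/custom-SPA note.
    return (framework or [NO_FRAMEWORK]) + protection
```

Keeping the rules in plain lists makes it easy to add a flag without touching the function body.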
**Iframes:**
- `iframeCount > 0` → App is iframe-embedded. **Re-run detection inside the iframe:**
```bash
npx @playwright/cli@latest -s=<app> run-code "async page => {
  const frame = page.frames()[1];
  if (!frame) return { error: 'no iframe found' };
  return await frame.evaluate(() => ({
    framework: {
      nextPages: !!document.getElementById('__NEXT_DATA__'),
      googleBatch: typeof WIZ_global_data !== 'undefined',
      spaRoot: document.querySelector('#app, #root')?.id || null,
      vite: !!document.querySelector('script[type=\"module\"][src*=\"/@vite\"]') || !!document.querySelector('script[type=\"module\"][src*=\"/src/\"]')
    },
    title: document.title,
    bodyPreview: document.body?.textContent?.substring(0, 300) || ''
  }));
}"
```
Common iframe pattern: Google Labs apps (Stitch, MusicFX, ImageFX) embed a
Vite/React SPA in an iframe. Parent has WIZ_global_data, iframe has the real app.
See references/playwright-cli-advanced.md for iframe interaction patterns.
Note: snapshot and click <ref> auto-resolve iframes. Only use run-code
for iframe interaction when built-in commands fail.
Check the fingerprint auth fields:
| Condition | Meaning | Action |
|---|---|---|
| `hasLoginButton && !hasUserMenu` | Login required, not logged in | Ask user to log in NOW |
| `hasUserMenu` | Already logged in | Proceed to capture |
| `!hasLoginButton && !hasUserMenu` | No auth needed (public site) | Skip auth, proceed |
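The auth table reduces to a two-flag decision. A minimal sketch, with a hypothetical function name and return labels chosen only for illustration:

```python
def auth_action(has_login_button: bool, has_user_menu: bool) -> str:
    """Map the two fingerprint auth fields to the action from the table."""
    if has_user_menu:
        # A user menu means a session already exists, whatever else is shown.
        return "proceed_to_capture"
    if has_login_button:
        return "ask_user_to_log_in"
    return "skip_auth"
```

Checking `has_user_menu` first covers the case where both flags are true, which the table treats as already logged in.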
If auth is needed:
npx @playwright/cli@latest -s=<app> state-save <app>/traffic-capture/<app>-auth.json
If NO auth is needed: Skip directly to Step 2b.
Based on fingerprint results AND what you see in the UI, classify the site:
| Profile | Auth? | Operations | Exploration Focus |
|---|---|---|---|
| Auth + CRUD | Yes | Create, Read, Update, Delete | Full CRUD per resource |
| Auth + Generation | Yes | Generate, Poll, Download | Generation lifecycle + projects |
| Auth + Read-only | Yes | Read, Search, Export | Read operations + auth flow |
| No-auth + CRUD | No/Optional | Full CRUD | Skip auth, full CRUD |
| No-auth + Read-only | No | Read, Search | Minimal capture |
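One way to think about the classification is as a function of the auth result plus the operations visible in the UI. The sketch below is an assumption about how to order the checks, not a rule from this skill: generation takes priority over CRUD, and a site with no auth never maps to a generation profile because the table has no such row.

```python
WRITE_OPS = {"create", "update", "delete"}
GENERATION_OPS = {"generate", "poll"}


def classify_profile(auth_required: bool, operations: set[str]) -> str:
    """Pick a site profile from the table given auth and observed operations."""
    ops = {op.lower() for op in operations}
    prefix = "Auth" if auth_required else "No-auth"
    if auth_required and ops & GENERATION_OPS:
        return "Auth + Generation"
    if ops & WRITE_OPS:
        return f"{prefix} + CRUD"
    return f"{prefix} + Read-only"
```

The UI should still have the final say; treat the result as a starting point for the exploration focus column.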
Start a SHORT trace, click 3-4 internal links, stop. This reveals hidden API endpoints that SSR hides on initial page load.
npx @playwright/cli@latest -s=<app> tracing-start
npx @playwright/cli@latest -s=<app> click <internal-link-1>
npx @playwright/cli@latest -s=<app> click <internal-link-2>
npx @playwright/cli@latest -s=<app> click <internal-link-3>
npx @playwright/cli@latest -s=<app> tracing-stop
# Quick parse to see what endpoints appeared
python ${CLAUDE_PLUGIN_ROOT}/scripts/parse-trace.py .playwright-cli/traces/ --latest --output /tmp/probe.json
Check the probe results — what API patterns did you find?
See references/api-discovery.md for the priority chain and decision tree.
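To get a quick view of the probe results, something like the following could group requests by method and path. It assumes the probe file is a JSON list of objects with `method` and `url` keys, which is a guess about the `parse-trace.py` output; adjust the key names to whatever the file actually contains.

```python
import json
from collections import Counter
from urllib.parse import urlsplit

# Extensions treated as static assets rather than API calls (illustrative list).
STATIC_SUFFIXES = (".js", ".css", ".png", ".jpg", ".svg", ".woff2", ".ico")


def load_entries(path: str) -> list[dict]:
    """Load the probe output, assumed to be a JSON list of request records."""
    with open(path, encoding="utf-8") as fh:
        return json.load(fh)


def summarize_endpoints(entries: list[dict]) -> Counter:
    """Count requests per (method, host, path), ignoring query strings and assets."""
    counts: Counter = Counter()
    for entry in entries:
        parts = urlsplit(entry["url"])
        if parts.path.lower().endswith(STATIC_SUFFIXES):
            continue
        counts[(entry["method"].upper(), parts.netloc, parts.path)] += 1
    return counts
```

Calling `summarize_endpoints(load_entries("/tmp/probe.json")).most_common(20)` would list the busiest endpoints first.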
Create <app>/traffic-capture/assessment.md to consolidate all findings:
# Site Assessment: <app>
- **URL**: <url>
- **Framework**: <detected framework or "none/custom">
- **Protocol**: <REST / GraphQL / batchexecute / HTML scraping / hybrid>
- **Protection**: <none / cloudflare / captcha / aws-waf / etc.>
- **Auth required**: <yes (type: Google SSO / cookie / JWT / API key) / no>
- **Iframes**: <yes (N frames, app in frame N at <url>) / no>
- **Site profile**: <Auth+CRUD / Auth+Generation / Auth+Read-only / No-auth+CRUD / No-auth+Read-only>
- **Capture strategy**: <API-first / SSR+API hybrid / batchexecute / HTML scraping / protected-manual>
- **Key observations**: <any quirks, localized UI, rate limits, special patterns>
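If the findings are collected in a dict during Step 2, the template above can be filled in mechanically. This is only a sketch: `render_assessment` is a hypothetical helper, and the `unknown` placeholder for missing fields is my own choice.

```python
ASSESSMENT_FIELDS = [
    "URL",
    "Framework",
    "Protocol",
    "Protection",
    "Auth required",
    "Iframes",
    "Site profile",
    "Capture strategy",
    "Key observations",
]


def render_assessment(app: str, findings: dict[str, str]) -> str:
    """Render assessment.md content, one bullet per template field."""
    lines = [f"# Site Assessment: {app}"]
    for field in ASSESSMENT_FIELDS:
        # Missing findings are marked rather than silently dropped.
        lines.append(f"- **{field}**: {findings.get(field, 'unknown')}")
    return "\n".join(lines) + "\n"
```

Writing the returned string to `<app>/traffic-capture/assessment.md` keeps the file in the same field order as the template.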
Now do the comprehensive capture based on what Step 2 revealed.
# Optional: Start HAR recording alongside trace for standard-format capture
# HAR files enable mitmproxy2swagger (auto OpenAPI spec) and third-party analysis tools
npx @playwright/cli@latest -s=<app> run-code "async page => {
await page.context().routeFromHAR('<app>/traffic-capture/capture.har', {
update: true,
updateContent: 'embed',
updateMode: 'full'
});
return 'HAR recording started';
}"
# Start fresh trace for full capture (note the trace ID from output!)
npx @playwright/cli@latest -s=<app> tracing-start
# Output: "trace-<ID>" — record this ID
If `--mitmproxy` mode: Skip `tracing-start` and HAR recording above. mitmproxy has been capturing all traffic since Step 1, so just proceed to the exploration below. Every click, navigation, and form submission is automatically recorded by the proxy.
HAR recording is optional but recommended. It produces a standard HAR file alongside the trace. This enables `mitmproxy2swagger` to auto-generate an OpenAPI spec:
pip install mitmproxy2swagger && mitmproxy2swagger -i capture.har -o api-spec.yaml -p <base-url>
The HAR file is saved when the browser context is closed (Step 5).
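Since HAR is a standard JSON format (`log.entries[]`, each with `request` and `response` objects), the saved file can also be inspected directly. A small sketch, with hypothetical function names, that lists only the JSON responses:

```python
import json


def load_har(path: str) -> dict:
    """Read a HAR file from disk."""
    with open(path, encoding="utf-8") as fh:
        return json.load(fh)


def list_json_requests(har: dict) -> list[tuple[str, str, int]]:
    """Return (method, url, status) for entries whose response is JSON."""
    rows = []
    for entry in har.get("log", {}).get("entries", []):
        response = entry["response"]
        mime = response.get("content", {}).get("mimeType", "")
        if "json" not in mime.lower():
            continue
        request = entry["request"]
        rows.append((request["method"], request["url"], response["status"]))
    return rows
```

Filtering on the response MIME type is a rough heuristic; APIs that return other content types would need a different filter.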
Use the profile-specific checklist from Step 2b:
Auth + CRUD:
For EACH resource visible in the UI:
- [ ] List/browse: navigate to list view
- [ ] Detail: open one item
- [ ] Create: fill form, submit (capture POST body!)
- [ ] Update: edit an item, save
- [ ] Delete: delete a test item
- [ ] Settings/profile: check app settings
- [ ] Export: if available, trigger export/download
Auth + Generation:
- [ ] Dashboard/projects: navigate to project list
- [ ] Open existing project: view editor/canvas
- [ ] Generate new content: type prompt, click generate, WAIT for completion
- [ ] Edit/iterate: modify generation, re-generate
- [ ] Export/download: trigger download of generated content
- [ ] Delete: delete a test project
- [ ] Settings: check model selection, preferences
Auth + Read-only:
- [ ] Main view: navigate to primary content
- [ ] Search/filter: use search functionality
- [ ] Detail pages: open 2-3 different items
- [ ] Pagination: go to page 2 if available
- [ ] Export: if available
No-auth + CRUD:
Same as Auth + CRUD, but skip auth-related captures.
No-auth + Read-only:
- [ ] Homepage: capture initial data
- [ ] Search: try 2-3 different queries
- [ ] Detail pages: open 2-3 items
- [ ] Filters: apply different filters
- [ ] Pagination: check next page
- `snapshot` → note ref → `click <ref>`
- `snapshot` + `click <ref>` auto-resolves iframes

npx @playwright/cli@latest -s=<app> run-code "async page => {
await page.waitForTimeout(15000);
return 'waited';
}"
Exception for read-only sites: If the site is genuinely read-only, the trace
may contain only GET requests. Note "read-only site" in assessment.md and proceed.
npx @playwright/cli@latest -s=<app> tracing-stop
If `tracing-stop` fails: see `references/playwright-cli-tracing.md` for recovery.
python ${CLAUDE_PLUGIN_ROOT}/scripts/parse-trace.py \
.playwright-cli/traces/ --latest \
--output <app>/traffic-capture/raw-traffic.json
# parse-trace.py now auto-runs analyze-traffic.py and produces:
# - <app>/traffic-capture/raw-traffic.json (raw request/response data)
# - <app>/traffic-capture/traffic-analysis.json (auto-detected protocol, auth, endpoints)
#
# The analysis output shows: protocol type, auth pattern, endpoint groups,
# GraphQL operations, batchexecute RPC IDs, and suggested CLI commands.
# Review the analysis — anything marked "unknown" needs manual investigation.
# You can also run the analyzer separately for more detail:
python ${CLAUDE_PLUGIN_ROOT}/scripts/analyze-traffic.py \
<app>/traffic-capture/raw-traffic.json --summary
If `--mitmproxy` mode: Replace everything above with:
# Stop the proxy and save captured traffic (includes auto-analysis)
python ${CLAUDE_PLUGIN_ROOT}/scripts/mitmproxy-capture.py stop-proxy \
--port 8080 -o <app>/traffic-capture/raw-traffic.json
# The stop-proxy command writes raw-traffic.json directly.
# Then run the analyzer for the full report:
python ${CLAUDE_PLUGIN_ROOT}/scripts/analyze-traffic.py \
<app>/traffic-capture/raw-traffic.json --summary
No `tracing-stop` or `parse-trace.py` needed; mitmproxy already has the data. The analysis will include enhanced fields (`request_sequence`, `session_lifecycle`, `endpoint_sizes`) that are only available with mitmproxy capture.
npx @playwright/cli@latest -s=<app> close
# Mark capture complete
python ${CLAUDE_PLUGIN_ROOT}/scripts/capture-checkpoint.py update <app> --step complete
Don't grep JS bundles. Start a new trace → screenshot → click the button → fill → submit → stop → parse. The browser IS the API documentation.
Fallback: If playwright-cli is not available, see HARNESS.md Tool Hierarchy for chrome-devtools-mcp fallback instructions.
When capture is complete (raw-traffic.json has WRITE operations, or the site is
read-only with only GET requests), invoke methodology to analyze the traffic
and build the CLI.
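The completion condition can be checked with a few lines of Python. The sketch assumes each record in `raw-traffic.json` carries a `method` key, which should be confirmed against the real file, and it treats POST, PUT, PATCH, and DELETE as write operations, which is a simplification for protocols that tunnel reads over POST.

```python
WRITE_METHODS = {"POST", "PUT", "PATCH", "DELETE"}


def capture_is_complete(entries: list[dict], read_only_site: bool = False) -> bool:
    """True if the capture has write requests, or GETs on a read-only site."""
    methods = {str(entry.get("method", "")).upper() for entry in entries}
    if methods & WRITE_METHODS:
        return True
    # Read-only sites are complete once at least one GET was captured.
    return read_only_site and "GET" in methods
```

The `read_only_site` flag corresponds to the "read-only site" note recorded in assessment.md.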
See references/ for: command syntax (playwright-cli-commands.md), tracing (playwright-cli-tracing.md),
sessions (playwright-cli-sessions.md), advanced patterns (playwright-cli-advanced.md),
framework detection (framework-detection.md), protection (protection-detection.md),
API discovery (api-discovery.md).