Help us improve
Share bugs, ideas, or general feedback.
From playbooks-virtuoso
Captures web page screenshots with a tool cascade (browser MCP → shot-scraper → Playwright → user install). Handles full-page, element, viewport, retina, dark mode, and batched captures.
npx claudepluginhub krzysztofsurdy/code-virtuoso --plugin agents-virtuosoHow this skill is triggered — by the user, by Claude, or both
Slash command
/playbooks-virtuoso:web-screenshot <url-or-batch-yaml> [--full-page] [--selector CSS] [--output PATH] [--width N] [--height N] [--retina] [--dark]<url-or-batch-yaml> [--full-page] [--selector CSS] [--output PATH] [--width N] [--height N] [--retina] [--dark]The summary Claude sees in its skill listing — used to decide when to auto-load this skill
Take reliable screenshots of web pages. This skill teaches a **tool-selection cascade** - pick the lightest tool that does the job, never install dependencies the environment did not ask for.
Generates marketing-quality screenshots of your app using Playwright at HiDPI resolution. Use for Product Hunt, social media, landing pages, or documentation.
Captures clean screenshots of web pages via Playwright MCP for blog posts and documentation. Handles viewport resizing, full-page vs element shots, cookie banners, anchor scrolling, and PNG/JPG output.
CLI for browser automation: navigate sites, snapshot elements for refs, fill forms, click buttons, screenshot, scrape data, test web apps. Chains commands, imports auth state.
Share bugs, ideas, or general feedback.
Take reliable screenshots of web pages. This skill teaches a tool-selection cascade - pick the lightest tool that does the job, never install dependencies the environment did not ask for.
Two cascades govern the skill: one for picking the capture tool, one for escalating around a bot wall. They are different decisions - do not mix them up.
Before any capture, check what is already available in the environment and pick the lightest viable tool:
| # | Tool | Detect with | Best for | Limits |
|---|---|---|---|---|
| 1 | Browser MCP (e.g. mcp__*__browser_take_screenshot) | Already loaded in this session - check the MCP tool list | Single ad-hoc shot, no install, full-page and element shots, navigation flows | Runs in docker with datacenter IP - bot walls block this; output path is inside the MCP container |
| 2 | shot-scraper (CLI) | which shot-scraper | Repeated captures, batch YAML, scripted automation, CI jobs | Headless by default - some bot walls fire only on headless; requires pip install shot-scraper && shot-scraper install once |
| 3 | Playwright (Python or Node) | python -c "import playwright" 2>/dev/null | Multi-step flows, headed mode with stealth, anti-bot escalation (see Cascade B Tier 1) | Library-level; more code per shot |
| 4 | Ask user before installing | None of the above present | Long-term tooling decisions | Never install silently |
When Cascade A's pick gets blocked (status 403/429/503 or a CAPTCHA/block page), do not retry the same tool with new flags - escalate up the tiers in order. Every tier aims at LIVE content. Archived snapshots are not acceptable as a fallback - they give stale content with a third-party chrome on top.
| # | Tier | What it does | When to pick it |
|---|---|---|---|
| 1 | Playwright headed + stealth | Local headed Chromium, stealth-patched, user's home IP | First escalation; user can install Playwright |
| 2 | CDP attach to user's Chrome | Use the user's actual running Chrome via --remote-debugging-port | User can relaunch Chrome with a debug flag; reuses their real session |
| 3 | System-native browser + OS screen capture | Open URL in the user's default browser, capture via the OS's built-in screenshot command | Last resort - no extra install, but brittle, steals focus, depends on OS permissions |
Critical rule: never pip install or brew install without asking first. Same rule for installing browser binaries (playwright install chromium, shot-scraper install). If a tool is missing, surface what is missing and which install fixes it; let the user decide.
shot-scraper is the preferred install when long-term capture tooling is needed - it is a Playwright-backed CLI that wraps the common cases with sensible defaults. Playwright itself is the preferred install when bot-wall escalation is on the table.
| Principle | Meaning |
|---|---|
| Default to shot-scraper | One install, one command per shot. Falls through to Playwright only when shot-scraper hits a wall. |
| Wait on something real | Wait for an element, a function, or a fixed delay - never networkidle. Playwright's own docs discourage it. |
| Element beats full page | A targeted selector shot is smaller, stabler, and faster to diff than a stitched full-page capture. Use full-page only when layout is the subject. |
| Disable animations | CSS animations are the #1 source of flake. Always pass options that freeze them for any shot that will be compared or reused. |
| Match the device | Mobile vs desktop produces wildly different layouts. Set viewport width/height explicitly when the user cares about a specific form factor. |
| Output is a file path | Every shot lands at a known absolute path the caller can hand to the next step (framing, OCR, diffing). Never leave output to chance. |
Use the MCP browser tools directly. No pip, no Chromium download. Names vary by MCP host - look for tools matching *browser_navigate and *browser_take_screenshot.
1. Call browser_navigate with url=https://example.com
2. (Optional) browser_resize for a specific viewport
3. (Optional) browser_wait_for to gate on a visible text
4. Call browser_take_screenshot with type=png, fullPage=true|false
5. ALWAYS: copy the resulting file out of the MCP container to the host.
The MCP host returns the image inline AND saves it to a path like ../tmp/playwright-output/page-<timestamp>.png - but that path is inside the docker container, not the host filesystem. Without the copy step, the user has no file to open, attach, edit, or version-control. The inline image alone is not the deliverable.
Use the bundled helper:
./scripts/copy_mcp_capture_to_host.sh \
/tmp/playwright-output/page-<timestamp>.png \
~/Desktop/screenshots/<descriptive-name>.png
The script auto-detects the running MCP browser container by the docker-mcp-name=playwright label, normalises the ../tmp/... paths the MCP returns, and runs docker cp to land the file on the host. It prints the final absolute host path on success - that path is the deliverable the user opens.
shot-scraper https://example.com -o example.png
Defaults: viewport-only (1280x780), waits for load, saves PNG. That's it.
Ask first, install second:
# After explicit user OK:
pip install shot-scraper
shot-scraper install # downloads Chromium (~150 MB)
shot-scraper --version # verify
Skip the shot-scraper install step and every capture fails with a browser-missing error.
Bot-detection vendors (Cloudflare, Akamai, PerimeterX, DataDome, and many in-house systems) serve a block page instead of the real content when they detect automation. The block usually combines three signals:
HeadlessChrome or Playwrightnavigator.webdriver === true (most modern MCP browsers patch this to false)After loading any URL, check for these markers before saving the screenshot as "the real page":
| Signal | Indicates block |
|---|---|
| HTTP status 403, 429, 503 | Server-side refusal |
<title> contains Access Denied, Blocked, Forbidden, Just a moment, Attention Required, Robot Check | Bot wall |
Page body contains you have been blocked, cloudflare, please verify you are human, unusual traffic | Bot wall |
Page renders an iframe with challenges.cloudflare.com or geo.captcha-delivery.com | Active challenge |
| Body word count under ~50 on what should be a content-rich page | Likely block page |
If any of these are true, treat the page as blocked and walk the fallback ladder before declaring success.
Every tier aims at the live content. If one tier is not viable, fall to the next - never settle for an archive.
The most reproducible "real user" path. Works on macOS, Linux, and Windows. Needs a one-time install with user approval.
# One-time install (ask user first):
pipx install playwright # or: pip install --user playwright
playwright install chromium # download the browser binary
# Stealth plugin to mask remaining automation signals:
pipx inject playwright tf-playwright-stealth
Then use the bundled scripts/capture_with_stealth.py helper - it launches Chromium headed (not headless), applies the stealth patches that hide navigator.webdriver, browser plugins, language list, and chrome.runtime, and saves the screenshot. The browser uses the user's home IP, defeating the datacenter-IP check that drives most refusals.
Why this is Tier 1: portable, scripted, no user interaction during capture, uses the user's normal network.
When the user has Chrome already open with their real profile (cookies, extensions, logged-in sessions), a CDP attach reuses that exact browser. The bot wall sees the user's actual Chrome - same fingerprint, same cookies, same IP - no second browser process and no second profile.
Setup (one-time, user relaunches Chrome with the flag):
macOS: /Applications/Google\ Chrome.app/Contents/MacOS/Google\ Chrome --remote-debugging-port=9222
Linux: google-chrome --remote-debugging-port=9222
Windows: "C:\Program Files\Google\Chrome\Application\chrome.exe" --remote-debugging-port=9222
Pick one of two attach paths:
Path 2a - dedicated CLI (preferred when npm is available):
The agent-browser package wraps the CDP attach pattern with auto-discovery of any Chrome running with a debug port, named element refs from the accessibility tree, and a wide command surface (snapshot, click, fill, screenshot, eval, get-box). Install once:
npm install -g agent-browser
Then any agent command works against the user's live Chrome:
agent-browser --auto-connect open https://example.com
agent-browser --auto-connect screenshot out.png
agent-browser --auto-connect screenshot --selector "article" cropped.png
--auto-connect finds the running Chrome without hard-coding the port. Use this when one tool needs to navigate, locate elements, and capture as a single workflow.
Path 2b - bundled Python script (fallback when npm is not available):
python scripts/connect_to_running_chrome.py https://example.com out.png
Same CDP attach, fewer commands. Useful when npm cannot be installed in the environment.
Why Tier 2 not Tier 1: it needs the user to relaunch Chrome with a debug flag, and a Chrome running with --remote-debugging-port open is a mild local security risk if untrusted processes share the machine.
When neither Playwright nor a CDP-debug Chrome is on the table, fall back to the OS's own tools. This tier needs nothing the system does not ship with - but it is brittle, steals focus, and the shot will include whatever the user's screen happens to show (browser chrome, bookmarks bar, possibly other windows).
The bundled scripts/capture_via_system_browser.sh dispatches by platform:
| Platform | Open URL | Capture |
|---|---|---|
| macOS | osascript -> Google Chrome new tab; reads window bounds | screencapture -R<x,y,w,h> out.png |
| Linux (X11) | xdg-open <URL> or google-chrome <URL> | import -window root out.png (ImageMagick) / scrot / gnome-screenshot -f out.png |
| Linux (Wayland) | xdg-open <URL> | grim out.png |
| Windows | PowerShell Start-Process chrome <URL> | PowerShell Add-Type System.Drawing screen-capture |
Permissions to know about:
screencapture until the calling terminal has Screen Recording permission. The script reports the missing permission instead of failing silently.grim, ImageMagick, scrot, or gnome-screenshot); the script picks the first one it finds.Start-Process.Why Tier 3 is the last resort: it depends on the user's current screen state, requires OS-level permissions that may need a one-time grant, and the resulting capture may include UI chrome the user has to crop manually. Use it only when Tiers 1 and 2 cannot run.
Out-of-scope - do not auto-apply, surface to the user instead:
| Strategy | Why we do not auto-apply |
|---|---|
| Residential proxy | Costs money and routes traffic through someone else's IP - needs explicit approval and a legitimate use case |
| CAPTCHA-solving service | Defeats a security control the site put there on purpose - report back, do not work around silently |
| Stealth + scraping protected content | Just because the wall can be bypassed does not mean the site permits it - check terms before pushing past explicit refusals |
The skill captures the block page faithfully when blocked - that is a real result, not a tool failure. After any escalation, always tell the user:
| Mode | Command | When to use |
|---|---|---|
| Viewport | shot-scraper URL -o out.png | Hero shots, above-the-fold, social cards |
| Full page | shot-scraper URL -o out.png --width 1280 --height 800 then add --full-page style via JS, or use shotscraper URL -o out.png -h 0 to capture from y=0 to bottom | Long landing pages, articles, full-doc captures |
| Element | shot-scraper URL -o out.png --selector "main .hero" | Specific component, isolated UI, design-system docs |
| Multiple selectors | --selector "header" --selector "footer" (one per shot, via batch) | Multi-region capture in one run |
| JPEG (smaller) | -o out.jpg --quality 80 | Documentation, social previews, anywhere PNG is overkill |
| Retina (2x) | --retina | High-DPI displays, marketing assets |
Full-page caveat: pages with position: sticky headers may render duplicate bands in WebKit/Firefox full-page mode. Test before committing to a full-page shot - prefer viewport + scroll-and-stitch via custom JS if sticky headers misbehave.
Playwright (and shot-scraper) load pages until the load event by default. That is enough for static pages. For dynamic content, layer on additional waits.
| Strategy | Flag | When |
|---|---|---|
| Wait for selector | --wait-for "document.querySelector('.loaded')" | Wait until a DOM marker appears |
| Wait fixed time | --wait 2000 (ms) | Last resort for "just give it 2 seconds" cases |
| Custom JS readiness | --wait-for "window.__myAppReady === true" | App sets a global when ready |
| Skip JavaScript | --javascript false | Static HTML only, fastest, no JS runs |
Do not use networkidle. Playwright's own team discourages it - pages with analytics pings or long-poll connections never settle. Wait for something deterministic instead.
See references/wait-strategies.md for deeper patterns: SPA route changes, image-decode waits, font-load waits.
For any shot you want to compare across runs or reuse in docs:
shot-scraper https://example.com -o out.png \
--reduced-motion \
--wait-for "document.fonts.ready"
| Option | What it does |
|---|---|
--reduced-motion | Sets prefers-reduced-motion: reduce, freezes most CSS animations |
document.fonts.ready | Wait for web fonts to finish loading - prevents text-shift artifacts |
--javascript-disable | Mask cookie banners or chat widgets with custom JS that hides them before the shot |
Hide a cookie banner before capturing:
shot-scraper https://example.com -o out.png \
--javascript "document.querySelector('#cookie-banner')?.remove();"
shot-scraper https://example.com -o mobile.png \
--width 393 --height 852 \
--user-agent "Mozilla/5.0 (iPhone; CPU iPhone OS 17_0 like Mac OS X)"
shot-scraper https://example.com -o dark.png \
--javascript "document.documentElement.style.colorScheme = 'dark';"
For sites that respect prefers-color-scheme, the cleanest path is shot-scraper's --media option or a CDP override. See references/options.md.
shot-scraper https://example.com -o desktop.png --width 1280 --height 800
shot-scraper https://example.com -o mobile.png --width 393 --height 852
Create shots.yaml:
- url: https://example.com
output: example.png
width: 1280
height: 800
- url: https://news.ycombinator.com
output: hn.png
selector: ".athing:first-child"
Run:
shot-scraper multi shots.yaml
Best for nightly captures, doc-site galleries, or capturing many components at once.
shot-scraper https://example.com -o feature.png \
--selector "#features" \
--javascript "document.querySelector('#features').scrollIntoView({block:'start'});"
shot-scraper covers ~90% of cases. Reach for raw Playwright when the flow needs:
| Need | Why shot-scraper falls short |
|---|---|
| Multi-step navigation (login -> dashboard -> screenshot) | shot-scraper is single-page-shot oriented |
| Form interaction before shot | Cannot fill complex multi-step forms |
| Multiple shots from one session | Each shot-scraper call boots a fresh browser |
| Visual regression suite | Use @playwright/test with built-in toHaveScreenshot() matcher |
| Auth via OAuth flows | Multi-redirect auth is awkward in single-command CLI |
Minimal Playwright Python fallback:
from playwright.sync_api import sync_playwright
with sync_playwright() as p:
browser = p.chromium.launch()
page = browser.new_page(viewport={"width": 1280, "height": 800})
page.goto("https://example.com")
page.fill("#email", "user@example.com")
page.click("button[type=submit]")
page.wait_for_selector(".dashboard")
page.screenshot(path="dashboard.png", animations="disabled")
browser.close()
Install: pip install playwright && playwright install chromium.
Five bundled helpers, ordered by what they replace in the cascade.
| Script | Role in cascade | Install needed | Platforms |
|---|---|---|---|
scripts/copy_mcp_capture_to_host.sh | Post-step for Path A - copy MCP container files to the host. Required on every MCP capture. | docker CLI | All (where Docker runs) |
scripts/capture_with_stealth.py | Tier 1 anti-bot fallback - Playwright headed + stealth | Playwright | macOS, Linux, Windows |
scripts/connect_to_running_chrome.py | Tier 2 anti-bot fallback - attach to user's running Chrome via CDP | Playwright (no browser download) | macOS, Linux, Windows |
scripts/capture_via_system_browser.sh | Tier 3 anti-bot fallback - system-native browser + OS screen capture | None (OS built-ins) | macOS, Linux, Windows |
scripts/multi_viewport.py | Multi-device convenience wrapper | shot-scraper | All |
Was the headless capture blocked or refused?
|
No -> done. Save the shot. Copy to host if it lives inside a container.
|
Yes -> escalate to Tier 1 (capture_with_stealth.py).
Playwright installable (with user OK)?
Yes -> install, run Tier 1.
No -> escalate to Tier 2 (connect_to_running_chrome.py or agent-browser).
User can relaunch Chrome with --remote-debugging-port?
Yes -> Tier 2.
No -> escalate to Tier 3 (capture_via_system_browser.sh).
OS permissions in place (Screen Recording on macOS,
a capture tool installed on Linux, etc.)?
Yes -> Tier 3.
No -> stop and report. Do not silently work around explicit refusals.
Capture a live page that requires a real-user fingerprint (Tier 1):
python scripts/capture_with_stealth.py https://example.com out.png \
--width 1280 --height 800 --wait-for "document.fonts.ready"
Capture using the user's existing Chrome session (Tier 2):
# Once, in another terminal:
google-chrome --remote-debugging-port=9222
# Then:
python scripts/connect_to_running_chrome.py https://example.com out.png
Last-resort system-native capture (Tier 3, no extra install):
./scripts/capture_via_system_browser.sh https://example.com out.png --wait 8
Multi-device capture (no anti-bot needs):
python scripts/multi_viewport.py https://example.com --out screenshots/
WebFetch or curl confirms the URL returns real content (status 200, expected title, body word count). Opening a browser only to land on a 404 or a wrong redirect wastes a session and steals user focus.shot-scraper install has been run on this machine (if using shot-scraper)networkidle--reduced-motion)--javascript--quality for big captures)references/platform-selectors.md| Reference | Contents |
|---|---|
| options | Complete shot-scraper option reference: viewport, device emulation, headers, auth, media queries, JS injection, output formats |
| wait-strategies | Deep dive on waiting: SPA route changes, font readiness, image decode, custom readiness signals, debugging slow-loading pages |
| platform-selectors | CSS selectors by content type and platform (articles, social posts, comments, code, video, products); verification routine; shadow-DOM tactics |
| Situation | Recommended Skill |
|---|---|
| Apply frames, annotations, or compositing to the captured image | (future) image-framing tool skill |
| Diff screenshots across runs for visual regression | testing knowledge skill (component vs page strategy) |
| Capture as part of an end-to-end test suite | Use @playwright/test directly - this skill is for ad-hoc captures |
| Accessibility audit of the captured page | accessibility knowledge skill |
scripts/copy_mcp_capture_to_host.sh (or an equivalent docker cp) so the user has a real file they can open, attach, edit, or commit. Local tools (shot-scraper, Playwright) already write to the host; the rule applies to any tool that writes inside a container or remote sandbox.pip install, brew install, playwright install chromium, and shot-scraper install. Each one needs explicit user approval. Surface what is missing; let the user pick.shot-scraper install runs once per machine before any shot-scraper capture - the browser is not bundled with the pip package. Same for playwright install chromium.networkidle - waits never terminate on pages with persistent connections. Wait on a selector, function, or fixed delay.