Capture web content (articles, prices) from the user's localhost so requests originate from the user's own IP — used to reach geo-restricted or region-locked sites (Israeli news, Israeli e-commerce). Headless-first escalation ladder: Scrapling (static → stealth → Playwright) → bb-browser for auth. Saves normalized output to ~/local-web-capture/ and supports capture+translate (Hebrew→English default, arbitrary pairs).
npx claudepluginhub danielrosehill/claude-code-plugins --plugin Local-Web-Capture

Capture a list of URLs in one job. Saves each article individually with a timestamp, then produces a summary report across the batch (markdown always; PDF optional via Typst). Supports two report modes: human-summary (readable prose + per-item cards) and agent-summary (JSON optimised for LLM consumption). Use when the user supplies multiple URLs — pasted, in a file, or from clipboard — and wants both the individual captures and a synthesised overview.
Compile a batch of captured articles into a single PDF via Typst. Each article gets a header block showing the source URL and capture date, a page-numbered footer runs throughout, and a cover + table of contents open the document. Use when the user wants a printable/shareable compendium of a scrape batch, says "make a PDF of these captures", "compile the batch to PDF", or "print the batch report".
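A minimal sketch of how the batch PDF might be assembled: build a Typst source string with a page-numbered footer, a table of contents, and a per-article header block, then hand it to the `typst` CLI. The Typst markup here is illustrative, not the plugin's actual template, and the cover page is omitted for brevity.

```python
def batch_typst(articles: list[dict]) -> str:
    """Assemble Typst source for a batch report.

    Each dict is assumed to carry title, url, date, and body keys
    (an assumption inferred from the description, not the real schema).
    """
    parts = [
        # Page-numbered footer running throughout the document.
        "#set page(footer: context counter(page).display())",
        # Table of contents opening the document.
        "#outline(title: [Contents])",
    ]
    for a in articles:
        parts.append(
            "#pagebreak()\n"
            f"= {a['title']}\n"
            # Header block: source URL and capture date per article.
            f"_Source:_ {a['url']} · _Captured:_ {a['date']}\n\n"
            f"{a['body']}"
        )
    return "\n".join(parts)

# Compilation is delegated to the Typst CLI, e.g.:
#   typst compile report.typ report.pdf
```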
Browse, search, or summarise what's in the local-web-capture data directory (~/local-web-capture/). Use when the user asks "what did I capture recently", "find my captures of X", "list articles from ynet", or wants a table of recent captures.
Capture an article from a URL (via the normal localhost scrape path) and save a translated version alongside the original. Default language pair is Hebrew → English; supports arbitrary source → target. Saves to ~/local-web-capture/translations/. Use when the user shares a Hebrew article URL and wants the English version, says "capture and translate", or passes a URL in a language they want rendered into another.
Scrape an article from a URL via the user's localhost (preserves Israeli IP for geo-restricted sites) and save a clean markdown capture. Use when the user gives a URL and wants the article body saved, or says "capture this article", "scrape this", or references Israeli news sites (ynet, haaretz, n12, calcalist, timesofisrael, etc.).
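A sketch of how a capture might land on disk, assuming the scrape already produced a title, URL, and markdown body. The `articles/YYYY/MM/` path and frontmatter fields follow the data layout described later; the slug helper is a simplification, not the plugin's actual code.

```python
import re
from datetime import datetime, timezone
from pathlib import Path

def save_article(root: Path, url: str, title: str, body_md: str) -> Path:
    """Write a markdown capture with frontmatter under articles/YYYY/MM/."""
    now = datetime.now(timezone.utc)
    # Crude slug from the title; real implementations handle Hebrew etc.
    slug = re.sub(r"[^a-z0-9]+", "-", title.lower()).strip("-")[:60]
    out_dir = root / "articles" / f"{now:%Y}" / f"{now:%m}"
    out_dir.mkdir(parents=True, exist_ok=True)
    out = out_dir / f"{now:%Y-%m-%d}-{slug}.md"
    frontmatter = (
        "---\n"
        f"title: {title}\n"
        f"source: {url}\n"
        f"captured: {now.isoformat()}\n"
        "---\n\n"
    )
    out.write_text(frontmatter + body_md, encoding="utf-8")
    return out
```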
Tier-3 escalation — capture an article or page from a paywalled, logged-in, or aggressively-gated site by driving the user's real Chrome via bb-browser (login state + cookies + user's IP). Use ONLY when scrape-article reports that headless rungs failed because of auth/paywall (e.g. Haaretz, Calcalist+, subscriber-only Israeli content). Do not use for ordinary scrapes — the headless path is cheaper.
Capture product name + price (and optional unit price, stock, SKU) from an Israeli e-commerce URL via the user's localhost, saving a structured JSON snapshot under ~/local-web-capture/prices/<domain>/. Use when the user wants to track a price on Yad2, Shufersal, Rami Levy, KSP, Ivory, Zap, or similar, or asks to "grab the price" / "snapshot this listing".
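A hedged sketch of the `prices/<domain>/*.json` snapshot write. The field names (`name`, `price`, `currency`, plus optional extras like `sku`) are assumptions inferred from the description, not the plugin's exact schema.

```python
import json
from datetime import datetime, timezone
from pathlib import Path
from urllib.parse import urlparse

def save_price_snapshot(root: Path, url: str, name: str, price: float,
                        currency: str = "ILS", **extras) -> Path:
    """Write one structured price snapshot under prices/<domain>/."""
    domain = urlparse(url).netloc.removeprefix("www.")
    now = datetime.now(timezone.utc)
    out_dir = root / "prices" / domain
    out_dir.mkdir(parents=True, exist_ok=True)
    snapshot = {"url": url, "name": name, "price": price,
                "currency": currency, "captured": now.isoformat(), **extras}
    # Timestamped filename so repeat captures build a price history.
    out = out_dir / f"{now:%Y-%m-%dT%H%M%SZ}.json"
    out.write_text(json.dumps(snapshot, ensure_ascii=False, indent=2))
    return out
```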
General-purpose web page scraper via the user's localhost (preserves Israeli IP for geo-restricted sites). Handles any page that is NOT a clean article — SPAs, branch locators, product catalogs, search result pages, government portals, dashboards, API endpoints discovered via devtools, or anything requiring raw HTML, rendered DOM, network/XHR responses, or structured extraction. Supports scripted user interaction (click "load more", scroll, type into search, iterate a city dropdown) for sites that hide data behind UI actions — a naive "scrape this URL" will return nothing on those. Always tries to identify and call an unauthenticated backend JSON API first — UI automation is the fallback, not the default. Canonical user invocation provides a triple: (URL, interactive target to click/type/scroll, data to extract) — the skill loads the URL, activates the target in a loop until it stops producing new content, then extracts the requested data. Trigger phrases: "scrape this page", "dump the HTML", "find the API behind this page", "capture this SPA", "get all the X from this page", "extract the JSON", "scrape branch list / store list / catalog", "load URL, click this, then extract that", "I had to click to load everything", any non-article URL. For clean article extraction use `scrape-article` instead.
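The "activate the target until it stops producing new content" loop can be sketched as follows. Here `fetch_page` is a stand-in for whatever the skill discovers, whether a paginated backend JSON API call or a scripted "load more" click that returns newly rendered items; the real endpoint, parameters, and response shape vary per site.

```python
def collect_all(fetch_page, max_pages: int = 100) -> list:
    """Repeatedly activate the target, stopping when nothing new appears."""
    items, page, seen = [], 0, set()
    while page < max_pages:
        batch = fetch_page(page)
        # Deduplicate: some UIs re-serve earlier items on each activation.
        new = [x for x in batch if repr(x) not in seen]
        if not new:  # target stopped producing new content
            break
        seen.update(repr(x) for x in new)
        items.extend(new)
        page += 1
    return items
```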
First-run / re-check setup for Local-Web-Capture. Verifies the data directory exists, installs the Rung-1 headless extractor (webclaw), sets up the Rung-4 real-browser path (bb-browser), seeds sites.yaml, and flags Playwright MCP for removal once validated. Use when the user wants to initialise the plugin, when a capture skill reports a missing dependency, when onboarding a new machine, or to re-check the install.
Claude Code plugin for capturing web content from the user's own machine — so every request exits via the user's IP. Built for geo-restricted content (seeded with Israeli news + e-commerce) where hosted scrapers get blocked at the edge.
Cheapest method that works wins. Escalation ladder:
| Rung | Method | Tool | When |
|---|---|---|---|
| 1 | Headless, static HTTP | Scrapling Fetcher | Default. Server-rendered articles, simple prices. |
| 2 | Headless, stealth | Scrapling StealthyFetcher (Camoufox) | Bot-blocked / Cloudflare. |
| 2b | Headless, JS render | Scrapling PlayWrightFetcher | Needs JS, no bot block. |
| 3 | Real Chrome, logged in | bb-browser | Paywalled / session-bound. |
Backups (installed, not default): webclaw, lightpanda, camofox-browser, stealth-browser-mcp, Playwright MCP.
Per-domain choice is cached in sites.yaml so the right rung is picked on repeat captures.
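The cached-rung behaviour can be sketched as an escalation loop that starts at a domain's cached rung (or the cheapest) and records the first rung that succeeds. `try_rung` stands in for the real fetchers, and a plain dict stands in for sites.yaml; both are simplifications for illustration.

```python
LADDER = ["static", "stealth", "playwright", "real-browser"]

def capture(domain: str, try_rung, cache: dict):
    """Escalate through the ladder, caching the winning rung per domain."""
    start = LADDER.index(cache.get(domain, LADDER[0]))
    for rung in LADDER[start:]:
        result = try_rung(rung)
        if result is not None:
            cache[domain] = rung  # repeat captures skip the failed rungs
            return result
    raise RuntimeError(f"all rungs failed for {domain}")
```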
- setup-tools — one-time install of Scrapling, bb-browser, data dir seeding.
- scrape-article — Rungs 1–3, save markdown capture.
- scrape-authenticated — Rung 4 only, real Chrome via bb-browser.
- scrape-price — structured price snapshot for e-commerce URLs.
- capture-translate — scrape + save a translated version (default he→en; arbitrary pairs supported).
- batch-capture — list of URLs in, individual captures + human/agent summary reports out.
- batch-to-pdf — compile a batch into a single typeset PDF (Typst) with source URL + capture date per article and page-numbered footer.
- capture-list — browse/search/summarise the data directory.

Resolution order (see reference/save-location.md):
1. `--out <path>`: explicit override.
2. `<repo_root>/captures/`: keeps research artefacts with the project they inform.
3. `~/local-web-capture/`: the fallback.

Layout is the same in both modes:
```
<root>/
├── articles/YYYY/MM/*.md        # markdown + frontmatter
├── prices/<domain>/*.json       # structured price snapshots
├── translations/YYYY/MM/*.md    # stacked translations
├── batches/<batch-id>/          # batch-capture output
│   ├── articles/
│   ├── report.md                # human-summary
│   ├── report.agent.json        # agent-summary (strict schema)
│   └── report.pdf               # optional, via batch-to-pdf
├── sites.yaml                   # per-domain strategy cache
└── .cache/
```
See reference/data-layout.md for the full schema.
See toolkit-options.md for the shortlist of ~22 candidate browser/extractor projects, ranked by stars, with a verdict on each.
Once the plugin is validated end-to-end, remove Playwright from user-level MCP config. If it stays globally visible, Claude will pull toward it on every scrape request regardless of what the skill ladder says. Leaving it at project level (or installing only when needed) keeps the ladder honest.