From second-claude-code
Unblocks 4xx/WAF/captcha/JS-SPA web fetches via escalating free chain: public APIs, Jina Reader, curl/TLS impersonation, Playwright headless, archives until valid body. Zero keys.
npx claudepluginhub unclejobs-ai/second-claude-code --plugin second-claude-code
Slash command: /second-claude-code:unblock
No API keys. No signup. No config. Every phase runs on public endpoints, rate-limited free tiers, or self-hosted binaries that auto-install. Paid providers exist only behind explicit --allow-paid.
First HTTP 200 is not success. Validate the body first.
A 200 with a Cloudflare challenge HTML, a login wall, or an empty SPA shell
is failure. The chain stops only when content passes engine/validate.mjs.
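To make the "200 is not success" rule concrete, here is a minimal sketch of a body check covering the four layers this document names for validate.mjs (status, body length, challenge body, content-type). It is not the actual engine/validate.mjs: the function name `validateBody`, the marker list, and the 200-character threshold are assumptions chosen for illustration.

```javascript
// Hypothetical sketch of a 4-layer check; names, markers, and thresholds are assumptions.
const CHALLENGE_MARKERS = [
  "just a moment",     // title text often seen on Cloudflare interstitials
  "cf-chl",            // prefix used by Cloudflare challenge assets
  "enable javascript", // typical empty-SPA or challenge notice
];

function validateBody({ status, contentType = "", body = "" }, minLength = 200) {
  // Layer 1: the HTTP status must be a 2xx.
  if (status < 200 || status >= 300) return { ok: false, reason: "status" };
  // Layer 2: an empty shell or near-empty body is a failure.
  if (body.trim().length < minLength) return { ok: false, reason: "too_short" };
  // Layer 3: a challenge page served with a 200 is still a failure.
  const lower = body.toLowerCase();
  if (CHALLENGE_MARKERS.some((marker) => lower.includes(marker))) {
    return { ok: false, reason: "challenge" };
  }
  // Layer 4: only textual content types count as a usable body.
  if (!/^(text\/|application\/(json|xml|xhtml\+xml))/i.test(contentType)) {
    return { ok: false, reason: "content_type" };
  }
  return { ok: true, reason: null };
}
```

A real validator would likely read its thresholds from configuration rather than hardcode them as this sketch does.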
| Phase | Probe | Cost | Auto without keys? |
|---|---|---|---|
| 0a | Public APIs (Reddit, HN, arXiv, Bluesky, GitHub, NPM, oEmbed) | free | yes |
| 0b | Jina Reader (r.jina.ai) | free at 20 RPM | yes |
| 0c | yt-dlp metadata + subtitles (1800+ media sites) | free | yes (auto-install) |
| 0d | Native body cleaners (Naver, Tistory, Brunch) | free | yes |
| 1 | curl with rotating UA × headers × URL transforms | free | yes |
| 2 | curl-impersonate (TLS spoof) + cookie warming + referrer chain | free | yes (auto-install) |
| 3 | LightPanda headless | free | yes (auto-install) |
| 4 | Playwright real Chrome | free | yes (auto-install) |
| 5 | Free archives: Wayback + archive.today + AMP + RSS + OG rescue | free | yes |
| 6 | Optional paid (Tavily / Exa / Firecrawl) | paid | needs --allow-paid |
The chain stops at the first probe whose body passes validate.mjs. Phase 6
never runs implicitly even with env keys present. Phase 0d and URL
normalization detail: references/native-cleaners.md.
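The escalation described above can be sketched as a loop that tries probes in order and returns at the first validated body. This is an illustration only, not engine/chain.mjs: `runChain`, the probe object shape (`name`, `phase`, `paid`, `run`), and the numeric `phase` field are hypothetical, and the sketch assumes paid probes are gated only by the allow-paid switch while free probes are gated only by the phase cap.

```javascript
// Hypothetical sketch: probes and the validator are injected so the loop is easy to test.
async function runChain(url, probes, validate, { maxPhase = 5, allowPaid = false } = {}) {
  const trace = [];
  for (const probe of probes) {
    // Assumption: paid probes never run without allowPaid; free probes respect maxPhase.
    if (probe.paid ? !allowPaid : probe.phase > maxPhase) continue;
    const started = Date.now();
    let res = null;
    let verdict;
    try {
      res = await probe.run(url);
      verdict = validate(res); // a 200 alone is not enough
    } catch (err) {
      verdict = { ok: false, reason: `error: ${err.message}` };
    }
    trace.push({
      probe: probe.name,
      phase: probe.phase,
      ok: verdict.ok,
      reason: verdict.reason,
      elapsed_ms: Date.now() - started,
    });
    if (verdict.ok) {
      // Stop at the first probe whose body passes validation.
      return { ok: true, url, probe: probe.name, phase: probe.phase, content: res.body, trace };
    }
  }
  // Every eligible probe failed: return the trace so the caller can adjust before retrying.
  return { ok: false, url, trace };
}
```

Because a failing probe only appends to the trace and moves on, an install failure or a timeout in one phase does not block later phases.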
/second-claude-code:research
Not for plain keyword search (use Jina Search via s.jina.ai) or bulk crawls.
node skills/unblock/engine/cli.mjs "<URL>"
node skills/unblock/engine/cli.mjs "<URL>" --device mobile --json
node skills/unblock/engine/cli.mjs "<URL>" --selector "article" --trace
node skills/unblock/engine/cli.mjs "<URL>" --max-phase 4
Exit codes: 0 success, 1 exhausted, 2 invalid input.
Dispatched by URL host pattern.
| Pattern | Endpoint | Returns |
|---|---|---|
| reddit.com/r/*/comments/* | <url>.json | post + comments |
| news.ycombinator.com/item?id=N | hn.algolia.com/api/v1/items/N | full thread |
| arxiv.org/abs/* | export.arxiv.org/api/query | abstract + metadata |
| bsky.app/profile/*/post/* | public.api.bsky.app | post + replies |
| github.com/<o>/<r> and issues/PR | api.github.com | repo / issue + comments |
| npmjs.com/package/<p> | registry.npmjs.org/<p> | package + readme |
| pages exposing <link rel="alternate" type="application/json+oembed"> | discovered URL | oEmbed JSON |
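As an illustration of the host-pattern dispatch in the table above, the following sketch maps a page URL to a public API endpoint. It is not the code in engine/probes/public-api.mjs: the function name `toPublicApi` is hypothetical, the exact endpoint paths (`id_list` for arXiv, `/repos/<o>/<r>/issues/<n>` for GitHub) are the ones those public APIs document rather than anything stated here, and Bluesky and oEmbed are left out because they need an extra lookup step.

```javascript
// Hypothetical router: returns an API URL for a supported page URL, or null.
function toPublicApi(rawUrl) {
  const u = new URL(rawUrl);
  const host = u.hostname.replace(/^www\./, "");
  const parts = u.pathname.split("/").filter(Boolean);

  if (host === "reddit.com" && parts[0] === "r" && parts[2] === "comments") {
    // Reddit serves JSON for a thread when ".json" is appended to its path.
    return `https://www.reddit.com${u.pathname.replace(/\/$/, "")}.json`;
  }
  if (host === "news.ycombinator.com" && parts[0] === "item" && u.searchParams.get("id")) {
    return `https://hn.algolia.com/api/v1/items/${u.searchParams.get("id")}`;
  }
  if (host === "arxiv.org" && parts[0] === "abs" && parts[1]) {
    const id = encodeURIComponent(parts.slice(1).join("/"));
    return `https://export.arxiv.org/api/query?id_list=${id}`;
  }
  if (host === "npmjs.com" && parts[0] === "package" && parts[1]) {
    // Scoped packages (@scope/name) need the slash percent-encoded for the registry.
    const name = parts.slice(1, parts[1].startsWith("@") ? 3 : 2).join("/");
    return `https://registry.npmjs.org/${name.replace("/", "%2F")}`;
  }
  if (host === "github.com" && parts.length >= 2) {
    const [owner, repo, kind, num] = parts;
    if ((kind === "issues" || kind === "pull") && num) {
      // The GitHub issues endpoint also answers for pull request numbers.
      return `https://api.github.com/repos/${owner}/${repo}/issues/${num}`;
    }
    if (parts.length === 2) return `https://api.github.com/repos/${owner}/${repo}`;
  }
  return null; // no public API match: fall through to the next phase
}
```

Returning null for an unmatched URL keeps the router side-effect free, so the caller can simply move on to the next probe.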
When Phases 1–4 all fail, the chain tries free mirrors: Wayback Machine,
archive.today, AMP cache, RSS/Atom feed discovery, OG-tag rescue. Results
carry via_archive: true; OG-rescue results also carry partial: true. See
references/archive-fallbacks.md.
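The OG-tag rescue mentioned above can be pictured as a last-resort extractor that salvages a title and description from whatever HTML was reachable. The sketch below is an assumption about how such a step might look, not the code in engine/probes/archive.mjs; `ogRescue` and the shape of the returned object are hypothetical apart from the via_archive and partial flags named in the text.

```javascript
// Hypothetical OG-tag rescue: pulls og:* meta tags out of raw HTML with regexes.
// HTML entities are left undecoded to keep the sketch short.
function ogRescue(html) {
  const meta = {};
  for (const [tag] of html.matchAll(/<meta\b[^>]*>/gi)) {
    const prop = /\bproperty\s*=\s*["']og:([^"']+)["']/i.exec(tag);
    const content = /\bcontent\s*=\s*"([^"]*)"|\bcontent\s*=\s*'([^']*)'/i.exec(tag);
    if (prop && content) {
      // Exactly one of the two capture groups is set, depending on the quote style.
      meta[prop[1].toLowerCase()] = content[1] ?? content[2];
    }
  }
  if (!meta.title && !meta.description) return null; // nothing worth rescuing
  return {
    title: meta.title || "",
    content: meta.description || "",
    meta,
    via_archive: true, // flag carried by Phase 5 results
    partial: true,     // OG rescue is never the full body
  };
}
```

Marking the result as partial lets a caller decide whether a title and summary are enough or whether the fetch should be reported as exhausted.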
curl-impersonate does not just spoof TLS. The probe runs a 2-hop sequence:
https://<host>/ with empty referrer → captures Set-Cookie
Defeats homepage-cookie gates and referrer-checking news sites without per-site rules.
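A rough sketch of what a cookie-warming plan could look like follows. Only the first hop (homepage, empty referrer, capture Set-Cookie) is stated in the text; the second hop shown here, the target URL requested with the captured cookies and the homepage as Referer, is an assumption inferred from the phrase "cookie warming + referrer chain". The names `toCookieHeader` and `planTwoHop` are hypothetical.

```javascript
// Turns raw Set-Cookie header values into a single Cookie request header.
function toCookieHeader(setCookieValues) {
  return setCookieValues
    .map((line) => line.split(";")[0].trim()) // keep name=value, drop attributes
    .filter((pair) => pair.includes("="))
    .join("; ");
}

// Hypothetical planner: describes the two requests without performing them.
function planTwoHop(targetUrl, setCookieValues = []) {
  const origin = new URL(targetUrl).origin;
  // Hop 1 (from the text): homepage with no Referer, used to collect Set-Cookie.
  const hop1 = { url: `${origin}/`, headers: {} };
  // Hop 2 (assumption): the real target, replaying cookies with the homepage as Referer.
  const hop2 = { url: targetUrl, headers: { Referer: `${origin}/` } };
  const cookie = toCookieHeader(setCookieValues);
  if (cookie) hop2.headers.Cookie = cookie;
  return [hop1, hop2];
}
```

Keeping the plan separate from the transport means the same two descriptors could be handed to curl-impersonate or any other client.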
Each retry rotates the User-Agent together with: Accept-Language matched to TLD,
Sec-Ch-Ua / Sec-Ch-Ua-Mobile / Sec-Ch-Ua-Platform (Chrome 131 client
hints), Sec-Fetch-Dest / Mode / Site, and Accept-Encoding: gzip, deflate, br, zstd. WAFs that gate on missing client hints fall at this phase without further escalation.
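The header rotation above might be assembled along these lines. This is a sketch under assumptions, not the contents of engine/transforms.mjs or profiles.json: `buildHeaders`, the two-entry profile list, the TLD-to-language map, and the exact Sec-Ch-Ua brand string are illustrative values.

```javascript
// Hypothetical profiles: each User-Agent is paired with a matching platform hint.
const PROFILES = [
  {
    ua: "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/131.0.0.0 Safari/537.36",
    platform: '"Windows"',
  },
  {
    ua: "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/131.0.0.0 Safari/537.36",
    platform: '"macOS"',
  },
];

// Hypothetical TLD-to-language map; unknown TLDs fall back to English.
const LANG_BY_TLD = {
  kr: "ko-KR,ko;q=0.9,en;q=0.8",
  jp: "ja-JP,ja;q=0.9,en;q=0.8",
  de: "de-DE,de;q=0.9,en;q=0.8",
};

function buildHeaders(url, attempt = 0) {
  const tld = new URL(url).hostname.split(".").pop();
  const profile = PROFILES[attempt % PROFILES.length]; // rotate per retry
  return {
    "User-Agent": profile.ua,
    "Accept-Language": LANG_BY_TLD[tld] || "en-US,en;q=0.9",
    "Sec-Ch-Ua": '"Google Chrome";v="131", "Chromium";v="131", "Not_A Brand";v="24"',
    "Sec-Ch-Ua-Mobile": "?0",
    "Sec-Ch-Ua-Platform": profile.platform,
    "Sec-Fetch-Dest": "document",
    "Sec-Fetch-Mode": "navigate",
    "Sec-Fetch-Site": "none",
    "Accept-Encoding": "gzip, deflate, br, zstd",
  };
}
```

Pairing each User-Agent with its platform hint avoids sending a macOS User-Agent alongside a Windows client hint, a mismatch that header-checking WAFs can flag.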
R1 — No site-name hardcoding. engine/profiles.json and code under
engine/** must not contain brand-specific selectors or domain matches
except the Phase 0a public-API allowlist (which routes to public APIs, not
selector scraping). Site-specific scraping hints flow only via --user-hint.
Enforced by tests/skills/unblock/no-brand-hardcode.test.mjs.
R2 — Validate before declare. Every probe result passes through
validate.mjs's 4-layer check (status, body length, challenge body,
content-type) before being returned.
R3 — Auto-install on miss. Missing binary triggers a one-time install attempt. If install fails, skip to next phase. Never block the chain.
R4 — Trace is mandatory on failure. Failure mode always returns a JSON trace listing phase outcomes, error codes, elapsed times.
R5 — Read trace before retry. Identical retries are forbidden. Adjust
--device, --selector, or --user-hint based on trace.
R6 — Paid is opt-in. Phase 6 never runs without explicit --allow-paid.
R7 — Single-user CLI, not a service. This skill fetches arbitrary
user-supplied URLs without internal-IP filtering. Do not expose its CLI
behind a network endpoint without adding host allowlisting upstream — it can
otherwise be used to probe 169.254.169.254 and other private ranges (SSRF).
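For anyone who does wrap the CLI behind an endpoint, R7 asks for host filtering upstream. A minimal sketch of such a pre-flight check is shown below; `isPrivateHost` and `assertFetchable` are hypothetical helpers, they only inspect the URL itself, and they do not resolve DNS, so a hostname that resolves to a private address would still need a separate check.

```javascript
// Hypothetical pre-flight check for IPv4 literals and obvious local names.
function isPrivateHost(hostname) {
  const host = hostname.toLowerCase().replace(/^\[|\]$/g, ""); // strip IPv6 brackets
  if (host === "localhost" || host.endsWith(".localhost") || host === "::1") return true;
  const m = /^(\d{1,3})\.(\d{1,3})\.(\d{1,3})\.(\d{1,3})$/.exec(host);
  if (!m) return false; // DNS names are not resolved in this sketch
  const [a, b] = m.slice(1).map(Number);
  return (
    a === 0 || a === 10 || a === 127 ||   // this-network, private, loopback
    (a === 169 && b === 254) ||           // link-local, includes 169.254.169.254
    (a === 172 && b >= 16 && b <= 31) ||  // private 172.16.0.0/12
    (a === 192 && b === 168)              // private 192.168.0.0/16
  );
}

// Throws unless the URL is http(s), non-private, and on the optional allowlist.
function assertFetchable(rawUrl, allowlist = null) {
  const u = new URL(rawUrl);
  if (u.protocol !== "http:" && u.protocol !== "https:") {
    throw new Error(`unsupported scheme: ${u.protocol}`);
  }
  if (isPrivateHost(u.hostname)) throw new Error(`private host refused: ${u.hostname}`);
  if (allowlist && !allowlist.includes(u.hostname)) {
    throw new Error(`host not allowlisted: ${u.hostname}`);
  }
  return u;
}
```

Parsing with `URL` first matters because the checks then run against the hostname the parser actually extracted rather than against the raw input string.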
Eevee invokes Unblock automatically when WebFetch returns 4xx/5xx, an
under-200-char body, a known challenge signature, or content-type mismatch.
See references/eevee-flow.md.
Returns JSON with ok, url, phase, probe, elapsed_ms, content,
title, meta, trace. Trace records every probe outcome. On exhaustion,
content may still be present with partial: true if Phase 5 OG-rescue
caught anything.
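A caller consuming this JSON might branch on ok and partial roughly as follows. The top-level field names come from the list above; the per-entry trace fields (`probe`, `error`) and the helper name `describeResult` are assumptions, since the trace entry shape is not spelled out here.

```javascript
// Hypothetical helper: turns a result object into a one-line summary.
function describeResult(result) {
  if (result.ok) {
    return `ok via ${result.probe} (phase ${result.phase}) in ${result.elapsed_ms} ms`;
  }
  // Assumed trace entry shape: { probe, error }.
  const failed = (result.trace || [])
    .map((t) => `${t.probe}:${t.error || "failed"}`)
    .join(", ");
  if (result.content && result.partial) {
    // Exhausted, but an OG rescue left some usable content behind.
    return `partial content only; failed probes: ${failed}`;
  }
  return `exhausted; failed probes: ${failed}`;
}
```

Listing the failed probes in the summary supports the read-trace-before-retry rule, because the next attempt can change --device, --selector, or --user-hint based on what actually failed.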
All optional. Defaults work zero-key.
| Variable | Effect |
|---|---|
| JINA_API_KEY | Lifts Jina Reader 20 RPM cap |
| TAVILY_API_KEY / EXA_API_KEY / FIRECRAWL_API_KEY | Phase 6 only with --allow-paid |
| UNBLOCK_TIMEOUT_MS | Per-probe timeout (default 15000) |
| UNBLOCK_MAX_PHASE | Cap chain (default 5) |
| UNBLOCK_CACHE_DIR | Cookie + binary cache (default ~/.cache/unblock) |
| UNBLOCK_SKIP_NETWORK_TESTS | Skip network-touching tests |
| UNBLOCK_ALLOW_PRIVATE_HOSTS | Set 1 to disable SSRF guard (private/loopback/metadata). |
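The variables in the table could be folded into a config object with their documented defaults, as in this sketch. The function name `loadConfig` and the property names are hypothetical; only the variable names and default values are taken from the table, and treating any non-empty UNBLOCK_SKIP_NETWORK_TESTS value as true is an assumption.

```javascript
// Hypothetical loader: reads the documented variables from an env-like object.
function loadConfig(env = process.env) {
  // Falls back to the default when the variable is unset or not a number.
  const toInt = (value, fallback) => {
    const n = Number.parseInt(value, 10);
    return Number.isFinite(n) ? n : fallback;
  };
  return {
    jinaApiKey: env.JINA_API_KEY || null,                  // optional, lifts the RPM cap
    timeoutMs: toInt(env.UNBLOCK_TIMEOUT_MS, 15000),       // per-probe timeout
    maxPhase: toInt(env.UNBLOCK_MAX_PHASE, 5),             // cap on the chain
    cacheDir: env.UNBLOCK_CACHE_DIR || "~/.cache/unblock", // left unexpanded here
    skipNetworkTests: Boolean(env.UNBLOCK_SKIP_NETWORK_TESTS),
    allowPrivateHosts: env.UNBLOCK_ALLOW_PRIVATE_HOSTS === "1",
  };
}
```

Passing the environment in as an argument keeps the loader testable without mutating process.env.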
engine/
  cli.mjs          argv, exit code, output
  chain.mjs        orchestrator, escalation
  detect.mjs       challenge / WAF signatures
  validate.mjs     4-layer success validator
  transforms.mjs   URL variants + locale-aware headers
  intent.mjs       URL vs keyword router
  cookie-jar.mjs   in-memory Set-Cookie jar
  install.mjs      auto-install binaries
  profiles.json    UAs, transforms, validator thresholds
  probes/{public-api,jina,yt-dlp,curl,impersonate,lightpanda,playwright,archive,paid-api}.mjs
references/{waf-detection,tls-impersonation,archive-fallbacks,eevee-flow}.md