Fetch web URLs into raw/ using Firecrawl CLI/REST or stdlib fallback. Invoked by wiki-research for general web pages.
```shell
npx claudepluginhub skinnnyjay/wiki-llm --plugin llm-wiki
```

This skill uses the workspace's default tool permissions.
Fetches one or more URLs into `raw/` using the best available adapter. Use this sub-skill when the input is a direct URL (not arXiv, not social, not a feed).
For single-URL fetch without wiki merge, use wiki-fetch instead.
Compliance: access-sources-disclaimer.md
Priority 0 — Read the config first:
```shell
llm-wiki integrations status
```
Open llm-wiki/config.json → integrations. Only consider adapters where enabled: true AND their api_key_env is present in the environment. This is the authoritative source — skip any adapter the user has disabled, regardless of whether the key exists.
```jsonc
// llm-wiki/config.json (reference — do not edit here)
"integrations": {
  "firecrawl":  { "enabled": true, "api_key_env": "FIRECRAWL_API_KEY" },
  "brave":      { "enabled": true, "api_key_env": "BRAVE_SEARCH_API_KEY" },
  "perplexity": { "enabled": true, "api_key_env": "PERPLEXITY_API_KEY" }
}
```
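The Priority 0 config check can be sketched as follows. This is a minimal sketch: the config path and key names match the snippet above, but the helper name is illustrative.

```python
import json
import os

def enabled_adapters(config_path="llm-wiki/config.json"):
    """Return adapter names that are enabled AND have their API key in the env."""
    with open(config_path) as f:
        integrations = json.load(f)["integrations"]
    ready = []
    for name, cfg in integrations.items():
        # Disabled in config wins, even if the key exists in the environment.
        if cfg.get("enabled") and os.environ.get(cfg.get("api_key_env", "")):
            ready.append(name)
    return ready
```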
Then use the first ready and enabled adapter from this order:
| Priority | Adapter | Config key | Condition |
|---|---|---|---|
| 0 | config check | — | Read llm-wiki/config.json integrations; build enabled list |
| 1 | Brave Search | integrations.brave | enabled: true + BRAVE_SEARCH_API_KEY set — best for extractable web content |
| 2 | Firecrawl CLI | integrations.firecrawl | enabled: true + which firecrawl succeeds — cleanest markdown, JS-rendered pages |
| 3 | Firecrawl REST | integrations.firecrawl | enabled: true + FIRECRAWL_API_KEY set |
| 4 | stdlib url | — | Always available (no JS rendering) |
If the page requires JavaScript rendering and only stdlib is available, tell the user and offer to configure Firecrawl or Brave (llm-wiki integrations wizard).
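The priority order above can be sketched as a dispatch function. Function and return names are illustrative; `shutil.which("firecrawl")` stands in for the `which firecrawl` check.

```python
import os
import shutil

def pick_adapter(enabled):
    """Pick the first ready adapter, following the priority table.
    `enabled` is the set of adapter names enabled in config.json."""
    if "brave" in enabled and os.environ.get("BRAVE_SEARCH_API_KEY"):
        return "brave"
    if "firecrawl" in enabled and shutil.which("firecrawl"):
        return "firecrawl"       # Firecrawl CLI
    if "firecrawl" in enabled and os.environ.get("FIRECRAWL_API_KEY"):
        return "firecrawl_rest"  # Firecrawl REST
    return "stdlib_url"          # always available, no JS rendering
```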
For each URL, derive a slug:
```
https://news.ycombinator.com/item?id=43012345 → hn-43012345
```

Output path: `raw/research/<topic-slug>/<page-slug>.md`
If no topic context is available, use raw/research/misc/<page-slug>.md.
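A slug derivation might look like the sketch below. The `hn-<id>` rule comes from the example above; the generic fallback (last path segment, sanitized) is an assumption, not the skill's documented rule.

```python
import re
from urllib.parse import urlparse, parse_qs

def page_slug(url):
    """Derive a filesystem-safe slug from a URL (illustrative heuristic)."""
    parts = urlparse(url)
    if parts.hostname and parts.hostname.endswith("news.ycombinator.com"):
        return "hn-" + parse_qs(parts.query).get("id", ["unknown"])[0]
    # Generic fallback: last meaningful path segment, lowercased and sanitized.
    segment = parts.path.rstrip("/").rsplit("/", 1)[-1] or parts.hostname or "page"
    return re.sub(r"[^a-z0-9]+", "-", segment.lower()).strip("-")
```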
```shell
# Firecrawl CLI (preferred)
llm-wiki ingest firecrawl <URL> --out research/<topic>/<slug>.md

# Firecrawl REST
llm-wiki ingest firecrawl <URL> --out research/<topic>/<slug>.md

# stdlib fallback
llm-wiki ingest url <URL> --out research/<topic>/<slug>.md
```
For multiple URLs, fetch them one at a time. Do not batch in a single command if the URLs are independent topics — one file per source ensures stable citations.
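One-file-per-source fetching reduces to a loop over single invocations. A sketch, assuming the `llm-wiki ingest` commands shown above (the wrapper function itself is hypothetical; the stdlib adapter is used for illustration):

```python
import subprocess

def fetch_all(urls, topic, slugs):
    """Fetch each URL into its own file — never batch independent sources."""
    for url, slug in zip(urls, slugs):
        out = f"research/{topic}/{slug}.md"
        # stdlib fallback shown; swap in `ingest firecrawl` when available.
        subprocess.run(["llm-wiki", "ingest", "url", url, "--out", out], check=True)
```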
After each fetch:
- Product/listing URLs belong in wiki-extract-ecommerce: Amazon (`amazon.com/dp/` or `/gp/product/`), eBay (`ebay.com/itm/` or sold searches), Etsy (`etsy.com/listing/`).
- See skills/wiki-extract-paywall/SKILL.md for the full bypass priority sequence.
- If `llm_wiki_security.prompt_injection: suspected` is flagged, follow skills/wiki-research/references/source-eval.md before proceeding.
- Watch for signs of a paywall in the fetched content (see the troubleshooting table below).
Add or confirm these frontmatter fields:
```yaml
source_url: https://...
fetched_date: YYYY-MM-DD
adapter: brave | firecrawl | firecrawl_rest | stdlib_url
source_type: web
paywall: none | soft | hard          # fill in if bypass was attempted
bypass_method: freedium | archive.ph | wayback | removepaywall | 12ft | googlebot  # if bypassed
```
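Adding or confirming these fields can be sketched with a minimal helper. This is illustrative only and assumes simple `key: value` YAML frontmatter; it prepends a block when none exists rather than merging into an existing one.

```python
import datetime

def ensure_frontmatter(text, source_url, adapter):
    """Prepend required frontmatter if the file has none (sketch only)."""
    if text.startswith("---\n"):
        return text  # frontmatter present; confirm fields manually
    fields = {
        "source_url": source_url,
        "fetched_date": datetime.date.today().isoformat(),
        "adapter": adapter,
        "source_type": "web",
        "paywall": "none",
    }
    block = "\n".join(f"{k}: {v}" for k, v in fields.items())
    return f"---\n{block}\n---\n\n{text}"
```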
Run each fetched file through the scoring checklist in skills/wiki-research/references/source-eval.md. Discard or quarantine files that fail the minimum quality bar.
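A cheap pre-filter before the full checklist might look like this. The 200-word threshold comes from the troubleshooting table below; the paywall phrases are illustrative, and source-eval.md remains the authoritative bar.

```python
PAYWALL_HINTS = ("subscribe to continue", "sign in to read", "create a free account")

def passes_min_quality(text, min_words=200):
    """Cheap pre-filter before the full source-eval checklist."""
    # Skip past frontmatter, if any, so only the body is measured.
    body = text.split("---", 2)[-1] if text.startswith("---") else text
    if len(body.split()) < min_words:
        return False
    return not any(hint in body.lower() for hint in PAYWALL_HINTS)
```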
Return to wiki-research Step 3 (post-process). Do not run wiki-ingest directly — the orchestrator handles that.
Success: a `raw/` file passing minimum length/quality checks; failures are noted with fallback attempts.

| Symptom | Fix |
|---|---|
| Empty body from stdlib | Page needs JS; configure Firecrawl |
| Body < 200 words / subscription prompt | Paywall detected; invoke wiki-extract-paywall |
| Medium article paywalled | Use Freedium: https://freedium.cfd/<URL> (see wiki-extract-paywall) |
| `firecrawl: command not found` | `npm install -g firecrawl-cli` then `firecrawl login --browser` |
| Missing `FIRECRAWL_API_KEY` | `export FIRECRAWL_API_KEY=fc-...` or run `llm-wiki integrations wizard` |
| 403 / paywalled page | Invoke wiki-extract-paywall for bypass sequence |
| YouTube URL | Use wiki-extract-youtube instead — this skill does not handle video |
| Amazon / eBay / Etsy / auction URL | Use wiki-extract-ecommerce — product pages need structured extraction |
| GitHub repo URL | Use wiki-extract-github — handles README, releases, issues properly |
| Substack / Beehiiv / Ghost URL | Use wiki-extract-newsletter — newsletter-specific bypass patterns |
| Wikipedia article URL | Use wiki-extract-wikipedia — MediaWiki API gives cleaner output |
| Crunchbase / LinkedIn URL | Use wiki-extract-crunchbase or wiki-extract-linkedin respectively |
| Patent URL (Google Patents, Espacenet) | Use wiki-extract-patents — structured patent extraction |
| Firecrawl rate limit | Wait 30s; use --delay 2 flag if supported |
Run `llm-wiki integrations status` and re-run any `llm-wiki` line from Step 1 of this skill (from the vault root).