Help us improve
Share bugs, ideas, or general feedback.
From seo-skills
Web scraping and crawling via Firecrawl MCP, returning raw HTML, metadata (og, twitter, JSON-LD), JS-rendered DOM, and screenshots. Use for tasks where WebFetch's markdown is insufficient.
npx claudepluginhub seranking/seo-skills --plugin seo-skillsHow this skill is triggered — by the user, by Claude, or both
Slash command
/seo-skills:seo-firecrawlThe summary Claude sees in its skill listing — used to decide when to auto-load this skill
> Example output: [examples/seo-firecrawl-stripe-com-20260514/scrape/FIRECRAWL.md](../../examples/seo-firecrawl-stripe-com-20260514/scrape/FIRECRAWL.md)
Automates web crawling and data extraction using Firecrawl: scrape pages, crawl sites, extract structured data with AI, batch URLs, and map site structures.
Searches, scrapes, crawls, and interacts with web pages via the Firecrawl CLI. Returns clean markdown for LLM context. Useful for research, content extraction, and site downloads.
Scrapes URLs to markdown/HTML/JSON, crawls websites for multi-page extraction, searches the web, maps sites, and extracts structured data using Firecrawl MCP tools.
Share bugs, ideas, or general feedback.
Example output: examples/seo-firecrawl-stripe-com-20260514/scrape/FIRECRAWL.md
A direct interface to Firecrawl MCP for tasks that fall outside the data-driven SE Ranking skills. Use when:
<head> metadata, JSON-LD, or post-JS DOM that WebFetch's markdown conversion strips.seo-page, seo-schema, seo-content-audit, etc.) called you as a sub-step.firecrawl-mcp MCP server. If mcp__firecrawl-mcp__firecrawl_scrape is unavailable, abort with the install command — bash extensions/firecrawl/install.sh from this plugin repo, plus the firecrawl.dev signup URL (free tier 500 credits/month). Don't attempt fallbacks; this skill exists for the cases WebFetch can't cover.scrape / map / crawl / search). If mode unspecified, infer from input shape (single URL → scrape, single domain → map).firecrawl-mcp is connected. If not, surface the install command and stop.scrape — single URL, full data (default if user supplies one URL).map — single domain, list of URLs only (cheap reconnaissance).crawl — single domain, fetch each discovered page (expensive; require explicit confirm).search — query within a domain.scrape (1 credit), map (~0.5 credit per discovered URL — estimate using a map first if scope unclear), crawl (1 credit per page crawled), search (1 credit per result returned).crawl and for map of >50 expected URLs, surface the estimate and require explicit go-ahead before calling.creditsUsed / creditsRemaining in metadata).mcp__firecrawl-mcp__firecrawl_* tool.
scrape: pass formats: ["markdown", "html"] by default (markdown for prose, html for <head> + JSON-LD). Add formats: ["screenshot"] only if the deliverable visibly uses one. SPAs: pass waitFor: 2000 (or a CSS selector) so the JS-rendered DOM is captured. Default onlyMainContent: true to drop nav/footer noise — override only on explicit request.map: default limit: 500 (hard cap). Pass excludePaths: ["/admin/*", "/api/*", "/wp-admin/*", "/feed/*"] as a sane default.crawl: default limit: 50 (default cap), hard cap limit: 200. Always pass excludePaths to prune. Poll firecrawl_check_crawl_status if the job returns asynchronously.search: default limit: 20.scrape → RAW.md (markdown body), META.md (og / twitter / canonical / robots / headers + parsed JSON-LD @type list with hashes), links.csv, optional screenshot.png.map → URLS.md with pattern-grouped list (e.g., /blog/* — 128 (37%), /products/* — 84 (24%)), plus urls.csv.crawl → folder per page under pages/{slugified-url}/ with RAW.md + META.md, plus a top-level INDEX.md summarising every page (URL, status, key signals).search → MATCHES.md with hit excerpts + URLs ranked by relevance.FIRECRAWL.md at the root: target, mode, credits used, key findings (5 bullets max), open loops, recommended next skill.Folder seo-firecrawl-{slug}-{YYYYMMDD}/:
seo-firecrawl-{slug}-{YYYYMMDD}/
├── RAW.md (markdown body)
├── META.md (og / twitter / canonical / robots / headers + parsed JSON-LD)
├── links.csv (every <a href> on the page)
├── screenshot.png (optional; only if requested)
└── FIRECRAWL.md (synthesis + handoff payload)
seo-firecrawl-{slug}-{YYYYMMDD}/
├── URLS.md (pattern-grouped URL list)
├── urls.csv (every URL with discovery depth, if available)
└── FIRECRAWL.md
seo-firecrawl-{slug}-{YYYYMMDD}/
├── INDEX.md (every page + status code + key signals)
├── pages/
│ ├── {slug-1}/RAW.md
│ ├── {slug-1}/META.md
│ ├── {slug-2}/RAW.md
│ └── ...
└── FIRECRAWL.md
seo-firecrawl-{slug}-{YYYYMMDD}/
├── MATCHES.md (hit excerpts + URLs ranked by relevance)
└── FIRECRAWL.md
FIRECRAWL.md follows this shape:
# Firecrawl: {target}
> Run dated {YYYY-MM-DD} · Mode: {scrape | map | crawl | search} · Credits used: {n}
## Summary
{One-paragraph what-came-back. Example: "Scraped https://example.com/article. og:title and og:image present, JSON-LD Article schema with author + datePublished. 12 outbound links. Page is server-rendered (no JS-render divergence). Robots: index,follow."}
## Key findings
1. {Finding anchored in concrete data}
2. ...
5. ...
## Open loops
- {What this run did NOT answer}
- ...
## Recommended next step
{One of: `seo-page` (when a single URL was scraped and now wants performance analysis) | `seo-schema` (when JSON-LD audit needs follow-up generation) | `seo-technical-audit` (when crawl revealed broken pages) | `seo-content-audit` (when crawl produced a corpus to audit) | `seo-drift baseline` (when the user wants to track this URL over time) | "this completes the user's ask".}
## Handoff payload
- **Produced by:** seo-firecrawl
- **Target:** {url or domain}
- **Mode:** {scrape | map | crawl | search}
- **Credits used:** {n}
- **Key findings:** {5 bullets — e.g., "twitter:card present (summary_large_image)", "JSON-LD types: Article + Organization + BreadcrumbList", "robots: index,follow", "canonical self-referencing", "404s: 0 of 50 pages crawled"}
- **Open loops:** {what this didn't answer}
- **Recommended next skill:** {seo-page | seo-schema | seo-technical-audit | seo-content-audit | …} — {one-line why}
map is 0.5 cr/URL; scrape is 1 cr each; crawl 1 cr/page. Surface cost up front; warn when a single run will eat >100 credits.onlyMainContent: true for scrape to drop nav/footer noise. Override only if the user explicitly asks for full-page DOM.waitFor (CSS selector or ms) for SPAs that lazy-load content. 2000ms is a sensible default; selectors are more reliable than time waits.firecrawl_map before firecrawl_crawl when crawl scope is unclear — discover first, decide what to crawl, then crawl. Saves credits.includePaths / excludePaths dramatically cut crawl cost. Always pass excludePaths: ["/admin/*", "/api/*", "/wp-admin/*", "/feed/*"] as a default.formats: ["screenshot"] unless the deliverable visibly uses it. It doubles per-page cost.firecrawl_extract or firecrawl_deep_research. Both overlap with our own LLM analysis; firecrawl_extract has opaque pricing on the free tier; both are explicitly out of scope for seo-skills.seo-page, seo-schema, etc.), drop the FIRECRAWL.md synthesis — the caller wants the raw META.md / RAW.md. Skip mode-2's URLS.md summary too if the caller wants the raw urls.csv.seo-page for keyword/traffic verdicts on one URL, seo-schema for JSON-LD work, seo-technical-audit for crawl-wide issues — use those instead. They orchestrate Firecrawl plus SE Ranking data automatically.seo-page — when a single URL was scraped and now wants keyword/traffic verdicts.seo-schema — when JSON-LD audit produced gaps that need generation.seo-technical-audit — when a crawl revealed broken pages or noindex issues at scale.seo-content-audit — when a crawl produced a corpus to E-E-A-T-audit.seo-drift baseline — when the user wants to track this URL or domain over time.