Web page and website scraping with Firecrawl API. Use this skill when scraping web articles, blog posts, documentation pages, paywalled content, or JavaScript-heavy sites. Triggers on requests to scrape websites, extract article content, convert pages to markdown, or handle anti-bot protection.
From `casper-studios/casper-marketplace` (`--plugin casper`). This skill uses the workspace's default tool permissions.
- references/single-page.md
- references/website-crawler.md
- scripts/firecrawl_scrape.py
Scrape individual web pages and convert them to clean, LLM-ready markdown. Handles JavaScript rendering, anti-bot protection, and dynamic content.
What are you scraping?
│
├── Single page (article, blog, docs)
│   ├── references/single-page.md
│   └── Script: scripts/firecrawl_scrape.py
│
└── Entire website (multiple pages, crawling)
    ├── references/website-crawler.md
    └── (Use Apify Website Content Crawler for multi-page)
# Required in .env
FIRECRAWL_API_KEY=fc-your-api-key-here
Get your API key: https://firecrawl.dev/app/api-keys
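The script needs that key at runtime. A minimal sketch of reading it from `.env` with only the standard library (the real script may use `python-dotenv` instead; the parsing rules here are an assumption):

```python
import os

def load_env(path=".env"):
    """Minimal .env parser: KEY=VALUE lines; blank lines and '#'
    comments are skipped. Existing environment variables win."""
    values = {}
    try:
        with open(path) as f:
            for line in f:
                line = line.strip()
                if not line or line.startswith("#") or "=" not in line:
                    continue
                key, _, value = line.partition("=")
                values[key.strip()] = value.strip()
    except FileNotFoundError:
        pass
    for key, value in values.items():
        os.environ.setdefault(key, value)
    return values
```

After calling `load_env()`, the script can read `os.environ["FIRECRAWL_API_KEY"]` and fail fast with a clear message if it is missing.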
python scripts/firecrawl_scrape.py "https://example.com/article"
python scripts/firecrawl_scrape.py "https://wsj.com/article" \
--proxy stealth \
--format markdown summary \
--timeout 60000
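Internally, those flags map onto a JSON body POSTed to Firecrawl's scrape endpoint. A sketch of how the script might assemble it (the exact field names are assumptions based on the flags above, not a definitive spec):

```python
FIRECRAWL_SCRAPE_URL = "https://api.firecrawl.dev/v1/scrape"

def build_scrape_request(url, formats=("markdown",), proxy="auto", timeout=30000):
    """Build the JSON body to POST to FIRECRAWL_SCRAPE_URL with an
    'Authorization: Bearer <FIRECRAWL_API_KEY>' header (field names
    assumed from the CLI flags)."""
    return {
        "url": url,
        "formats": list(formats),  # e.g. ["markdown", "summary"]
        "proxy": proxy,            # basic | stealth | auto
        "timeout": timeout,        # milliseconds
    }
```

For the WSJ example above, this would produce `build_scrape_request("https://wsj.com/article", ["markdown", "summary"], proxy="stealth", timeout=60000)`.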
| Mode | Use Case |
|---|---|
| `basic` | Standard sites, fastest |
| `stealth` | Anti-bot protection, premium content (WSJ, NYT) |
| `auto` | Let Firecrawl decide (recommended) |
- `markdown` - Clean markdown content (default)
- `html` - Raw HTML
- `summary` - AI-generated summary
- `screenshot` - Page screenshot
- `links` - All links on page

~1 credit per page. Stealth proxy may use additional credits.
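Each requested format comes back as a key in the API response. A sketch of pulling one format out, assuming a response shape of `{"success": bool, "data": {<format>: ...}}` (an assumption, not a documented guarantee):

```python
def extract_content(response, fmt="markdown"):
    """Return one requested format from a Firecrawl-style response
    dict, raising on failures or missing formats."""
    if not response.get("success"):
        raise RuntimeError(f"scrape failed: {response.get('error', 'unknown')}")
    data = response.get("data", {})
    if fmt not in data:
        raise KeyError(f"format {fmt!r} not in response; got {sorted(data)}")
    return data[fmt]
```

Failing loudly here makes the credit and blocking errors in the troubleshooting section below easier to spot than silently writing an empty file.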
Store `FIRECRAWL_API_KEY` in the `.env` file (never commit it to git). Output is written to the `tmp/` directory.

**Symptoms:** API returns "insufficient credits" or quota exceeded error
**Cause:** Account credits depleted
**Solution:** Use `basic` proxy mode to conserve credits

**Symptoms:** Empty content or partial HTML returned
**Cause:** JavaScript-heavy page not fully loading
**Solution:** Use the `--js-render` flag, increase `--timeout 60000` (60 seconds), use `stealth` proxy mode for protected sites, or set `--wait-for` to a selector

**Symptoms:** Script returns 403 status code
**Cause:** Site blocking automated access
**Solution:** Use `stealth` proxy mode

**Symptoms:** Scrape succeeds but markdown is empty or malformed
**Cause:** Dynamic content loaded after page load, or unusual page structure
**Solution:** Use `--wait-for` to wait for specific content, or request the `html` format to inspect the raw content

**Symptoms:** Request times out before completion
**Cause:** Slow page load or large page content
**Solution:** Use `basic` proxy for faster response

**Skills:** firecrawl-scraping → parallel-research
**Use case:** Scrape competitor pages, then analyze content strategy
**Flow:**
**Skills:** firecrawl-scraping → content-generation
**Use case:** Create summary documents from web research
**Flow:**
**Skills:** firecrawl-scraping → attio-crm
**Use case:** Enrich company records with website data
**Flow:**
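The 403 troubleshooting advice above (blocked sites want the stealth proxy) can be sketched as a small fallback helper. Here `scrape` is a hypothetical callable `(url, proxy) -> (status, content)` standing in for the real script's request logic:

```python
def scrape_with_fallback(scrape, url):
    """Try the cheap basic proxy first; on a 403 (site blocking
    automated access), retry once with the stealth proxy, which
    may consume additional credits."""
    status, content = scrape(url, proxy="basic")
    if status == 403:
        status, content = scrape(url, proxy="stealth")
    if status != 200:
        raise RuntimeError(f"scrape failed with status {status}")
    return content
```

Trying `basic` first keeps per-page cost near 1 credit on the sites that don't need anti-bot handling.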