From b00t
Scrapes URLs to markdown/HTML/JSON, crawls websites with BFS, searches the web, maps URLs, and extracts structured data via the Firecrawl MCP server for AI agents. Built on the Firecrawl v2.5 API, it handles JS rendering, bot bypass, and browser automation for dynamic pages (including clicks and logins); use it to fetch live web content or framework docs as clean, LLM-ready markdown.

Install:

```shell
npx claudepluginhub elasticdotventures/_b00t_ --plugin skill-document-understanding
```

This skill is limited to using the tools listed below; it enables AI agents to interact with web content through the Firecrawl MCP server. Activate it when you need to scrape a URL, crawl a site, search the web, map a site's URLs, or extract structured data from webpages.
```shell
# Firecrawl (cloud): get an API key from firecrawl.dev/app/api-keys
# Add to MCP config:
b00t mcp add firecrawl -- npx -y firecrawl-mcp
# Set env: FIRECRAWL_API_KEY=fc-YOUR_KEY
```

```shell
# CRW: single binary, 6 MB RAM, no server needed
curl -fsSL https://raw.githubusercontent.com/us/crw/main/install.sh | CRW_BINARY=crw sh
# Add to MCP config:
b00t mcp add crw -- npx crw-mcp
# No API key required in embedded mode
```
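For MCP clients configured via a JSON file rather than the `b00t` CLI, the same two servers can be declared directly. The snippet below sketches the common `mcpServers` config shape used by many MCP clients; the exact file name and location depend on your client, and the key value is a placeholder:

```python
import json

# Illustrative MCP config entries for both servers. The "mcpServers"
# layout (server name -> command/args/env) is the shape used by common
# MCP clients; adjust to your client's actual config format.
config = {
    "mcpServers": {
        "firecrawl": {
            "command": "npx",
            "args": ["-y", "firecrawl-mcp"],
            "env": {"FIRECRAWL_API_KEY": "fc-YOUR_KEY"},  # placeholder key
        },
        "crw": {
            "command": "npx",
            "args": ["crw-mcp"],
            # No env needed: embedded mode requires no API key
        },
    }
}
print(json.dumps(config, indent=2))
```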
| Tool | Best For | Returns |
|---|---|---|
| `firecrawl_scrape` | Single URL extraction | markdown/JSON/HTML |
| `firecrawl_batch_scrape` | Multiple known URLs | markdown[] |
| `firecrawl_map` | Discover URLs on site | URL[] |
| `firecrawl_search` | Find info across web | results[] |
| `firecrawl_crawl` | Multi-page extraction | markdown[] |
| `firecrawl_extract` | Structured data extraction | JSON (schema-defined) |
| `firecrawl_agent` | Autonomous research | JSON (async) |
| `firecrawl_interact` | Click/navigate pages | execution result |
Scrape a single page to markdown:

```json
{
  "name": "firecrawl_scrape",
  "arguments": {
    "url": "https://docs.example.com/api",
    "formats": ["markdown"],
    "onlyMainContent": true
  }
}
```

Extract structured data with a JSON schema:

```json
{
  "name": "firecrawl_scrape",
  "arguments": {
    "url": "https://example.com/product",
    "formats": [{
      "type": "json",
      "prompt": "Extract product details",
      "schema": {
        "type": "object",
        "properties": {
          "name": {"type": "string"},
          "price": {"type": "number"},
          "inStock": {"type": "boolean"}
        }
      }
    }]
  }
}
```

Search the web and scrape the top results:

```json
{
  "name": "firecrawl_search",
  "arguments": {
    "query": "best practices for async Rust 2025",
    "limit": 5,
    "scrapeOptions": {
      "formats": ["markdown"],
      "onlyMainContent": true
    }
  }
}
```

Crawl a site with depth and page limits:

```json
{
  "name": "firecrawl_crawl",
  "arguments": {
    "url": "https://docs.example.com/*",
    "maxDepth": 2,
    "limit": 50
  }
}
```
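The schema in the JSON-format example constrains what the scrape returns. A quick way to sanity-check a returned object against that kind of flat schema is a few plain-Python type checks; this is an illustrative sketch, not a full JSON Schema validator:

```python
# Map JSON Schema primitive types to Python types (a subset sufficient
# for the flat product schema above; illustrative only).
TYPE_MAP = {"string": str, "number": (int, float), "boolean": bool}

schema = {
    "type": "object",
    "properties": {
        "name": {"type": "string"},
        "price": {"type": "number"},
        "inStock": {"type": "boolean"},
    },
}

def matches(obj, schema):
    """Return True if obj's present fields match the schema's property types."""
    if not isinstance(obj, dict):
        return False
    return all(
        isinstance(obj.get(key), TYPE_MAP[prop["type"]])
        for key, prop in schema["properties"].items()
        if key in obj
    )

product = {"name": "Widget", "price": 9.99, "inStock": True}
print(matches(product, schema))          # True: all fields well-typed
print(matches({"price": "9.99"}, schema))  # False: price is a string
```

For production validation, a real JSON Schema library is the better choice; this sketch only illustrates what the schema promises about the extracted data.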
- Know the exact URL? → `scrape` (single) or `batch_scrape` (multiple)
- Need to find URLs? → `search` (web) or `map` (site discovery)
- Need all pages? → `crawl` (with limits!)
- Want specific data? → `scrape` with JSON format + schema
- Complex research? → `agent` (async, poll for results)
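The decision guide above can be encoded as a small routing helper. The tool names match the Firecrawl MCP tools; the function itself and its parameters are an illustrative sketch, not part of the server:

```python
def pick_tool(known_urls: int, need_discovery: bool = False,
              want_schema: bool = False, deep_research: bool = False) -> str:
    """Route a task to a Firecrawl MCP tool, following the decision guide."""
    if deep_research:
        return "firecrawl_agent"        # async: poll for results
    if want_schema:
        return "firecrawl_scrape"       # JSON format + schema
    if need_discovery:
        return "firecrawl_map"          # discover URLs on a site
    if known_urls == 1:
        return "firecrawl_scrape"       # single known URL
    if known_urls > 1:
        return "firecrawl_batch_scrape" # multiple known URLs
    return "firecrawl_search"           # no URLs yet: search the web
```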
| Metric | CRW | Firecrawl Self-Host |
|---|---|---|
| RAM | 6 MB | 4 GB+ |
| Containers | 0 | 5+ |
| Cold Start | 85ms | 30-60s |
| Setup | Single binary | Docker Compose |
```shell
# CRW embedded mode - no server, no config
npx crw-mcp

# CRW cloud mode - adds web search
CRW_API_URL=https://fastcrw.com/api CRW_API_KEY=xxx npx crw-mcp
```

```shell
# Firecrawl self-hosted
git clone https://github.com/firecrawl/firecrawl
cd firecrawl
docker compose up -d

# MCP config:
FIRECRAWL_API_URL=http://localhost:3002 npx -y firecrawl-mcp
```
Tips:
- Set `onlyMainContent: true` to skip navigation and footers
- Set a `limit` on crawls to avoid context overflow

Troubleshooting:
- "API key required" - Set the FIRECRAWL_API_KEY env var, or use CRW in embedded mode
- "Rate limited" - Increase FIRECRAWL_RETRY_MAX_ATTEMPTS, then wait and retry
- "Context too large" - Use JSON format with a schema, or reduce crawl depth
- "Self-hosted connection refused" - Verify the Docker containers are running and check FIRECRAWL_API_URL
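For the rate-limit case, "wait and retry" usually means exponential backoff. The helper below is a minimal client-side sketch of that pattern; it is not the MCP server's own retry logic, and the `flaky` function only simulates a call that fails with a 429 before succeeding:

```python
import time

def with_retries(call, max_attempts=3, base_delay=1.0, sleep=time.sleep):
    """Retry a failing call with exponential backoff (1s, 2s, 4s, ...).

    `call` should raise on failure (e.g. an HTTP 429); the last
    failure is re-raised once max_attempts is exhausted.
    """
    for attempt in range(max_attempts):
        try:
            return call()
        except Exception:
            if attempt == max_attempts - 1:
                raise
            sleep(base_delay * 2 ** attempt)

# Usage: simulate a call that is rate-limited twice, then succeeds.
attempts = []
def flaky():
    attempts.append(1)
    if len(attempts) < 3:
        raise RuntimeError("429 Too Many Requests")
    return "ok"

print(with_retries(flaky, max_attempts=5, sleep=lambda s: None))  # ok
```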