Convert websites into LLM-ready markdown with the Firecrawl v2 API (v4 SDKs). Handles JavaScript rendering, anti-bot bypass, and structured data extraction for RAG and AI applications. Use when scraping websites, crawling sites, or troubleshooting content that fails to load, JavaScript rendering, or bot detection.
Converts websites into LLM-ready markdown using Firecrawl API with JavaScript rendering and anti-bot bypass.
```
/plugin marketplace add jezweb/claude-skills
/plugin install jezweb-tooling-skills@jezweb/claude-skills
```

This skill inherits all available tools. When active, it can use any tool Claude has access to.
- README.md
- references/common-patterns.md
- references/endpoints.md
- templates/firecrawl-crawl-example.py
- templates/firecrawl-scrape-python.py
- templates/firecrawl-scrape-typescript.ts
- templates/firecrawl-worker-fetch.ts

Status: Production Ready ✅
Last Updated: 2026-01-09
Official Docs: https://docs.firecrawl.dev
API Version: v2 (firecrawl-py 4.12.0+)
Firecrawl is a Web Data API for AI that turns entire websites into LLM-ready markdown or structured data. It handles:
### /v2/scrape - Single Page Scraping

Scrapes a single webpage and returns clean, structured content.
Use Cases:
Key Options:
- `formats`: `["markdown", "html", "screenshot"]`
- `onlyMainContent`: true/false (removes nav, footer, ads)
- `waitFor`: milliseconds to wait before scraping
- `actions`: browser automation actions (click, scroll, etc.)

### /v2/crawl - Full Site Crawling

Crawls all accessible pages from a starting URL.
Use Cases:
Key Options:
- `limit`: max pages to crawl
- `maxDepth`: how many links deep to follow
- `allowedDomains`: restrict to specific domains
- `excludePaths`: skip certain URL patterns

### /v2/map - URL Discovery

Maps all URLs on a website without scraping content.
Use Cases:
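Since `/v2/map` has no quickstart below, here is a minimal Python sketch. It assumes firecrawl-py v4 exposes `app.map()` returning an object with a `links` list of link objects; verify the exact shape against the SDK docs.

```python
from urllib.parse import urlparse

def filter_links(urls, prefix="/docs"):
    """Keep only URLs whose path starts with the given prefix."""
    return [u for u in urls if urlparse(u).path.startswith(prefix)]

# Usage against the API (requires FIRECRAWL_API_KEY; response shape is an assumption):
# from firecrawl import Firecrawl
# app = Firecrawl()
# result = app.map(url="https://docs.example.com")
# doc_urls = filter_links([link.url for link in result.links])
```

Filtering the discovered URLs before crawling lets you plan a targeted crawl and avoid spending credits on irrelevant pages.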
### /v2/extract - Structured Data Extraction

Uses AI to extract specific data fields from pages.
Use Cases:
Key Options:
- `schema`: Zod or JSON Schema defining the desired structure
- `systemPrompt`: guide AI extraction behavior

### Authentication

Firecrawl requires an API key for all requests.
API keys start with `fc-`. NEVER hardcode API keys in code!
```bash
# .env file
FIRECRAWL_API_KEY=fc-your-api-key-here

# .env.local (for local development)
FIRECRAWL_API_KEY=fc-your-api-key-here
```
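For local development without extra dependencies, a minimal stdlib-only loader can populate the environment from that file. This is an illustrative sketch; the python-dotenv package does the same job more robustly.

```python
import os

def load_env_file(path: str = ".env") -> None:
    """Minimal .env loader: KEY=value lines, '#' comments and blanks ignored."""
    with open(path) as f:
        for line in f:
            line = line.strip()
            if line and not line.startswith("#") and "=" in line:
                key, _, value = line.partition("=")
                # setdefault: real environment variables win over the file
                os.environ.setdefault(key.strip(), value.strip())
```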
```bash
pip install firecrawl-py
```

Latest Version: firecrawl-py v4.12.0+ (Jan 2026)
```python
import os
from firecrawl import Firecrawl

# Initialize client (reads FIRECRAWL_API_KEY from env automatically)
app = Firecrawl(api_key=os.environ.get("FIRECRAWL_API_KEY"))

# Scrape a single page - returns Pydantic Document object
doc = app.scrape(
    url="https://example.com/article",
    formats=["markdown", "html"],
    only_main_content=True
)

# Access content via attributes (not .get())
print(doc.markdown)
print(doc.metadata)  # Page metadata (title, description, etc.)
```
```python
import os
from firecrawl import Firecrawl

app = Firecrawl(api_key=os.environ.get("FIRECRAWL_API_KEY"))

# Start crawl - returns Pydantic CrawlResult
result = app.crawl(
    url="https://docs.example.com",
    limit=100,
    scrape_options={
        "formats": ["markdown"]
    }
)

# Process results - result.data is list of Document objects
for page in result.data:
    print(f"Scraped: {page.metadata.source_url}")
    print(f"Content: {page.markdown[:200]}...")
```
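Crawled markdown usually needs chunking before embedding for RAG. A minimal paragraph-boundary splitter (an illustrative helper, not part of the SDK):

```python
def chunk_markdown(text: str, max_chars: int = 1000) -> list[str]:
    """Split markdown into chunks on paragraph boundaries."""
    chunks: list[str] = []
    current = ""
    for para in text.split("\n\n"):
        # Start a new chunk if adding this paragraph would overflow the budget
        if current and len(current) + len(para) + 2 > max_chars:
            chunks.append(current.strip())
            current = ""
        current += para + "\n\n"
    if current.strip():
        chunks.append(current.strip())
    return chunks
```

Splitting on paragraph boundaries keeps headings and sentences intact, which generally embeds better than fixed-size character windows.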
```python
import os
from firecrawl import Firecrawl

app = Firecrawl(api_key=os.environ.get("FIRECRAWL_API_KEY"))

# Define schema (JSON Schema format)
schema = {
    "type": "object",
    "properties": {
        "company_name": {"type": "string"},
        "product_price": {"type": "number"},
        "availability": {"type": "string"}
    },
    "required": ["company_name", "product_price"]
}

# Extract data
result = app.extract(
    urls=["https://example.com/product"],
    schema=schema,
    system_prompt="Extract product information from the page"
)

print(result)
```
```bash
npm install @mendable/firecrawl-js
# or
pnpm add @mendable/firecrawl-js
# or use the unscoped package:
npm install firecrawl
```

Latest Version: @mendable/firecrawl-js v4.4.1+ (or firecrawl v4.4.1+)
```typescript
import Firecrawl from '@mendable/firecrawl-js';

// Initialize client
const app = new Firecrawl({
  apiKey: process.env.FIRECRAWL_API_KEY
});

// Scrape a single page - returns a Document
const doc = await app.scrape('https://example.com/article', {
  formats: ['markdown', 'html'],
  onlyMainContent: true
});

// Access markdown content
console.log(doc.markdown);
```
```typescript
import Firecrawl from '@mendable/firecrawl-js';

const app = new Firecrawl({
  apiKey: process.env.FIRECRAWL_API_KEY
});

// Start crawl (waits for completion)
const crawlResult = await app.crawl('https://docs.example.com', {
  limit: 100,
  scrapeOptions: {
    formats: ['markdown']
  }
});

// Process results
for (const page of crawlResult.data) {
  console.log(`Scraped: ${page.metadata?.sourceURL}`);
  console.log(page.markdown);
}
```
```typescript
import Firecrawl from '@mendable/firecrawl-js';
import { z } from 'zod';

const app = new Firecrawl({
  apiKey: process.env.FIRECRAWL_API_KEY
});

// Define schema with Zod
const schema = z.object({
  company_name: z.string(),
  product_price: z.number(),
  availability: z.string()
});

// Extract data
const result = await app.extract({
  urls: ['https://example.com/product'],
  schema: schema,
  systemPrompt: 'Extract product information from the page'
});

console.log(result);
```
Scenario: Convert entire documentation site to markdown for RAG/chatbot
```python
import os
from firecrawl import Firecrawl

app = Firecrawl(api_key=os.environ.get("FIRECRAWL_API_KEY"))

docs = app.crawl(
    url="https://docs.myapi.com",
    limit=500,
    scrape_options={
        "formats": ["markdown"],
        "only_main_content": True
    }
)

# Save to files - docs.data is a list of Document objects
for page in docs.data:
    url = page.metadata.source_url
    filename = url.replace("https://", "").replace("/", "_") + ".md"
    with open(f"docs/{filename}", "w") as f:
        f.write(page.markdown)
```
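Flat string replacement for filenames can collide or produce awkward names for URLs with query strings; a slightly safer slug helper (illustrative, not part of the SDK):

```python
import re

def url_to_filename(url: str) -> str:
    """Turn a page URL into a filesystem-safe markdown filename."""
    slug = re.sub(r"^https?://", "", url)
    # Collapse anything outside [A-Za-z0-9._-] into single underscores
    slug = re.sub(r"[^A-Za-z0-9._-]+", "_", slug).strip("_")
    return f"{slug or 'index'}.md"
```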
Scenario: Extract structured product data for e-commerce
```typescript
const schema = z.object({
  title: z.string(),
  price: z.number(),
  description: z.string(),
  images: z.array(z.string()),
  in_stock: z.boolean()
});

const products = await app.extract({
  urls: productUrls,
  schema: schema,
  systemPrompt: 'Extract all product details including price and availability'
});
```
Scenario: Extract clean article content without ads/navigation
```python
article = app.scrape(
    url="https://news.com/article",
    formats=["markdown"],
    only_main_content=True,
    remove_base64_images=True
)

# Get clean markdown
content = article.markdown
```
```python
import os
from firecrawl import Firecrawl

app = Firecrawl(api_key=os.environ.get("FIRECRAWL_API_KEY"))

try:
    result = app.scrape("https://example.com")
except Exception as e:
    # API errors include status and details in the exception message
    print(f"Scrape error: {e}")
```
```typescript
import Firecrawl from '@mendable/firecrawl-js';

const app = new Firecrawl({
  apiKey: process.env.FIRECRAWL_API_KEY
});

try {
  const result = await app.scrape('https://example.com');
} catch (error) {
  if (error.response) {
    // API error - response carries status and details
    console.error('API Error:', error.response.data);
  } else {
    // Network or other error
    console.error('Error:', error.message);
  }
}
```
- Use `onlyMainContent: true` to reduce credits and get cleaner data
- Use the `map` endpoint first to plan your crawling strategy

The Firecrawl SDK cannot run in Cloudflare Workers due to Node.js dependencies (specifically axios, which uses the Node.js `http` module). Workers require Web Standard APIs.
✅ Use the direct REST API with fetch instead (see example below).
Alternative: Self-host with workers-firecrawl - a Workers-native implementation (requires Workers Paid Plan, only implements /search endpoint).
This example uses the fetch API to call Firecrawl directly - works perfectly in Cloudflare Workers:
```typescript
interface Env {
  FIRECRAWL_API_KEY: string;
  SCRAPED_CACHE?: KVNamespace; // Optional: for caching results
}

interface FirecrawlScrapeResponse {
  success: boolean;
  data: {
    markdown?: string;
    html?: string;
    metadata: {
      title?: string;
      description?: string;
      language?: string;
      sourceURL: string;
    };
  };
}

export default {
  async fetch(request: Request, env: Env): Promise<Response> {
    if (request.method !== 'POST') {
      return Response.json({ error: 'Method not allowed' }, { status: 405 });
    }

    try {
      const { url } = await request.json<{ url: string }>();
      if (!url) {
        return Response.json({ error: 'URL is required' }, { status: 400 });
      }

      // Check cache (optional)
      if (env.SCRAPED_CACHE) {
        const cached = await env.SCRAPED_CACHE.get(url, 'json');
        if (cached) {
          return Response.json({ cached: true, data: cached });
        }
      }

      // Call Firecrawl API directly using fetch
      const response = await fetch('https://api.firecrawl.dev/v2/scrape', {
        method: 'POST',
        headers: {
          'Authorization': `Bearer ${env.FIRECRAWL_API_KEY}`,
          'Content-Type': 'application/json',
        },
        body: JSON.stringify({
          url: url,
          formats: ['markdown'],
          onlyMainContent: true,
          removeBase64Images: true
        })
      });

      if (!response.ok) {
        const errorText = await response.text();
        throw new Error(`Firecrawl API error (${response.status}): ${errorText}`);
      }

      const result = await response.json<FirecrawlScrapeResponse>();

      // Cache for 1 hour (optional)
      if (env.SCRAPED_CACHE && result.success) {
        await env.SCRAPED_CACHE.put(
          url,
          JSON.stringify(result.data),
          { expirationTtl: 3600 }
        );
      }

      return Response.json({
        cached: false,
        data: result.data
      });
    } catch (error) {
      console.error('Scraping error:', error);
      return Response.json(
        { error: error instanceof Error ? error.message : 'Unknown error' },
        { status: 500 }
      );
    }
  }
};
```
Environment Setup: Add FIRECRAWL_API_KEY in Wrangler secrets:
```bash
npx wrangler secret put FIRECRAWL_API_KEY
```
Optional KV Binding (for caching - add to wrangler.jsonc):
```jsonc
{
  "kv_namespaces": [
    {
      "binding": "SCRAPED_CACHE",
      "id": "your-kv-namespace-id"
    }
  ]
}
```
See templates/firecrawl-worker-fetch.ts for a complete production-ready example.
✅ Use Firecrawl when:
❌ Don't use Firecrawl when:
Cause: API key not set or incorrect
Fix:
```bash
# Check env variable is set
echo $FIRECRAWL_API_KEY

# Verify key format (should start with fc-)
```
Cause: Exceeded monthly credits
Fix:
- Use `onlyMainContent: true` to reduce credits

Cause: Page takes too long to load
Fix:
```python
result = app.scrape(url, wait_for=10000)  # Wait 10s
```
Cause: Content loaded via JavaScript after initial render Fix:
```python
result = app.scrape(
    url,
    wait_for=5000,
    actions=[{"type": "wait", "milliseconds": 3000}]
)
```
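For transient rate-limit errors (HTTP 429), a simple retry with exponential backoff usually resolves the issue. The string check below is an illustrative stand-in; inspect your SDK version's actual exception type.

```python
import time

def scrape_with_retry(client, url, retries=3):
    """Retry scrape on rate-limit errors with exponential backoff.

    The '429' substring check is an assumption for illustration;
    match on your SDK's real exception class in production.
    """
    for attempt in range(retries):
        try:
            return client.scrape(url)
        except Exception as exc:
            if "429" in str(exc) and attempt < retries - 1:
                time.sleep(2 ** attempt)  # back off: 1s, 2s, ...
            else:
                raise
```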
Perform interactions before scraping:
```python
result = app.scrape(
    url="https://example.com",
    actions=[
        {"type": "click", "selector": "button.load-more"},
        {"type": "wait", "milliseconds": 2000},
        {"type": "scroll", "direction": "down"}
    ]
)
```
```python
result = app.scrape(
    url="https://example.com",
    headers={
        "User-Agent": "Custom Bot 1.0",
        "Accept-Language": "en-US"
    }
)
```
Instead of polling, receive results via webhook:
```python
crawl = app.crawl(
    url="https://docs.example.com",
    limit=1000,
    webhook="https://your-domain.com/webhook"
)
```
| Package | Version | Last Checked |
|---|---|---|
| firecrawl-py | 4.12.0+ | 2026-01-09 |
| @mendable/firecrawl-js (or firecrawl) | 4.4.1+ | 2025-10-24 |
| API Version | v2 | Current |
Note: The Node.js SDK requires Node.js >=22.0.0 and cannot run in Cloudflare Workers. Use direct REST API calls in Workers (see Cloudflare Workers Integration section).
Token Savings: ~60% vs manual integration
Error Prevention: API authentication, rate limiting, format handling
Production Ready: ✅