Web scraping and automation platform with pre-built Actors for common tasks
/plugin marketplace add vm0-ai/api0
/plugin install api0@api0

This skill inherits all available tools. When active, it can use any tool Claude has access to.
Web scraping and automation platform. Run pre-built Actors (scrapers) or create your own. Access thousands of ready-to-use scrapers for popular websites.
Official docs: https://docs.apify.com/api/v2
Use this skill when you need to:
- Run pre-built Actors (scrapers) for popular websites
- Start, monitor, or abort Actor runs via the Apify API
- Fetch scraped results from datasets
Set environment variable:
export APIFY_API_TOKEN="apify_api_xxxxxxxxxxxxxxxxxxxxxxxx"
Important: When using `$VAR` in a command that pipes to another command, wrap the command containing `$VAR` in `bash -c '...'`. Due to a Claude Code bug, environment variables are silently cleared when pipes are used directly.

bash -c 'curl -s "https://api.example.com" -H "Authorization: Bearer $API_KEY"' | jq .
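The wrapping pattern can be sanity-checked with a harmless placeholder variable (`DEMO_TOKEN` here is illustrative, not a real credential):

```shell
# DEMO_TOKEN stands in for a real API token; export it so the bash -c child sees it
export DEMO_TOKEN="abc123"
# The variable is expanded inside bash -c, then the output is safely piped onward
bash -c 'printf "Bearer %s\n" "$DEMO_TOKEN"' | tr 'a-z' 'A-Z'
```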
Start an Actor run asynchronously:
bash -c 'curl -s -X POST "https://api.apify.com/v2/acts/apify~web-scraper/runs" --header "Authorization: Bearer ${APIFY_API_TOKEN}" --header "Content-Type: application/json" -d '"'"'{
"startUrls": [{"url": "https://example.com"}],
"maxPagesPerCrawl": 10,
"pageFunction": "async function pageFunction(context) { const { request, log, jQuery } = context; const $ = jQuery; const title = $(\"title\").text(); return { url: request.url, title }; }"
}'"'"' | jq .'
The response contains `id` (run ID) and `defaultDatasetId` for fetching results.
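Both IDs can be pulled out with jq. The response below is a hand-made sample with the same shape as a real run-creation response (values are illustrative):

```shell
# Illustrative sample only; a real response comes from the POST .../runs call above
RESPONSE='{"data":{"id":"HG7ML7M8z78YcAPEB","defaultDatasetId":"WkzbQMuFYuamGv3YF"}}'
RUN_ID=$(echo "$RESPONSE" | jq -r '.data.id')                    # run ID, for status polling
DATASET_ID=$(echo "$RESPONSE" | jq -r '.data.defaultDatasetId')  # dataset ID, for results
echo "$RUN_ID $DATASET_ID"
```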
Wait for completion and get results directly (max 5 min):
bash -c 'curl -s -X POST "https://api.apify.com/v2/acts/apify~web-scraper/run-sync-get-dataset-items" --header "Authorization: Bearer ${APIFY_API_TOKEN}" --header "Content-Type: application/json" -d '"'"'{
"startUrls": [{"url": "https://news.ycombinator.com"}],
"maxPagesPerCrawl": 1,
"pageFunction": "async function pageFunction(context) { const { request, log, jQuery } = context; const $ = jQuery; const title = $(\"title\").text(); return { url: request.url, title }; }"
}'"'"' | jq .'
⚠️ Important: The `{runId}` below is a placeholder; replace it with the actual run ID from your async run response (found in `.data.id`). See the complete workflow example below.
Poll the run status:
# Replace {runId} with actual ID like "HG7ML7M8z78YcAPEB"
bash -c 'curl -s "https://api.apify.com/v2/actor-runs/{runId}" --header "Authorization: Bearer ${APIFY_API_TOKEN}"' | jq '.data.status'
Complete workflow example (capture run ID and check status):
# Step 1: Start an async run and capture the run ID
RUN_ID=$(bash -c 'curl -s -X POST "https://api.apify.com/v2/acts/apify~web-scraper/runs" --header "Authorization: Bearer ${APIFY_API_TOKEN}" --header "Content-Type: application/json" -d '"'"'{
"startUrls": [{"url": "https://example.com"}],
"maxPagesPerCrawl": 10
}'"'"'' | jq -r '.data.id')
# Step 2: Check the run status
bash -c "curl -s \"https://api.apify.com/v2/actor-runs/${RUN_ID}\" --header \"Authorization: Bearer \${APIFY_API_TOKEN}\"" | jq '.data.status'
Statuses: `READY`, `RUNNING`, `SUCCEEDED`, `FAILED`, `ABORTED`, `TIMED-OUT`
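A small helper (hypothetical, not part of the Apify API) can classify a status as terminal or not, which keeps polling loops readable:

```shell
# Returns 0 (success) if the run has reached a terminal state, 1 otherwise
is_terminal() {
  case "$1" in
    SUCCEEDED|FAILED|ABORTED|TIMED-OUT) return 0 ;;
    *) return 1 ;;
  esac
}

is_terminal "RUNNING" || echo "still running"
```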
⚠️ Important: The `{datasetId}` below is a placeholder; do not use it literally! You must replace it with the actual dataset ID from your run response (found in `.data.defaultDatasetId`). See the complete workflow example below for how to capture and use the real ID.
Fetch results from a completed run:
# Replace {datasetId} with actual ID like "WkzbQMuFYuamGv3YF"
bash -c 'curl -s "https://api.apify.com/v2/datasets/{datasetId}/items" --header "Authorization: Bearer ${APIFY_API_TOKEN}"' | jq .
Complete workflow example (run async, wait, and fetch results):
# Step 1: Start async run and capture IDs
RESPONSE=$(bash -c 'curl -s -X POST "https://api.apify.com/v2/acts/apify~web-scraper/runs" --header "Authorization: Bearer ${APIFY_API_TOKEN}" --header "Content-Type: application/json" -d '"'"'{
"startUrls": [{"url": "https://example.com"}],
"maxPagesPerCrawl": 10
}'"'"'')
RUN_ID=$(echo "$RESPONSE" | jq -r '.data.id')
DATASET_ID=$(echo "$RESPONSE" | jq -r '.data.defaultDatasetId')
# Step 2: Wait for completion (poll status)
while true; do
STATUS=$(bash -c "curl -s \"https://api.apify.com/v2/actor-runs/${RUN_ID}\" --header \"Authorization: Bearer \${APIFY_API_TOKEN}\"" | jq -r '.data.status')
echo "Status: $STATUS"
[[ "$STATUS" == "SUCCEEDED" ]] && break
[[ "$STATUS" == "FAILED" || "$STATUS" == "ABORTED" ]] && exit 1
sleep 5
done
# Step 3: Fetch the dataset items
bash -c "curl -s \"https://api.apify.com/v2/datasets/${DATASET_ID}/items\" --header \"Authorization: Bearer \${APIFY_API_TOKEN}\"" | jq .
With pagination:
# Replace {datasetId} with actual ID
bash -c 'curl -s "https://api.apify.com/v2/datasets/{datasetId}/items?limit=100&offset=0" --header "Authorization: Bearer ${APIFY_API_TOKEN}"' | jq .
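To walk a large dataset page by page, the limit/offset pattern can be wrapped in a loop. This is a sketch with hypothetical names; the page-fetching command is a parameter so the loop does not hard-code the curl call (against the real API it would wrap the paginated request above):

```shell
# Emit every page of a dataset until an empty page is returned.
# $1: dataset ID; $2: command taking (dataset_id, limit, offset) and printing a JSON array
fetch_all_items() {
  local dataset_id="$1" fetch_page="$2" limit=100 offset=0 page
  while :; do
    page=$("$fetch_page" "$dataset_id" "$limit" "$offset")
    [ "$page" = "[]" ] && break   # an empty page means no more items
    echo "$page"
    offset=$((offset + limit))
  done
}
```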
Search Google (synchronous, results returned directly):
bash -c 'curl -s -X POST "https://api.apify.com/v2/acts/apify~google-search-scraper/run-sync-get-dataset-items?timeout=120" --header "Authorization: Bearer ${APIFY_API_TOKEN}" --header "Content-Type: application/json" -d '"'"'{
"queries": "web scraping tools",
"maxPagesPerQuery": 1,
"resultsPerPage": 10
}'"'"' | jq .'
Crawl website content (e.g., documentation sites):
bash -c 'curl -s -X POST "https://api.apify.com/v2/acts/apify~website-content-crawler/run-sync-get-dataset-items?timeout=300" --header "Authorization: Bearer ${APIFY_API_TOKEN}" --header "Content-Type: application/json" -d '"'"'{
"startUrls": [{"url": "https://docs.example.com"}],
"maxCrawlPages": 10,
"crawlerType": "cheerio"
}'"'"' | jq .'
Scrape Instagram posts (asynchronous):
bash -c 'curl -s -X POST "https://api.apify.com/v2/acts/apify~instagram-scraper/runs" --header "Authorization: Bearer ${APIFY_API_TOKEN}" --header "Content-Type: application/json" -d '"'"'{
"directUrls": ["https://www.instagram.com/apaborotnikov/"],
"resultsType": "posts",
"resultsLimit": 10
}'"'"' | jq .'
Scrape Amazon product data (asynchronous):
bash -c 'curl -s -X POST "https://api.apify.com/v2/acts/junglee~amazon-crawler/runs" --header "Authorization: Bearer ${APIFY_API_TOKEN}" --header "Content-Type: application/json" -d '"'"'{
"categoryOrProductUrls": [{"url": "https://www.amazon.com/dp/B0BSHF7WHW"}],
"maxItemsPerStartUrl": 1
}'"'"' | jq .'
Get recent Actor runs:
bash -c 'curl -s "https://api.apify.com/v2/actor-runs?limit=10&desc=true" --header "Authorization: Bearer ${APIFY_API_TOKEN}"' | jq '.data.items[] | {id, actId, status, startedAt}'
⚠️ Important: The `{runId}` below is a placeholder; replace it with the actual run ID. See the complete workflow example below.
Stop a running Actor:
# Replace {runId} with actual ID like "HG7ML7M8z78YcAPEB"
bash -c 'curl -s -X POST "https://api.apify.com/v2/actor-runs/{runId}/abort" --header "Authorization: Bearer ${APIFY_API_TOKEN}"' | jq .
Complete workflow example (start a run and abort it):
# Step 1: Start an async run and capture the run ID
RUN_ID=$(bash -c 'curl -s -X POST "https://api.apify.com/v2/acts/apify~web-scraper/runs" --header "Authorization: Bearer ${APIFY_API_TOKEN}" --header "Content-Type: application/json" -d '"'"'{
"startUrls": [{"url": "https://example.com"}],
"maxPagesPerCrawl": 100
}'"'"'' | jq -r '.data.id')
echo "Started run: $RUN_ID"
# Step 2: Abort the run
bash -c "curl -s -X POST \"https://api.apify.com/v2/actor-runs/${RUN_ID}/abort\" --header \"Authorization: Bearer \${APIFY_API_TOKEN}\"" | jq .
Browse public Actors:
bash -c 'curl -s "https://api.apify.com/v2/store?limit=20&category=ECOMMERCE" --header "Authorization: Bearer ${APIFY_API_TOKEN}"' | jq '.data.items[] | {name, username, title}'
| Actor ID | Description |
|---|---|
| apify/web-scraper | General web scraper |
| apify/website-content-crawler | Crawl entire websites |
| apify/google-search-scraper | Google search results |
| apify/instagram-scraper | Instagram posts/profiles |
| junglee/amazon-crawler | Amazon products |
| apify/twitter-scraper | Twitter/X posts |
| apify/youtube-scraper | YouTube videos |
| apify/linkedin-scraper | LinkedIn profiles |
| lukaskrivka/google-maps | Google Maps places |
Find more at: https://apify.com/store
| Parameter | Type | Description |
|---|---|---|
| timeout | number | Run timeout in seconds |
| memory | number | Memory in MB (128, 256, 512, 1024, 2048, 4096) |
| maxItems | number | Max items to return (for sync endpoints) |
| build | string | Actor build tag (default: "latest") |
| waitForFinish | number | Wait time in seconds (for async runs) |
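These options are passed as query parameters on the run endpoints. A minimal sketch of composing such a URL (the specific values are illustrative):

```shell
# Run the Actor with a 300 s timeout, 1 GB of memory, and wait up to 60 s inline
BASE="https://api.apify.com/v2/acts/apify~web-scraper/runs"
OPTS="timeout=300&memory=1024&waitForFinish=60"
echo "${BASE}?${OPTS}"
```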
Run object:
{
"data": {
"id": "HG7ML7M8z78YcAPEB",
"actId": "HDSasDasz78YcAPEB",
"status": "SUCCEEDED",
"startedAt": "2024-01-01T00:00:00.000Z",
"finishedAt": "2024-01-01T00:01:00.000Z",
"defaultDatasetId": "WkzbQMuFYuamGv3YF",
"defaultKeyValueStoreId": "tbhFDFDh78YcAPEB"
}
}
- Use `run-sync-get-dataset-items` for quick tasks (<5 min), async runs for longer jobs
- Use `limit` and `offset` for large datasets