From antigravity-awesome-skills
Extracts public web data and search engine results via HasData APIs—SERP scraping, AI mode, Maps leads, and structured endpoints for ecommerce, travel, and local business data.
npx claudepluginhub sickn33/antigravity-awesome-skills --plugin antigravity-bundle-aas-mobile-app-builder
How this skill is triggered (by the user, by Claude, or both):
Slash command
/antigravity-awesome-skills:hasdata
The summary Claude sees in its skill listing (used to decide when to auto-load this skill):
Cloud platform for extracting public web data. One API key, three execution modes. All endpoints sit under `https://api.hasdata.com` and authenticate with `x-api-key`.
CLI tool for searching, scraping, and retrieving structured data from websites (Google, Amazon, YouTube, Zillow, travel, jobs, etc.). Useful for ad-hoc data collection or script automation.
Extracts web data from platforms (Amazon, LinkedIn, Instagram, etc.) and generic sites using the Bright Data Python SDK. Covers scraping, search, datasets, browser automation.
Scrapes data from 15+ platforms (Instagram, TikTok, LinkedIn, Google Maps, etc.) via Apify CLI Actors. Use for lead generation, competitor analysis, brand monitoring, influencer discovery, and SEO intelligence.
Cloud platform for extracting public web data. One API key, three execution modes. All endpoints sit under https://api.hasdata.com and authenticate with x-api-key.
```bash
curl -G 'https://api.hasdata.com/scrape/google/serp' \
  --data-urlencode 'q=coffee' \
  -H 'x-api-key: <your-api-key>'
```
HTTP error codes: `401` invalid key, `403` quota exhausted, `429` concurrency cap reached, `500` server error (retry).
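The curl call and the status codes above can be sketched in Python as follows. This is a minimal illustration that only builds the request and does not send it; `build_serp_request` and `describe_status` are hypothetical helper names, and the hint strings are paraphrases of the list above rather than messages returned by the API.

```python
import os
import urllib.parse
from typing import Dict, Tuple

BASE_URL = "https://api.hasdata.com"


def build_serp_request(query: str, api_key: str) -> Tuple[str, Dict[str, str]]:
    """Return the URL and headers equivalent to the curl call (hypothetical helper)."""
    query_string = urllib.parse.urlencode({"q": query})
    url = f"{BASE_URL}/scrape/google/serp?{query_string}"
    return url, {"x-api-key": api_key}


def describe_status(status: int) -> str:
    """Map the documented HTTP status codes to a short action hint."""
    hints = {
        401: "invalid key: check HASDATA_API_KEY",
        403: "quota exhausted: do not retry",
        429: "concurrency cap: back off and retry",
        500: "server error: retry",
    }
    if status == 200:
        return "ok"
    return hints.get(status, "unexpected status")


# Read the key from the environment instead of hardcoding it.
url, headers = build_serp_request(
    "coffee", os.environ.get("HASDATA_API_KEY", "<your-api-key>")
)
```

The returned `url` and `headers` can be passed to any HTTP client; keeping request construction separate from sending makes it easy to check without network access.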
Use this skill when:
| Mode | Latency | When | Endpoint |
|---|---|---|---|
| Web Scraping API | seconds | Arbitrary URL: JS rendering, CSS/AI extraction, screenshots | `POST /scrape/web` |
| Scraper APIs (sync) | seconds | Pre-parsed JSON for known platforms (Google, Amazon, Zillow, …) | `GET /scrape/<vertical>/<resource>` |
| Scraper Jobs (async) | minutes–hours | Bulk extraction, recursive crawling, webhook fan-out | `POST /scrapers/<slug>/jobs` |
Decision rule. Default to a Scraper API when one exists for the platform (pre-parsed JSON, no selector maintenance). Use Web Scraping for arbitrary URLs not covered by an API. Reach for a Scraper Job only when no API equivalent exists — crawler, contacts, sec-edgar, amazon-bestsellers, amazon-product-reviews — or when async fan-out + webhooks save engineering time over a paginated client loop.
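The decision rule can be written down as a small function. This is only a sketch of the prose above: `choose_mode` and its parameters are hypothetical, and whether a Scraper API exists for a platform is something the caller has to look up in the reference files.

```python
# Slugs the decision rule lists as having no synchronous API equivalent.
JOB_ONLY_SLUGS = {
    "crawler",
    "contacts",
    "sec-edgar",
    "amazon-bestsellers",
    "amazon-product-reviews",
}


def choose_mode(target: str, has_scraper_api: bool, wants_async_fanout: bool = False) -> str:
    """Pick an execution mode following the decision rule (hypothetical helper)."""
    if target in JOB_ONLY_SLUGS or wants_async_fanout:
        return "POST /scrapers/<slug>/jobs"
    if has_scraper_api:
        # Default choice: pre-parsed JSON, no selector maintenance.
        return "GET /scrape/<vertical>/<resource>"
    # Arbitrary URL not covered by a Scraper API.
    return "POST /scrape/web"
```

For example, `choose_mode("google", has_scraper_api=True)` selects the synchronous Scraper API, while `choose_mode("crawler", has_scraper_api=False)` selects a Scraper Job.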
```json
{ "requestMetadata": { "id": "…", "status": "ok", "url": "…" }, "...": "endpoint-specific" }
```
Treat data as valid only if requestMetadata.status === "ok". HTTP 200 alone isn't enough.
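A minimal sketch of that check, assuming the response body has already been read as a JSON string. `HasDataError` and `unwrap_response` are hypothetical names, not part of any HasData client.

```python
import json


class HasDataError(Exception):
    """Raised when a response is not usable (hypothetical exception type)."""


def unwrap_response(http_status: int, raw_body: str) -> dict:
    """Return the parsed body only if requestMetadata.status is "ok"."""
    body = json.loads(raw_body)
    meta = body.get("requestMetadata") or {}
    # HTTP 200 alone is not enough: the envelope status must also be "ok".
    if http_status != 200 or meta.get("status") != "ok":
        raise HasDataError(
            f"request {meta.get('id')!r} failed: "
            f"http={http_status}, status={meta.get('status')!r}"
        )
    return body
```

Raising on a non-`ok` envelope keeps downstream code from silently processing an empty or partial payload.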
- `/scrape/google/ai-mode` for the answer + references → `/scrape/web` (markdown) on each reference URL → cited RAG context, no vector DB.
- `/scrape/google-maps/search` returns business websites and phones; collect contact details only from public, permitted sources and apply opt-out, rate, and privacy-law constraints before any outreach use.
- A `crawler` Scraper Job with `outputFormat: ["markdown"]` + `includePaths: "/docs/.+"` produces an LLM-ready corpus in one submission.
- `knowledgeGraph`, `localResults`, `inlineShoppingResults`, `relatedQuestions` carry pre-parsed public facts. Always check them before considering direct page access.

- Send the `x-api-key` header on every request. Read the key from the `HASDATA_API_KEY` env variable. Never hardcode it, never log it.
- Retry `429` and `5xx` only, with exponential backoff and jitter. Never retry other `4xx` (auth, validation).
- … `429`s.
- The job ID is `body.id` (integer), not `jobId`. Persist it immediately. Poll `GET /scrapers/jobs/<id>` every 10–30 s with backoff; treat webhooks as best-effort and always pair them with polling. Once the job is `finished`, the status response carries `data: {csv, json, xlsx}` with short-lived URLs; download them immediately.

See `references/code-recipes.md` for ready-to-paste Python and TypeScript clients with retry, backoff, bounded concurrency, and the full job lifecycle.
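The retry and polling rules can be sketched as below. This is not the client from `references/code-recipes.md`; it is an illustration under stated assumptions: `send` and `fetch_status` are caller-supplied callables that perform the actual HTTP requests, the job status is assumed to live in a `status` field of the polling response, and the delay constants are illustrative defaults.

```python
import random
import time
from typing import Callable, Dict, Tuple


def is_retryable(status: int) -> bool:
    """Only 429 and 5xx are retried; other 4xx errors are returned as-is."""
    return status == 429 or 500 <= status < 600


def backoff_delay(attempt: int, base: float = 1.0, cap: float = 30.0) -> float:
    """Exponential backoff with full jitter: uniform in [0, min(cap, base * 2**attempt)]."""
    return random.uniform(0.0, min(cap, base * 2 ** attempt))


def call_with_retry(
    send: Callable[[], Tuple[int, Dict]],
    max_attempts: int = 5,
    sleep: Callable[[float], None] = time.sleep,
) -> Tuple[int, Dict]:
    """Call send() up to max_attempts times, sleeping between retryable failures."""
    status, body = send()
    for attempt in range(max_attempts - 1):
        if not is_retryable(status):
            break
        sleep(backoff_delay(attempt))
        status, body = send()
    return status, body


def poll_job(
    fetch_status: Callable[[], Dict],
    first_delay: float = 10.0,
    max_delay: float = 30.0,
    max_polls: int = 60,
    sleep: Callable[[float], None] = time.sleep,
) -> Dict:
    """Poll until the job reports "finished", then return its data URLs.

    Assumes the polling response looks like {"status": ..., "data": {...}}.
    """
    delay = first_delay
    for _ in range(max_polls):
        job = fetch_status()
        if job.get("status") == "finished":
            return job["data"]  # short-lived csv/json/xlsx URLs: download right away
        sleep(delay)
        delay = min(max_delay, delay * 1.5)  # grow from 10 s toward the 30 s ceiling
    raise TimeoutError("job did not finish within the polling budget")
```

Passing `sleep` as a parameter keeps both helpers testable without real waiting, and a webhook handler can call the same download step that follows `poll_job`, so the two paths stay interchangeable.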
- Start with `jsRendering` off and enable it only if the page needs it; most static pages parse fine without a headless browser.
- There is no `cookies` parameter; cookies go through `headers["Cookie"]`.
- The `includePaths` regex is case-sensitive. `/blog/.+` won't match `/Blog/...`.
- Job result `data` is double-wrapped. Each row is `body.data[i].data`; the outer object wraps it with `id`, `jobId`, `dataId`, `createdAt`, `updatedAt`.
- `requestMetadata.status === "ok"` is the only success signal. HTTP 200 alone isn't enough.

- `references/web-scraping.md`: `POST /scrape/web` parameters, JS scenarios, AI extraction, cookie auth.
- `references/search.md`: Google SERP / Light / AI Mode / News / Shopping / Bing / Trends + pagination.
- `references/ecommerce.md`: Amazon (product, search, seller, seller-products) and Shopify.
- `references/real-estate.md`: Zillow, Redfin (bracketed filters).
- `references/travel.md`: Airbnb, Booking, Google Flights (occupancy rules, token pagination, IATA codes).
- `references/local-business.md`: Maps (search/place/reviews/photos/posts), Yelp, YellowPages.
- `references/jobs.md`: Indeed and Glassdoor.
- `references/youtube.md`: YouTube search / video / channel / transcript.
- `references/scraper-jobs.md`: async submit/poll/results, Crawler, Contacts, SEC EDGAR, webhook receiver.
- `references/code-recipes.md`: Python / TypeScript clients with retry, backoff, concurrency, polling.
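The notes on cookies, `includePaths`, and double-wrapped job data can be illustrated with a few small helpers. All function names here are hypothetical, the `url` field name in the payload is an assumption (only `jsRendering` and `headers["Cookie"]` come from the notes), and Python's `re` module stands in for whatever regex engine the service uses, so only the case-sensitivity point carries over.

```python
import re
from typing import Any, Dict, List


def build_web_scrape_payload(url: str, cookies: Dict[str, str]) -> Dict[str, Any]:
    """Build a POST /scrape/web body with cookies sent through the Cookie header."""
    cookie_header = "; ".join(f"{name}={value}" for name, value in cookies.items())
    return {
        "url": url,            # assumed field name for the target page
        "jsRendering": False,  # start without a headless browser
        "headers": {"Cookie": cookie_header},
    }


def unwrap_job_rows(body: Dict[str, Any]) -> List[Any]:
    """Extract rows from double-wrapped job results: body.data[i].data."""
    return [item["data"] for item in body.get("data", [])]


def path_included(pattern: str, path: str) -> bool:
    """Case-sensitive match, mirroring the includePaths behaviour described above."""
    return re.search(pattern, path) is not None
```

With these helpers, `path_included("/blog/.+", "/Blog/post")` is false while `path_included("/blog/.+", "/blog/post")` is true, which is the mismatch the case-sensitivity note warns about.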