Skill

bright-data-best-practices

Documents production-ready Bright Data integrations: CLI setup, authentication, and best practices for Web Unlocker, SERP, Web Scraper, and Browser APIs.

automation

Popularity

Stars

201

Forks

Invocation

How this skill is triggered — by the user, by Claude, or both

Slash command

/brightdata-plugin:bright-data-best-practices

Not user invocable

Model invocable

Inline context

Default effort

Context Preview

The summary Claude sees in its skill listing — used to decide when to auto-load this skill

Install, authentication, and troubleshooting for the Bright Data CLI (`bdata`) are documented in a single canonical place:

Supporting Files

references/browser-api.mdreferences/cli-setup.mdreferences/serp-api.mdreferences/web-scraper-api.mdreferences/web-unlocker.md

SKILL.md

382 lines · ~3.6k tokens

Stats

LanguageShell

Stars201

Forks30

MaintenanceExcellent

Last CommitJun 11, 2026

Actions

View Source View Plugin View on GitHub View README

CLI Setup Reference

Install, authentication, and troubleshooting for the Bright Data CLI (bdata) are documented in a single canonical place:

→ references/cli-setup.md

Consult it before any task that shells out to bdata.

Bright Data APIs

Bright Data provides infrastructure for web data extraction at scale. Four primary APIs cover different use cases — always pick the most specific tool for the job.

Choosing the Right API

Use Case	API	Why
Scrape any webpage by URL (no interaction)	Web Unlocker	HTTP-based, auto-bypasses bot detection, cheapest
Google / Bing / Yandex search results	SERP API	Specialized for SERP extraction, returns structured data
Structured data from Amazon, LinkedIn, Instagram, TikTok, etc.	Web Scraper API	Pre-built scrapers, no parsing needed
Click, scroll, fill forms, run JS, intercept XHR	Browser API	Full browser automation
Puppeteer / Playwright / Selenium automation	Browser API	Connects via CDP/WebDriver
Route your own HTTP client through a raw proxy (DC/ISP/Residential/Mobile)	Proxy networks	When you need direct proxy access with your own request logic instead of a managed API — see the `proxy.md` skill

Authentication Pattern (All APIs)

All APIs share the same authentication model. The env vars below apply to direct REST API integrations — if you are using the bdata CLI, bdata login handles all of these automatically (see references/cli-setup.md).

export BRIGHTDATA_API_KEY="your-api-key"         # From Control Panel > Account Settings
export BRIGHTDATA_UNLOCKER_ZONE="zone-name"       # Web Unlocker zone name
export BRIGHTDATA_SERP_ZONE="serp-zone-name"      # SERP API zone name
export BROWSER_AUTH="brd-customer-ID-zone-NAME:PASSWORD"  # Browser API credentials

REST API authentication header for Web Unlocker and SERP API:

Authorization: Bearer YOUR_API_KEY

Web Unlocker API

HTTP-based scraping proxy. Best for simple page fetches without browser interaction.

Endpoint: POST https://api.brightdata.com/request

import requests

response = requests.post(
    "https://api.brightdata.com/request",
    headers={"Authorization": f"Bearer {API_KEY}"},
    json={
        "zone": "YOUR_ZONE_NAME",
        "url": "https://example.com/product/123",
        "format": "raw"
    }
)
html = response.text

Key Parameters

Parameter	Type	Description
`zone`	string	Zone name (required)
`url`	string	Target URL with `http://` or `https://` (required)
`format`	string	`"raw"` (HTML) or `"json"` (structured wrapper) (required)
`method`	string	HTTP verb, default `"GET"`
`country`	string	2-letter ISO for geo-targeting (e.g., `"us"`, `"de"`)
`data_format`	string	Transform: `"markdown"` or `"screenshot"`
`async`	boolean	`true` for async mode

Quick Patterns

# Get markdown (best for LLM input)
response = requests.post(
    "https://api.brightdata.com/request",
    headers={"Authorization": f"Bearer {API_KEY}"},
    json={"zone": ZONE, "url": url, "format": "raw", "data_format": "markdown"}
)

# Geo-targeted request
json={"zone": ZONE, "url": url, "format": "raw", "country": "de"}

# Screenshot for debugging
json={"zone": ZONE, "url": url, "format": "raw", "data_format": "screenshot"}

# Async for bulk processing
json={"zone": ZONE, "url": url, "format": "raw", "async": True}

Critical rule: Never use Web Unlocker with Puppeteer, Playwright, Selenium, or anti-detect browsers. Use Browser API instead.

See references/web-unlocker.md for complete reference including proxy interface, special headers, async flow, features, and billing.

SERP API

Structured search engine result extraction for Google, Bing, Yandex, DuckDuckGo.

Endpoint: POST https://api.brightdata.com/request (same as Web Unlocker)

response = requests.post(
    "https://api.brightdata.com/request",
    headers={"Authorization": f"Bearer {API_KEY}"},
    json={
        "zone": "YOUR_SERP_ZONE",
        "url": "https://www.google.com/search?q=python+web+scraping&brd_json=1&gl=us&hl=en",
        "format": "raw"
    }
)
data = response.json()
for result in data.get("organic", []):
    print(result["rank"], result["title"], result["link"])

Essential Google URL Parameters

Parameter	Description	Example
`q`	Search query	`q=python+web+scraping`
`brd_json`	Parsed JSON output	`brd_json=1` (always use for data pipelines)
`gl`	Country for search	`gl=us`
`hl`	Language	`hl=en`
`start`	Pagination offset	`start=10` (page 2), `start=20` (page 3)
`tbm`	Search type	`tbm=nws` (news), `tbm=isch` (images), `tbm=vid` (videos)
`brd_mobile`	Device	`brd_mobile=1` (mobile), `brd_mobile=ios`
`brd_browser`	Browser	`brd_browser=chrome`
`brd_ai_overview`	Trigger AI Overview	`brd_ai_overview=2`
`uule`	Encoded geo location	for precise location targeting

Note: num parameter is deprecated as of September 2025. Use start for pagination.

Parsed JSON Response Structure

{
  "organic": [{"rank": 1, "global_rank": 1, "title": "...", "link": "...", "description": "..."}],
  "paid": [],
  "people_also_ask": [],
  "knowledge_graph": {},
  "related_searches": [],
  "general": {"results_cnt": 1240000000, "query": "..."}
}

Bing Key Parameters

Parameter	Description
`q`	Search query
`setLang`	Language (prefer 4-letter: `en-US`)
`cc`	Country code
`first`	Pagination (increment by 10: 1, 11, 21...)
`safesearch`	`off`, `moderate`, `strict`
`brd_mobile`	Device type

Async for Bulk SERP

# Submit
response = requests.post(
    "https://api.brightdata.com/request",
    params={"async": "1"},
    headers={"Authorization": f"Bearer {API_KEY}"},
    json={"zone": SERP_ZONE, "url": "https://www.google.com/search?q=test&brd_json=1", "format": "raw"}
)
response_id = response.headers.get("x-response-id")

# Retrieve (retrieve calls are NOT billed)
result = requests.get(
    "https://api.brightdata.com/serp/get_result",
    params={"response_id": response_id},
    headers={"Authorization": f"Bearer {API_KEY}"}
)

Billing: Pay per 1,000 successful requests only. Async retrieve calls are not billed.

See references/serp-api.md for complete reference including Maps, Trends, Reviews, Lens, Hotels, Flights parameters.

Web Scraper API

Pre-built scrapers for structured data extraction from 100+ platforms. No parsing logic needed.

Sync Endpoint: POST https://api.brightdata.com/datasets/v3/scrape Async Endpoint: POST https://api.brightdata.com/datasets/v3/trigger

# Sync (up to 20 URLs, returns immediately)
response = requests.post(
    "https://api.brightdata.com/datasets/v3/scrape",
    params={"dataset_id": "YOUR_DATASET_ID", "format": "json"},
    headers={"Authorization": f"Bearer {API_KEY}"},
    json={"input": [{"url": "https://www.amazon.com/dp/B09X7M8TBQ"}]}
)

if response.status_code == 200:
    data = response.json()  # Results ready
elif response.status_code == 202:
    snapshot_id = response.json()["snapshot_id"]  # Poll for completion

Parameters

Parameter	Type	Description
`dataset_id`	string	Scraper identifier from the Scraper Library (required)
`format`	string	`json` (default), `ndjson`, `jsonl`, `csv`
`custom_output_fields`	string	Pipe-separated fields: `url\|title\|price`
`include_errors`	boolean	Include error info in results

Request Body

{
  "input": [
    { "url": "https://www.amazon.com/dp/B09X7M8TBQ" },
    { "url": "https://www.amazon.com/dp/B0B7CTCPKN" }
  ]
}

Poll for Async Results

import time

# Trigger
snapshot_id = requests.post(
    "https://api.brightdata.com/datasets/v3/trigger",
    params={"dataset_id": DATASET_ID, "format": "json"},
    headers={"Authorization": f"Bearer {API_KEY}"},
    json={"input": [{"url": u} for u in urls]}
).json()["snapshot_id"]

# Poll
while True:
    status = requests.get(
        f"https://api.brightdata.com/datasets/v3/progress/{snapshot_id}",
        headers={"Authorization": f"Bearer {API_KEY}"}
    ).json()["status"]

    if status == "ready": break
    if status == "failed": raise Exception("Job failed")
    time.sleep(10)

# Download
data = requests.get(
    f"https://api.brightdata.com/datasets/v3/snapshot/{snapshot_id}",
    params={"format": "json"},
    headers={"Authorization": f"Bearer {API_KEY}"}
).json()

Progress status values: starting → running → ready | failed Data retention: 30 days. Billing: Per delivered record. Invalid input URLs that fail are still billable.

See references/web-scraper-api.md for complete reference including scraper types, output formats, delivery options, and billing details.

Browser API (Scraping Browser)

Full browser automation via CDP/WebDriver. Handles CAPTCHA, fingerprinting, and anti-bot detection automatically.

Connection:

Playwright/Puppeteer: wss://${AUTH}@brd.superproxy.io:9222
Selenium: https://${AUTH}@brd.superproxy.io:9515

const { chromium } = require("playwright-core");

const AUTH = process.env.BROWSER_AUTH;
const browser = await chromium.connectOverCDP(`wss://${AUTH}@brd.superproxy.io:9222`);
const page = await browser.newPage();
page.setDefaultNavigationTimeout(120000); // Always set to 2 minutes

await page.goto("https://example.com", { waitUntil: "domcontentloaded" });
const html = await page.content();
await browser.close();

from playwright.async_api import async_playwright

async with async_playwright() as p:
    browser = await p.chromium.connect_over_cdp(f"wss://{AUTH}@brd.superproxy.io:9222")
    page = await browser.new_page()
    page.set_default_navigation_timeout(120000)
    await page.goto("https://example.com", wait_until="domcontentloaded")
    html = await page.content()
    await browser.close()

Custom CDP Functions

Function	Purpose
`Captcha.solve`	Manually trigger CAPTCHA solving
`Captcha.setAutoSolve`	Enable/disable auto CAPTCHA solving
`Proxy.setLocation`	Set precise geo location (call BEFORE goto)
`Proxy.useSession`	Maintain same IP across sessions
`Emulation.setDevice`	Apply device profile (iPhone 14, etc.)
`Emulation.getSupportedDevices`	List available device profiles
`Unblocker.enableAdBlock`	Block ads to save bandwidth
`Unblocker.disableAdBlock`	Re-enable ads
`Input.type`	Fast text input for bulk form filling
`Browser.addCertificate`	Install client SSL cert for session
`Page.inspect`	Get DevTools debug URL for live session

// CDP session pattern for custom functions
const client = await page.target().createCDPSession();

// CAPTCHA solve with timeout
const result = await client.send("Captcha.solve", { timeout: 30000 });

// Precise geo location (must be before goto)
await client.send("Proxy.setLocation", {
  latitude: 37.7749,
  longitude: -122.4194,
  distance: 10,
  strict: true
});

// Block unnecessary resources
await client.send("Network.setBlockedURLs", { urls: ["*google-analytics*", "*.ads.*"] });

// Device emulation
await client.send("Emulation.setDevice", { deviceName: "iPhone 14" });

Session Rules

One initial navigation per session — new URL = new session
Idle timeout: 5 minutes
Max duration: 30 minutes

Geolocation

Country-level: append -country-us to credentials username
EU-wide: append -country-eu (routes through 29+ European countries)
Precise: use Proxy.setLocation CDP command (before navigation)

Error Codes

Code	Issue	Fix
`407`	Wrong port	Playwright/Puppeteer → `9222`, Selenium → `9515`
`403`	Bad auth	Check credentials format and zone type
`503`	Service scaling	Wait 1 minute, reconnect

Billing: Traffic-based only. Block images/CSS/fonts to reduce costs.

See references/browser-api.md for complete reference including all CDP functions, bandwidth optimization, CAPTCHA patterns, and debugging.

Detailed References

references/web-unlocker.md — Web Unlocker: full parameter list, proxy interface, special headers, async flow, features, billing, anti-patterns
references/serp-api.md — SERP API: all Google params (Maps, Trends, Reviews, Lens, Hotels, Flights), Bing params, parsed JSON structure, async, billing
references/web-scraper-api.md — Web Scraper API: sync vs async, all parameters, polling, scraper types, output formats, billing
references/browser-api.md — Browser API: connection strings, session rules, all CDP functions, geo-targeting, bandwidth optimization, CAPTCHA, debugging, error codes

Related Skills

brightdata-proxy — For routing requests through Bright Data's raw proxy networks (Datacenter, ISP, Residential, Mobile) with your own HTTP client instead of a managed API. Covers network/IP-pool selection, the brd-customer-... username format, targeting & sticky-session params, SSL CA setup for Residential/Mobile, and integrations for cURL, Python (requests/httpx/aiohttp/Scrapy), Node (fetch/axios), Playwright, Puppeteer, and Selenium. Hand off to it whenever the task is raw proxy access rather than Web Unlocker / SERP / Web Scraper / Browser API. Escalation order when proxies hit consistent blocks: raw proxy → Web Unlocker → Browser API → Web Scraper API.

bright-data-best-practices

Popularity

Invocation

Context Preview

Supporting Files

SKILL.md

bright-data-best-practices

Popularity

Invocation

Context Preview

Supporting Files

SKILL.md

CLI Setup Reference

Bright Data APIs

Choosing the Right API

Authentication Pattern (All APIs)

Web Unlocker API

Key Parameters

Quick Patterns

SERP API

Essential Google URL Parameters

Parsed JSON Response Structure

Bing Key Parameters

Async for Bulk SERP

Web Scraper API

Parameters

Request Body

Poll for Async Results

Browser API (Scraping Browser)

Custom CDP Functions

Session Rules

Geolocation

Error Codes

Detailed References

Related Skills

Similar Skills

CLI Setup Reference

Bright Data APIs

Choosing the Right API

Authentication Pattern (All APIs)

Web Unlocker API

Key Parameters

Quick Patterns

SERP API

Essential Google URL Parameters

Parsed JSON Response Structure

Bing Key Parameters

Async for Bulk SERP

Web Scraper API

Parameters

Request Body

Poll for Async Results

Browser API (Scraping Browser)

Custom CDP Functions

Session Rules

Geolocation

Error Codes

Detailed References

Related Skills

Similar Skills