rest-api-pipeline
Adds a new REST API endpoint or resource to an existing dlt pipeline. Use when extending data pulls from an API with a working pipeline.
Install with:

```shell
npx claudepluginhub dlt-hub/dlthub-ai-workbench --plugin rest-api-pipeline
```

This skill uses the workspace's default tool permissions.
Add a new resource to an existing dlt REST API pipeline source.
Parse $ARGUMENTS:

- endpoint-description (required): what data the user wants to add (e.g., "claude code analytics", "user profiles", "transactions")

Read the pipeline .py file to understand:

- the @dlt.source function and its parameters
- RESTAPIConfig: client setup (base_url, auth, paginator)
- how __main__ runs the pipeline (dev_mode, add_limit, with_resources)

Note what patterns the existing resources use:

- write_disposition (replace, merge, append)

If a docs.yaml scaffold exists, read it for endpoint details.
Web search the API documentation for the new endpoint.
Read dlt docs if you need to refresh on config options:
- https://dlthub.com/docs/dlt-ecosystem/verified-sources/rest_api/basic.md
- https://dlthub.com/docs/general-usage/resource.md

If the endpoint fits the existing client config (base_url, auth, paginator), add it to the "resources" list in RESTAPIConfig. Key decisions:

- client.paginator — override per-resource if different
- map/filter/yield_map if needed (e.g., Decimal for money — NEVER float)

Some endpoints can't be described in RESTAPIConfig:
Define a custom @dlt.resource inside the @dlt.source function. Use RESTClient (from dlt.sources.helpers.rest_client) for HTTP calls with built-in auth and pagination. Loop over dates (or other dimensions) in the resource, call client.paginate() for each, and yield the data. The source then yields both rest_api_resources(config) and the custom resource.
Read dlt docs on RESTClient: https://dlthub.com/docs/general-usage/http/rest-client.md
Update the source docstring to list the new resource and show with_resources() examples.
Use debug-pipeline after each run to inspect traces and load packages.
Use with_resources() to load only the new resource:
```python
source = my_source()
pipeline.run(source.with_resources("new_resource").add_limit(1))
```
Temporarily edit __main__ or run from a Python shell. This avoids re-loading all existing resources while iterating.
Run the pipeline:
```shell
uv run python <source>_pipeline.py
```
Iterate over the resource directly without loading to destination:
```python
for item in source.resources["my_resource"]:
    print(item)
```
Now use the debug-pipeline skill with the tricks above.
Check if the existing pipeline uses patterns that the new resource should also adopt:
- incremental config? Should the new one use it too?
- merge used with primary_key? Should the new resource also merge?
- columns hints for nullable fields?

Flag any gaps to the user — the new resource works now but may need these patterns for production use.
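For reference, a resources-list entry that combines these production patterns: incremental request params, merge with a primary_key, an explicit nullable columns hint, and a map step parsing money as Decimal. This is a hypothetical sketch — the resource name, path, parameter names, and fields are illustrative, not from a real API.

```python
from decimal import Decimal


def amount_to_decimal(item: dict) -> dict:
    # Parse money fields as Decimal, never float, to avoid rounding drift
    item["amount"] = Decimal(str(item["amount"]))
    return item


# Hypothetical entry for the "resources" list in RESTAPIConfig
events = {
    "name": "events",
    "primary_key": "id",
    "write_disposition": "merge",  # dedupe on primary_key across loads
    "columns": {"closed_at": {"nullable": True}},  # explicit nullable hint
    "endpoint": {
        "path": "events",
        "params": {
            # Incremental: only fetch rows updated since the last run
            "updated_since": {
                "type": "incremental",
                "cursor_path": "updated_at",
                "initial_value": "2024-01-01T00:00:00Z",
            },
        },
    },
    "processing_steps": [{"map": amount_to_decimal}],
}
```

Adopt whichever of these the existing resources already use, so the new resource behaves consistently.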
After adding, use validate-data to verify schema and data look correct.
Endpoint added: <resource_name>
- Path: <endpoint_path>
- Tables created: <list of tables including child tables>
Load with:

```python
source.with_resources("<resource_name>")  # just this endpoint
source  # all endpoints
```
Available resources: <list all resource names>