dltHub AI Workbench
dlt (data load tool) is an open-source Python library for loading data from APIs and databases into a warehouse or lakehouse. dltHub (paid platform) extends dlt with enterprise-grade features tailored to the needs of coding agents: transformations, data quality validation, managed runtime infrastructure, managed data apps, and an AI-powered workspace environment.

The dltHub AI Workbench is a collection of toolkits that give AI coding assistants step-by-step workflows to build data pipelines with dlt. You can use the workbench as-is or fork and customize it for your own stack. The dlt ai CLI installs toolkit components into the right locations for your assistant and runs the workspace MCP server.
Build toolkits cover ingestion (REST API, SQL), transformation, and data quality; Run toolkits handle deployment and exploration. The REST API toolkit is backed by the dltHub context — over 9,700 source definitions the agent queries to find verified connectors before writing code.
The dltHub AI Workbench is tested with Claude Code, Cursor, and Codex and may work with other AI coding assistants. When getting started, we recommend working in accept-edits mode (Claude) or with --approval-mode (Codex) so you can review changes, and familiarizing yourself with the dltHub AI workflows.
The dltHub AI Workbench supports the iterative data engineering workflow
Building data pipelines is iterative and covers two major phases — ingestion and transformations — each following the same inner loop:
Build (local development)
- Develop the pipeline iteratively — for ingestion: first REST API endpoint, then additional endpoints; for transformation: data model first, then the full transformation pipeline
- Explore the loaded data and validate it after each step
- Loop back to refine until the pipeline is solid
Run (production)
- Deploy the ingestion or transformation pipeline to production
- Serve insights via data apps built on top of the loaded data
The outer loop connects the two phases: insights from the transformation and serving layer feed back into ingestion refinement. The workbench Build toolkits support the local development loop; the Run toolkits handle deployment and data apps.
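The inner loop can be pictured as plain control flow. In this sketch the `validate` and `refine` functions are hypothetical stand-ins for the explore/validate and refinement steps, not part of the dlt API:

```python
# Toy model of the build loop: develop, validate the loaded data, refine until solid.

def validate(rows):
    # Stand-in for "explore the loaded data and validate it after each step".
    return all("id" in r for r in rows)

def refine(rows):
    # Stand-in for a refinement step; here it fills in records missing an id.
    return [{**r, "id": r.get("id", -1)} for r in rows]

def build_loop(rows, max_iterations=5):
    for _ in range(max_iterations):
        if validate(rows):
            return rows  # pipeline is solid, ready for the Run phase
        rows = refine(rows)
    raise RuntimeError("pipeline did not stabilize")

loaded = build_loop([{"id": 1}, {"name": "no-id"}])
print(loaded)
```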

dltHub AI Workbench Toolkits
The workbench gives your coding assistant toolkits that contain a structured, guided workflow for a specific phase. Instead of generating ad-hoc code, the assistant follows a defined sequence of steps from start to finish.
A toolkit contains skills, commands, rules, and an MCP server — tied together by a workflow that tells the assistant which skill to run at each step and how to leverage the MCP server.
All toolkits depend on init for shared rules, secrets handling, and the MCP server. When using the dlt ai CLI, init is installed automatically as a dependency. When using the Claude marketplace, install the init plugin separately.

Toolkit components
| Component | What it is | When it runs |
|---|---|---|
| Skill | Step-by-step procedure the assistant follows | Triggered by user intent or explicitly with /skill-name |
| Command | A slash command for a specific action | User invokes with /toolkit:command |
| Rule | Always-on context (conventions, constraints) | Every session, automatically |
| Workflow | Ordered sequence of skills with a fixed entry point | Loaded as a rule — always active |
| MCP server | Exposes pipelines, tables, and secrets as tools | During a session, via MCP protocol |
| dltHub context | 9,700+ REST API source definitions with verified connectors and pipeline patterns | During source discovery, via search_dlthub_sources |
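One way to picture how these components fit together is as a data model. This is a hypothetical sketch for illustration only — the field names are invented, not the toolkit file format:

```python
from dataclasses import dataclass, field

@dataclass
class Skill:
    name: str           # triggered by user intent or explicitly as /skill-name
    steps: list[str]    # the step-by-step procedure the assistant follows

@dataclass
class Toolkit:
    name: str
    skills: list[Skill]
    commands: list[str] = field(default_factory=list)   # /toolkit:command entries
    rules: list[str] = field(default_factory=list)      # always-on context
    workflow: list[str] = field(default_factory=list)   # ordered skill names, loaded as a rule

# Illustrative instance modeled on the REST API toolkit.
rest_api = Toolkit(
    name="rest-api",
    skills=[Skill("find-source", ["search the dltHub context", "pick a verified connector"])],
    workflow=["find-source"],
)
print(rest_api.workflow[0])
```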
MCP tools
Two MCP servers give the agent structured context throughout the workflow, avoiding the need for manual copy-pasting.
dlt-workspace-mcp (local, installed by dlt ai init) exposes: data inspection tools (list_tables, preview_table, execute_sql_query, get_row_counts, display_schema, get_local_pipeline_state), secrets tools (secrets_view_redacted, secrets_update_fragment), and toolkit discovery (list_toolkits, toolkit_info).
dltHub context (remote) provides search_dlthub_sources — used by the find-source skill to search 9,700+ REST API source definitions and return verified connectors with reference links before writing code.
Available toolkits