Skill

view-data

Queries and explores data loaded by dlt pipelines using Python, dlt dataset API, ReadableRelation, and ibis expressions. For table exploration, row counts, and ad-hoc reports.

Python

data-engineering

database

npx claudepluginhub dlt-hub/dlthub-ai-workbench --plugin rest-api-pipeline

Tool Access

This skill uses the workspace's default tool permissions.


Stats

Parent Repo Stars: 19

Parent Repo Forks: 2

Last Commit: Mar 11, 2026

View pipeline data

Query data loaded by a dlt pipeline using Python. Use in standalone scripts, inline code, or as the data access layer for reports.

Parse $ARGUMENTS:

pipeline-name (optional): the dlt pipeline name. If omitted, infer from session context. If ambiguous, ask the user and stop.
hints (optional, after --): additional requirements or focus areas (e.g., -- show top users by spend)
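
The argument format above can be sketched as a small parser. This is a minimal illustration, not part of the skill itself: the function name `parse_arguments` is hypothetical, and taking only the first token before `--` as the pipeline name is an assumption.

```python
from typing import Optional, Tuple


def parse_arguments(raw: str) -> Tuple[Optional[str], Optional[str]]:
    """Split a raw argument string into (pipeline_name, hints).

    Hypothetical helper: everything after a standalone "--" token is
    treated as hints; the first token before it is the pipeline name.
    """
    tokens = raw.split()
    if "--" in tokens:
        idx = tokens.index("--")
        before, after = tokens[:idx], tokens[idx + 1:]
    else:
        before, after = tokens, []
    # Assumption: a pipeline name is a single token, so extra tokens are ignored.
    pipeline_name = before[0] if before else None
    hints = " ".join(after) or None
    return pipeline_name, hints
```

When `pipeline_name` comes back as `None`, the caller still has to infer the name from session context or ask the user, as described above.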

Workspace Dashboard UI if just exploring

If no precise query or instructions were given, assume the user just wants to look at the data and tell them to run the Workspace Dashboard with the command below. Otherwise, use the dlt dataset API.

dlt pipeline <pipeline_name> show

This opens a browser with table schemas, row counts, and sample data.
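
The choice between the dashboard and a scripted query can be sketched as follows. Both function names are hypothetical; only the `dlt pipeline <pipeline_name> show` command comes from this skill, and treating empty hints as "just exploring" is an assumption about how the rule would be applied.

```python
from typing import List, Optional


def wants_dashboard(hints: Optional[str]) -> bool:
    """Return True when no precise query or instructions were given."""
    return not (hints or "").strip()


def dashboard_command(pipeline_name: str) -> List[str]:
    """Build the CLI command that opens the Workspace Dashboard."""
    return ["dlt", "pipeline", pipeline_name, "show"]
```

The returned list can be shown to the user as `" ".join(dashboard_command(name))` or passed to `subprocess.run`.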

dlt dataset API for ad hoc reports

Essential Reading:

https://dlthub.com/docs/general-usage/dataset-access/dataset.md
https://dlthub.com/docs/general-usage/dataset-access/ibis-backend.md

Use pipeline.dataset() to access loaded data. The API is destination-agnostic: it works the same on duckdb, postgres, bigquery, etc. NEVER import destination libraries (like duckdb) directly.

Attach to pipeline and get dataset

import dlt
pipeline = dlt.attach("<pipeline_name>")
dataset = pipeline.dataset()

ReadableRelation (dlt native)

Think of it as a subset of ibis with slightly different syntax.

table = dataset["my_table"]
table.head().df()                              # first rows as pandas
table.select("id", "name").limit(50).arrow()   # select columns, arrow format
table.where("id", "in", [1, 2, 3]).df()        # parametric filter
table.select("amount").max().fetchscalar()      # scalar aggregate
dataset.row_counts().df()                       # row counts for all tables
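
The calls above can be wrapped into a small preview helper. This is a sketch: `preview_tables` is a hypothetical name, and it relies only on the `dataset[...]`, `.limit()` and `.df()` calls already shown, so it never touches a destination library.

```python
from typing import Any, Dict, Iterable


def preview_tables(dataset: Any, table_names: Iterable[str], limit: int = 5) -> Dict[str, Any]:
    """Return the first `limit` rows of each named table, keyed by table name."""
    previews: Dict[str, Any] = {}
    for name in table_names:
        # dataset[name] is a lazy relation; rows are fetched only when .df() runs
        previews[name] = dataset[name].limit(limit).df()
    return previews
```

For example, `preview_tables(pipeline.dataset(), ["my_table", "my_table__results"])` would return one pandas DataFrame per table.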

Ibis expressions (preferred for complex queries)

t = dataset["my_table"].to_ibis()
expr = t.filter(t.amount > 100).group_by("category").aggregate(total=t.amount.sum())
dataset(expr).df()  # execute ibis expression via dataset

Ibis is lazy, composable, and destination agnostic. Key operations:

table.group_by("col").aggregate(total=table.col.sum()) — aggregation
table.filter(table.col > 0) — filtering
table.join(other, table.id == other.parent_id) — joins
table.order_by(ibis.desc("col")) — sorting
table.mutate(new_col=table.col * 100) — computed columns
table.select("col1", "col2") — column selection

Read ibis docs: https://ibis-project.org/reference/expression-collections

Joining parent/child tables

dlt creates child tables for nested data (e.g., my_table__results). Join on _dlt_id / _dlt_parent_id:

parent = dataset["my_table"].to_ibis()
child = dataset["my_table__results"].to_ibis()
joined = parent.join(child, parent._dlt_id == child._dlt_parent_id)

Raw SQL (when needed)

dataset("SELECT * FROM my_table WHERE amount > 100").df()

Custom charts and insights

If the user wants to create custom charts or generate insights from their data, install the data-exploration toolkit (dlt ai toolkit data-exploration install) and follow the workflow there.