Disclaimer: This is a personal project, built and maintained in my own time. It is not affiliated with, endorsed by, or supported by any employer or vendor. No warranty — use it at your own risk.

duckrun is a dbt adapter that runs your model SQL in DuckDB and writes the results to Delta Lake using delta_rs (the deltalake Python package). duckrun itself is just glue — it owns none of the heavy lifting. The real work is done by DuckDB (executes the SQL), delta-rs (writes the Delta table), Arrow (the zero-copy (kind of) bridge that hands query results from DuckDB to delta-rs), and dbt (orchestrates the DAG). DuckDB is here for convenience as the SQL engine; the materialization is all delta-rs and Arrow.

It is a thin wrapper around dbt-duckdb. You keep everything dbt-duckdb gives you — views, seeds, sources, tests, snapshots, the full plugin ecosystem — and gain one extra thing: a Delta-backed table / incremental materialization that writes real Delta tables.

The design rationale — why delta_rs and not DuckDB's native Delta writer, why Delta and not Iceberg, why a separate adapter — lives in docs/design_document.md.

How it fits together

DuckDB is a great query engine, Delta Lake is a great open table format, and dbt is the right tool to orchestrate the DAG. duckrun wires the three together:

DuckDB executes · delta_rs materializes · dbt orchestrates.

Install

pip install duckrun

That single install pulls in dbt-duckdb (and therefore duckdb) plus deltalake.

Configure your profile

# ~/.dbt/profiles.yml
my_project:
  target: dev
  outputs:
    dev:
      type: duckrun
      # No `threads:` needed — duckrun always runs single-threaded.
      # DuckDB runs in-memory by default — the Delta tables are the only state.
      # Default Delta location for models that don't set config(location=...).
      root_path: './warehouse'   # local path, or s3://..., gs://..., abfss://...
      # storage_options: {}      # passed through to deltalake for remote stores

Persisted models are written to <root_path>/<schema>/<model> (e.g. ./warehouse/dbo/orders), or to an explicit config(location=...).

Fabric Lakehouse without a schema

A schema-less Lakehouse (tables straight under Tables/, no Tables/<schema>/ grouping) is a bad pattern — you lose the namespace that keeps a warehouse organized — but if you're stuck with one, no special config is needed. Drop the trailing Tables from root_path and let the schema fill that slot:

      schema: Tables
      root_path: "abfss://<ws>@onelake.dfs.fabric.microsoft.com/<lh>.Lakehouse"

Since models are written to <root_path>/<schema>/<model>, this lands them at <lh>.Lakehouse/Tables/<model> — exactly the flat layout the schema-less Lakehouse expects. Prefer a schema-enabled Lakehouse (root_path: .../Tables, real schemas) whenever you can.

Remote stores (S3 / GCS / ADLS)

Point root_path at the warehouse location and pass credentials through storage_options — these flow straight to deltalake for writes and merges.

On Azure-backed stores, if storage_options carries a bearer_token (or token / access_token), the adapter also auto-creates a matching DuckDB Azure secret, so delta_scan() reads work with no extra config. In a notebook where the storage secret is already provided to DuckDB, you can leave storage_options empty.

    remote:
      type: duckrun
      schema: dbo
      root_path: "s3://my-bucket/warehouse"   # or abfss://... , gs://...
      storage_options:
        aws_access_key_id: "{{ env_var('AWS_ACCESS_KEY_ID') }}"
        aws_secret_access_key: "{{ env_var('AWS_SECRET_ACCESS_KEY') }}"

Verified end-to-end against real remote object storage: table overwrite, incremental merge, and delta_scan reads / tests.

duckrun-projects

Popularity

Health & Quality

What's Inside

README

How it fits together

Install

Configure your profile

Fabric Lakehouse without a schema

Remote stores (S3 / GCS / ADLS)

Materializations

Confidence

Similar Plugins

dbt-migration

dak

agentspec

bauplan