Help us improve
Share bugs, ideas, or general feedback.
From data-engineering
Use when adding or reviewing data quality tests for dbt models in warehouse-backed analytics projects. Covers dbt generic tests, singular tests (assert_*.sql), accepted-values macros, dbt_utils patterns, cross-system consistency tests, and warehouse-oriented validation. Use when writing data tests, creating assert_*.sql files, testing business logic, or validating referential integrity.
npx claudepluginhub yeaight7/agent-powerups --plugin data-engineeringHow this skill is triggered — by the user, by Claude, or both
Slash command
/data-engineering:data-qualityThe summary Claude sees in its skill listing — used to decide when to auto-load this skill
Testing patterns for dbt projects on analytical warehouses such as BigQuery. This workflow uses **dbt tests exclusively** — no external data-quality framework required.
Guides technical evaluation of code review feedback: read fully, restate for understanding, verify against codebase, respond with reasoning or pushback before implementing.
Share bugs, ideas, or general feedback.
Testing patterns for dbt projects on analytical warehouses such as BigQuery. This workflow uses dbt tests exclusively — no external data-quality framework required.
Defined in .yml files alongside models. Use data_tests: (not tests:):
# models/core/shared/dim_teams.yml
models:
- name: dim_teams
columns:
- name: id
description: Surrogate key
data_tests:
- unique
- not_null
- name: natural_id
description: Natural key from source
data_tests:
- unique
- not_null
- name: plan_code
description: Subscription plan code
data_tests:
- accepted_values:
values: "{{ get_plan_code_values() }}" # Prefer macros, not hardcoded lists
Business logic tests live in tests/ as assert_*.sql files. They pass when they return zero rows:
-- tests/teams/assert_teams_created_before_deleted.sql
with teams as (
select * from {{ ref('dim_teams') }}
)
select
teams.natural_id,
teams.created_at,
teams.deleted_at
from teams
where teams.deleted_at < teams.created_at
-- tests/academy/assert_students_with_completed_modules_are_enrolled.sql
with completed as (
select distinct student_id from {{ ref('fct_academy_course_modules_completed') }}
),
enrolled as (
select distinct student_id from {{ ref('fct_academy_course_enrollments') }}
),
orphaned as (
select completed.student_id
from completed
left join enrolled
on completed.student_id = enrolled.student_id
where enrolled.student_id is null
)
select * from orphaned
Organize singular tests in subdirectories matching the domain: tests/accounts/, tests/billing/, tests/content/, etc.
Every model's primary key (id) must have both tests:
- name: id
data_tests:
- unique
- not_null
- name: dim_user_id
data_tests:
- not_null
- relationships:
to: ref('dim_users')
field: id
Use macros from macros/accepted_values/ instead of hardcoding values:
- name: plan_code
data_tests:
- accepted_values:
values: "{{ get_plan_code_values() }}"
Browse available macros: macros/accepted_values/get_*.sql
models:
- name: fct_team_members
data_tests:
# Table must not be empty
- dbt_utils.at_least_one:
column_name: id
# Data freshness — table updated in last 24h
- dbt_utils.recency:
datepart: hour
field: created_at
interval: 24
# Business rule on a measure
- dbt_utils.expression_is_true:
expression: "total_seats >= 0"
columns:
- name: team_id
data_tests:
# Cross-table: team must exist in dim_teams
- relationships:
to: ref('dim_teams')
field: natural_id
- name: end_at
data_tests:
# end_at must be after start_at when both are set
- dbt_utils.expression_is_true:
expression: "end_at is null or end_at > start_at"
Define reusable tests in tests/generic/ or use the project's existing test_utils:
-- tests/generic/test_no_future_dates.sql
{% test no_future_dates(model, column_name) %}
select {{ column_name }}
from {{ model }}
where {{ column_name }} > current_timestamp
{% endtest %}
Usage in YAML:
- name: created_at
data_tests:
- no_future_dates
Assert that data is consistent across source systems. These go in tests/ as singular tests:
-- tests/finance/assert_billing_plan_matches_team_plan.sql
-- Tests consistency between billing records and team plan data
with invoices as (
select * from {{ ref('fct_billing_invoices') }}
),
teams as (
select * from {{ ref('dim_teams') }}
),
mismatched as (
select
invoices.invoice_id,
invoices.plan_code as invoice_plan,
teams.plan_code as team_plan
from invoices
inner join teams
on invoices.team_id = teams.natural_id
where invoices.plan_code != teams.plan_code
and invoices.is_active
)
select * from mismatched
tests/
├── accounts/ # Tests for account and team domain models
├── adhoc/ # Ad-hoc spot-check tests
├── billing/ # Finance and billing consistency tests
├── content/ # Content or product-domain tests
└── generic/ # Reusable generic test definitions
dbt test # All tests
dbt test -s <model> # Tests for one model
dbt test -s tag:critical # Tests with a specific tag
dbt build -s <model> # Run + test together
dbt test --store-failures # Persist failures to BQ for inspection
To inspect test failures in BigQuery (development), e.g.:
select * from `dbt_<username>`.`dbt_test__audit`.<test_name>
For every new model:
id: unique + not_nullnatural_id: unique + not_nullrelationships testaccepted_values using a macrodbt_utils.expression_is_true or singular testsid test catches duplicatesdbt_utils.recency for frequently updated tablesmacros/accepted_values/ — never hardcode enum lists in YAML--store-failures in development to inspect what went wrong