From wire
Proactive skill for diagnosing dbt job failures and test errors. Auto-activates when encountering dbt errors, failed jobs, compilation issues, or test failures. Provides systematic diagnosis workflow with error classification, investigation steps, and resolution patterns.
npx claudepluginhub rittmananalytics/wire-plugin --plugin wire

This skill uses the workspace's default tool permissions.
This skill automatically activates when diagnosing dbt job failures, test errors, compilation issues, or runtime problems. It provides a systematic methodology for classifying errors, investigating root causes, and implementing fixes with preventive measures.
The goal is not just to fix the immediate error but to understand why it happened, fix the root cause, and add safeguards to prevent recurrence.
This skill should activate when users:
Keywords to watch for:
Activate when:
- A dbt build or dbt run command returns errors
- Failures appear in run_results.json or dbt Cloud run artifacts
- The Wire validation command (/wire:dbt-validate) reports failures

Example internal triggers:
Never modify a test to make it pass without understanding why it is failing.
This is the iron rule of dbt troubleshooting. A failing test is a signal. Before changing anything:
Tests that are "fixed" by loosening constraints, adding exceptions, or disabling them entirely create a false sense of data quality. Every suppressed test is a future production incident.
When encountering a dbt error, first classify it into one of three categories. This determines the investigation approach:
Errors caused by the execution environment, not the dbt code itself.
| Error Pattern | Likely Cause | Urgency |
|---|---|---|
| Connection refused / Connection timed out | Warehouse unreachable | High -- check warehouse status |
| Quota exceeded / Resources exceeded | BigQuery slot/billing limits | High -- may need quota increase |
| Slot contention / Query timed out | Concurrent query pressure | Medium -- reschedule or optimize |
| Authentication failed / Permission denied | Credentials expired or role missing | High -- check service account |
| Disk space / Memory exceeded | Worker resource limits | Medium -- optimize query or increase resources |
| Rate limit exceeded | API throttling | Low -- add retry logic or reduce concurrency |
| Network error / DNS resolution failed | Network connectivity | High -- infrastructure issue |
Resolution approach: Infrastructure errors are not code bugs. Fix the environment, then re-run.
Errors in dbt project code that prevent compilation or execution.
| Error Pattern | Likely Cause | Urgency |
|---|---|---|
| Compilation Error + ref('...') | Missing or misspelled model reference | Medium -- fix the ref |
| Compilation Error + source('...') | Missing source definition | Medium -- add to sources.yml |
| Parsing Error + YAML | Invalid YAML syntax | Low -- fix indentation/syntax |
| Circular dependency detected | Model A refs B which refs A | High -- refactor model DAG |
| Duplicate model name | Two models share a name | Medium -- rename one |
| Undefined macro | Missing macro or wrong package | Medium -- check macro path |
| SQL syntax error | Invalid SQL for target warehouse | Low -- fix SQL |
| Jinja rendering error | Template syntax issue | Medium -- check Jinja logic |
| Schema/contract violation | Model output doesn't match contract | Medium -- fix model or update contract |
Resolution approach: Read the error message carefully. The fix is almost always in the file referenced in the error.
The code compiles and runs, but tests detect data quality issues.
| Error Pattern | Likely Cause | Investigation |
|---|---|---|
| not_null failure | NULL values in a required column | Profile the NULLs -- are they from a specific source or time range? |
| unique failure | Duplicate values in a PK column | Find the duplicates -- is it a join fanout or source issue? |
| relationships failure | Orphan foreign keys | Check if referenced records were deleted or never loaded |
| accepted_values failure | Unexpected value in a categorical column | Check source for new values not yet mapped |
| custom test failure | Business rule violation | Understand the rule, then investigate the data |
| unit test failure | Transformation logic mismatch | Compare actual vs expected -- is the model or the test wrong? |
| Row count anomaly | Unexpected increase/decrease | Check source loads, dedup logic, join conditions |
Resolution approach: Always investigate the data first. The test may be correct and the data genuinely broken.
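The three-way classification can be automated as a first-pass triage step. A minimal sketch, assuming error text arrives as a plain string; the regex patterns below are illustrative examples only, not the exhaustive catalog from the tables above:

```python
import re

# Illustrative triage patterns mapping error text to the categories in Section 1.
# These are example patterns; extend them with errors observed in your own project.
CATEGORY_PATTERNS = {
    "A: infrastructure": [
        r"connection (refused|timed out)", r"quota exceeded", r"resources exceeded",
        r"authentication failed", r"permission denied", r"rate limit exceeded",
    ],
    "B: code": [
        r"compilation error", r"parsing error", r"circular dependency",
        r"undefined macro", r"syntax error", r"jinja rendering",
    ],
    "C: data/test": [
        r"not_null", r"unique", r"relationships", r"accepted_values",
    ],
}

def classify_error(message: str) -> str:
    """Return the first matching category, or 'unclassified' if nothing matches."""
    text = message.lower()
    for category, patterns in CATEGORY_PATTERNS.items():
        if any(re.search(p, text) for p in patterns):
            return category
    return "unclassified"
```

An unclassified result still requires the manual workflow below; the classifier only decides which investigation branch to start from.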
Follow these steps in order for any dbt failure:
Collect all available context before investigating:
# What exactly failed?
# Read the error message completely -- don't skim
# Check recent changes to the model
git log --oneline -10 -- models/path/to/failing_model.sql
# Check recent changes to the schema/tests
git log --oneline -10 -- models/path/to/failing_model.yml
# Check if the model's upstream dependencies changed
git log --oneline -10 -- models/path/to/upstream_model.sql
# Review the compiled SQL (for compilation issues)
cat target/compiled/project_name/models/path/to/failing_model.sql
# Review run results for timing and status
cat target/run_results.json | python3 -m json.tool
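Rather than skimming the raw JSON, the failing nodes can be pulled out programmatically. A sketch using field names from dbt's documented run_results.json artifact schema (verify against your dbt version, as the schema is versioned):

```python
import json

def summarize_failures(path: str = "target/run_results.json") -> list[dict]:
    """Return one summary dict per non-passing node in run_results.json."""
    with open(path) as f:
        results = json.load(f).get("results", [])
    return [
        {
            "node": r.get("unique_id"),
            "status": r.get("status"),
            "time_s": round(r.get("execution_time", 0), 1),
            "message": r.get("message"),
        }
        for r in results
        if r.get("status") not in ("success", "pass")
    ]
```

This surfaces the status, timing, and message fields from the information checklist in one pass, including nodes skipped because of upstream failures.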
Information checklist:
Using the tables in Section 1, determine:
For Infrastructure Errors (A):
For Code Errors (B):
- Inspect the compiled SQL in target/compiled/
- Run dbt parse to check for structural issues
- Run dbt compile --select model_name to isolate compilation
- Check dbt_project.yml for configuration issues

For Data/Test Failures (C):
Profile the failure -- Quantify the scope:
-- For not_null failures
SELECT COUNT(*) AS total_rows,
COUNT(*) - COUNT(column_name) AS null_count,
       ROUND(SAFE_DIVIDE(COUNT(*) - COUNT(column_name), COUNT(*)) * 100, 2) AS null_pct
FROM {{ ref('model_name') }}
-- For unique failures
SELECT column_name, COUNT(*) AS occurrences
FROM {{ ref('model_name') }}
GROUP BY column_name
HAVING COUNT(*) > 1
ORDER BY occurrences DESC
LIMIT 20
-- For relationships failures
SELECT child.fk_column, COUNT(*) AS orphan_count
FROM {{ ref('child_model') }} child
LEFT JOIN {{ ref('parent_model') }} parent
ON child.fk_column = parent.pk_column
WHERE parent.pk_column IS NULL
GROUP BY 1
ORDER BY 2 DESC
LIMIT 20
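These profiling queries share a common shape, so the query text can be generated rather than retyped for each failure. A convenience sketch; the table and column names in the usage example are placeholders:

```python
def duplicate_check_sql(table: str, key_columns: list[str], limit: int = 20) -> str:
    """Render a duplicate-finding query (like the unique-failure profile above)
    for an arbitrary table and candidate key."""
    cols = ", ".join(key_columns)
    return (
        f"SELECT {cols}, COUNT(*) AS occurrences\n"
        f"FROM {table}\n"
        f"GROUP BY {cols}\n"
        f"HAVING COUNT(*) > 1\n"
        f"ORDER BY occurrences DESC\n"
        f"LIMIT {limit}"
    )

# Hypothetical usage: print(duplicate_check_sql("analytics.orders", ["order_id"]))
```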
Identify the source -- Where do the bad records come from?
-- Check if NULLs correlate with a specific source or time range
SELECT _source_system, DATE(loaded_at) AS load_date, COUNT(*) AS null_records
FROM {{ ref('model_name') }}
WHERE column_name IS NULL
GROUP BY 1, 2
ORDER BY 3 DESC
Check upstream models -- Did the problem originate upstream?
-- Trace back through the DAG
SELECT COUNT(*) AS upstream_nulls
FROM {{ ref('upstream_model') }}
WHERE relevant_column IS NULL
Review recent data loads -- Did source data change?
-- Check for unusual load patterns
SELECT DATE(_loaded_at) AS load_date, COUNT(*) AS row_count
FROM {{ source('schema', 'table') }}
GROUP BY 1
ORDER BY 1 DESC
LIMIT 14
Check for schema drift -- Did source columns change type or meaning?
Once you understand the root cause:
Create a fix branch:
git checkout -b fix/model-name-error-description
Implement the fix in the correct location:
- dbt_project.yml or model config for configuration issues

Add a preventive test if one doesn't exist:
# If the failure revealed a gap in test coverage, add a test
models:
  - name: model_name
    columns:
      - name: column_that_broke
        tests:
          - not_null
          - unique
Validate the fix:
# Run the specific model
dbt run --select model_name
# Run the model's tests
dbt test --select model_name
# Run the model with its upstream dependencies
dbt build --select +model_name
Run regression to ensure the fix doesn't break other models:
# Check downstream models
dbt build --select model_name+
After resolution, document the finding:
Update execution_log.md (if using Wire workflow):
## [Date] - [Model Name] - [Error Type]
- **Error:** [Brief description]
- **Root Cause:** [What actually went wrong]
- **Fix:** [What was changed]
- **Prevention:** [Test or safeguard added]
Create a Jira ticket if the issue is recurring or systemic
Update team documentation if this reveals a pattern others should know about
Resources exceeded during query execution: The query could not be executed in the allotted memory.
Investigation:
Resolution patterns:
- Add WHERE clauses to filter by partition column (usually _PARTITIONTIME or a date column)
- Add require_partition_filter: true to the model config
- Use APPROX_COUNT_DISTINCT() instead of COUNT(DISTINCT ...) where precision is acceptable

Cannot query over table 'project.dataset.table' without a filter over column(s) '_PARTITIONTIME'
Resolution: Add a partition filter in the model's WHERE clause. For incremental models:
{% if is_incremental() %}
WHERE _PARTITIONTIME >= (SELECT MAX(_PARTITIONTIME) FROM {{ this }})
{% endif %}
Query timed out. Job exceeded maximum execution time.
Investigation:
Resolution patterns:
No matching signature for operator = for argument types: BYTES, STRING
Resolution: Cast explicitly:
SAFE_CONVERT_BYTES_TO_STRING(bytes_column) = 'expected_value'
-- or
CAST(string_column AS BYTES) = bytes_column
Cannot access field 'name' on a value with type ARRAY<STRUCT<...>>
Resolution: Use UNNEST() for arrays, dot notation for structs:
-- Accessing STRUCT field
SELECT event.event_name.value AS event_name FROM events
-- Unnesting ARRAY of STRUCT
SELECT event_id, param.key, param.value.string_value
FROM events, UNNEST(event_params) AS param
UPDATE/MERGE must match at most one source row for each target row
Investigation: The unique key has duplicates in the source query.
Resolution:
- Confirm the unique_key in model config truly identifies unique rows
- Add a unique test on the unique key columns

Exceeded rate limits: too many table update operations for this table
Resolution:
- Use merge_update_columns to limit the update scope

After a dbt Cloud job fails, examine the artifacts:
# Run results contain timing and status for each node
# Available at: target/run_results.json
# Key fields: status, execution_time, message, failures
# Manifest contains the full project graph
# Available at: target/manifest.json
# Useful for checking compiled SQL and dependencies
# Check compiled SQL for a specific model
cat target/compiled/<project_name>/models/<path>/<model>.sql
dbt Cloud logs show the full execution sequence. Look for:
- A SKIP status means the model was skipped due to an upstream failure.

# Check job status
dbt cloud job list
# Re-run from failure point
dbt retry
# Run with debug logging
dbt --debug run --select model_name
For programmatic access to dbt Cloud job information:
- GET /api/v2/accounts/{account_id}/runs/
- GET /api/v2/accounts/{account_id}/runs/{run_id}/
- GET /api/v2/accounts/{account_id}/runs/{run_id}/artifacts/{path}

dbt deps Failures

ERROR: Could not find a version that satisfies the requirement
Resolution:
- Check packages.yml for version constraints
- Run dbt clean then dbt deps to refresh packages

dbt seed Failures

Runtime Error: maximum recursion depth exceeded
Resolution:
dbt snapshot Failures

Compilation Error: Snapshot 'model' has no 'unique_key' configured
Resolution:
- Add unique_key to the snapshot config
- Verify strategy is set (timestamp or check)

Anti-pattern: "I've spent 3 hours on this approach, I'll keep trying."
Better: If an approach isn't working after reasonable effort, step back:
Anti-pattern: "This test fails sometimes but usually passes. I'll just re-run."
Better: Flaky tests indicate a real problem:
- Non-deterministic results (e.g., missing ORDER BY in window functions)

Always investigate and fix the root cause of flaky tests.
Anti-pattern: "We need this deployed today, I'll disable the failing test."
Better:
Never silently disable a test. At minimum, add a comment explaining why and a ticket reference.
Anti-pattern: Adding WHERE column IS NOT NULL to fix a not_null test failure.
Better: Ask "why is this column NULL?" The answer determines the fix:
- If a default value is appropriate: apply coalesce() in staging with a documented default
- If NULLs are legitimate: remove the not_null test and update documentation

/wire:dbt-validate Failures

When Wire's validate command reports failures:
- Review .wire/{project}/testing/ to understand what failed
- .wire/{project}/dev/data_model_*
- .wire/{project}/testing/test_results_*
- .wire/{project}/dev/dbt_project_config_*

When troubleshooting reveals issues that need tracking:
- /wire:dbt-validate {project_id}

If troubleshooting does not resolve the issue within a reasonable timeframe:
-- Comprehensive column profile for investigation
SELECT
COUNT(*) AS total_rows,
COUNT({{ column_name }}) AS non_null_count,
COUNT(*) - COUNT({{ column_name }}) AS null_count,
  ROUND(SAFE_DIVIDE(COUNT(*) - COUNT({{ column_name }}), COUNT(*)) * 100, 2) AS null_pct,
COUNT(DISTINCT {{ column_name }}) AS distinct_count,
MIN({{ column_name }}) AS min_value,
MAX({{ column_name }}) AS max_value
FROM {{ ref('model_name') }}
-- Identify duplicates with context
WITH duplicates AS (
SELECT
{{ unique_key }},
COUNT(*) AS occurrence_count
FROM {{ ref('model_name') }}
GROUP BY {{ unique_key }}
HAVING COUNT(*) > 1
)
SELECT
m.*,
d.occurrence_count
FROM {{ ref('model_name') }} m
INNER JOIN duplicates d ON m.{{ unique_key }} = d.{{ unique_key }}
ORDER BY d.occurrence_count DESC, m.{{ unique_key }}
LIMIT 100
-- Daily row count trend for anomaly detection
SELECT
DATE({{ timestamp_column }}) AS record_date,
COUNT(*) AS row_count,
LAG(COUNT(*)) OVER (ORDER BY DATE({{ timestamp_column }})) AS prev_day_count,
ROUND(
(COUNT(*) - LAG(COUNT(*)) OVER (ORDER BY DATE({{ timestamp_column }})))
/ NULLIF(LAG(COUNT(*)) OVER (ORDER BY DATE({{ timestamp_column }})), 0) * 100,
1
) AS pct_change
FROM {{ ref('model_name') }}
WHERE {{ timestamp_column }} >= DATE_SUB(CURRENT_DATE(), INTERVAL 30 DAY)
GROUP BY 1
ORDER BY 1 DESC
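When the trend query's output is exported for review, the same day-over-day check can be applied in Python. A sketch assuming rows arrive as (date, count) pairs sorted ascending; the 50% threshold is an arbitrary example, not a recommendation:

```python
def flag_anomalies(daily_counts, threshold_pct=50.0):
    """Flag days whose day-over-day row count change exceeds threshold_pct.

    daily_counts: list of (date_str, row_count) pairs sorted ascending by date.
    Returns a list of (date_str, pct_change) for flagged days.
    """
    flagged = []
    for (_, prev), (date, count) in zip(daily_counts, daily_counts[1:]):
        if prev == 0:
            continue  # avoid division by zero, mirroring NULLIF in the SQL above
        pct = (count - prev) / prev * 100
        if abs(pct) > threshold_pct:
            flagged.append((date, round(pct, 1)))
    return flagged
```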
-- Trace a specific record through the DAG
-- Start from the failing model and work backward
-- Step 1: Find the problematic record in the failing model
SELECT * FROM {{ ref('failing_model') }}
WHERE {{ problem_condition }}
LIMIT 10;
-- Step 2: Check the same record in the upstream model
SELECT * FROM {{ ref('upstream_model') }}
WHERE {{ join_key }} IN (
SELECT {{ join_key }} FROM {{ ref('failing_model') }}
WHERE {{ problem_condition }}
);
-- Step 3: Check the source
SELECT * FROM {{ source('schema', 'table') }}
WHERE {{ source_key }} IN (
SELECT {{ source_key }} FROM {{ ref('staging_model') }}
WHERE {{ join_key }} IN (
SELECT {{ join_key }} FROM {{ ref('failing_model') }}
WHERE {{ problem_condition }}
)
);
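Tracing records backward follows the same path as the model DAG itself, and the full upstream set for a failing node can be read from manifest.json rather than written by hand. A sketch using the parent_map key from dbt's manifest artifact (verify the key against your dbt version's artifact schema):

```python
import json

def upstream_nodes(manifest_path: str, node_id: str) -> set[str]:
    """Collect all transitive upstream nodes of node_id from manifest.json's parent_map."""
    with open(manifest_path) as f:
        parent_map = json.load(f).get("parent_map", {})
    seen, stack = set(), list(parent_map.get(node_id, []))
    while stack:
        node = stack.pop()
        if node not in seen:
            seen.add(node)
            stack.extend(parent_map.get(node, []))
    return seen
```

The returned set gives the order in which to run the Step 2/Step 3 queries above: staging models first, then their sources.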
- See error-catalog.md for a quick-reference table of common errors by class with resolution patterns

Adapted from dbt-labs/dbt-agent-skills (Apache-2.0 License). Original skill: troubleshooting-dbt-job-errors. Modified for Rittman Analytics conventions, BigQuery focus, Wire Framework integration, and expanded error catalog.