From astronomer-data
Traces downstream data lineage for tables and DAGs to identify dependents, build impact trees, categorize criticality, and assess change risks before modifications.
npx claudepluginhub astronomer/agents --plugin astronomer-data
This skill uses the workspace's default tool permissions.
Answer the critical question: "What breaks if I change this?"
Use this BEFORE making changes to understand the blast radius.
Find everything that reads from this target:
For Tables:
Search DAG source code: Look for DAGs that SELECT from this table
- Use af dags list to get all DAGs
- Use af dags source <dag_id> to search for table references
- Look for: FROM target_table, JOIN target_table

Check for dependent views:
-- Snowflake
SELECT * FROM information_schema.view_table_usage
WHERE table_name = '<target_table>'
-- Or check SHOW VIEWS and search definitions
Look for BI tool connections:
If you're running on Astro, the Lineage tab in the Astro UI provides visual dependency graphs across DAGs and datasets, making downstream impact analysis faster. It shows which DAGs consume a given dataset and their current status, reducing the need for manual source code searches.
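The source-code search for tables can be sketched in Python as follows. This is a minimal illustration, not part of the skill itself: it assumes the DAG sources have already been collected (for example by saving the output of af dags source for each DAG) into a dictionary. The function name `find_table_readers` and the sample DAG ids other than transform_daily_sales are hypothetical.

```python
import re


def find_table_readers(dag_sources: dict[str, str], target_table: str) -> list[str]:
    """Return the ids of DAGs whose source contains FROM/JOIN <target_table>."""
    # Match the table name directly after FROM or JOIN, case-insensitively.
    # The negative lookahead rejects longer identifiers such as fct.orders_archive.
    pattern = re.compile(
        r"\b(?:FROM|JOIN)\s+" + re.escape(target_table) + r"(?![\w.])",
        re.IGNORECASE,
    )
    return sorted(dag_id for dag_id, src in dag_sources.items() if pattern.search(src))


# Hypothetical DAG sources, keyed by dag_id (assumed to be fetched beforehand).
dag_sources = {
    "transform_daily_sales": "INSERT INTO agg.daily_sales SELECT 1 FROM fct.orders GROUP BY 1",
    "build_order_features": "select * from stg.events e join fct.orders o on e.order_id = o.id",
    "load_customers": "SELECT * FROM fct.orders_archive",
}

readers = find_table_readers(dag_sources, "fct.orders")
print(readers)
```

A plain regex like this misses quoted identifiers, tables referenced without their schema, and SQL assembled dynamically at runtime, so treat the result as a starting list to review rather than a complete answer.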
For DAGs:
- Use af dags source <dag_id> to find output tables

Map the full downstream impact:
SOURCE: fct.orders
    |
    +-- TABLE: agg.daily_sales --> Dashboard: Executive KPIs
    |       |
    |       +-- TABLE: rpt.monthly_summary --> Email: Monthly Report
    |
    +-- TABLE: ml.order_features --> Model: Demand Forecasting
    |
    +-- DIRECT: Looker Dashboard "Sales Overview"
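One way to build a tree like the one above is a depth-first walk over a map from each asset to its direct consumers. The sketch below assumes that map has already been assembled from the earlier discovery steps; the `edges` dictionary mirrors the example tree, and the function name `downstream_tree` is hypothetical.

```python
def downstream_tree(edges: dict[str, list[str]], source: str) -> list[tuple[int, str]]:
    """Depth-first walk returning (depth, node) for everything downstream of source."""
    result: list[tuple[int, str]] = []
    seen = {source}  # guards against cycles and repeated visits

    def walk(node: str, depth: int) -> None:
        for child in edges.get(node, []):
            if child in seen:
                continue
            seen.add(child)
            result.append((depth, child))
            walk(child, depth + 1)

    walk(source, 1)
    return result


# Direct consumers of each asset, taken from the example tree.
edges = {
    "fct.orders": ["agg.daily_sales", "ml.order_features", "Looker: Sales Overview"],
    "agg.daily_sales": ["Dashboard: Executive KPIs", "rpt.monthly_summary"],
    "rpt.monthly_summary": ["Email: Monthly Report"],
    "ml.order_features": ["Model: Demand Forecasting"],
}

tree = downstream_tree(edges, "fct.orders")
for depth, node in tree:
    print("    " * (depth - 1) + "+-- " + node)
```

Because each asset is recorded only once, an asset reachable through two paths appears under the first path that reaches it. That keeps the count of impacted assets accurate but hides the second path.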
Critical (breaks production):
High (causes significant issues):
Medium (inconvenient):
Low (minimal impact):
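The four levels above form an ordered scale, which makes it easy to sort dependents and find the worst case. The following is a rough sketch using an `IntEnum`; the assignments of assets to levels are illustrative examples only, and the helper name `most_critical_first` is hypothetical.

```python
from enum import IntEnum


class Criticality(IntEnum):
    LOW = 1       # minimal impact
    MEDIUM = 2    # inconvenient
    HIGH = 3      # causes significant issues
    CRITICAL = 4  # breaks production


# Illustrative assignments; real ones come from reviewing each dependent.
dependents = {
    "agg.daily_sales": Criticality.CRITICAL,
    "Executive Dashboard": Criticality.CRITICAL,
    "ml.order_features": Criticality.HIGH,
}


def most_critical_first(levels: dict[str, Criticality]) -> list[str]:
    """Order dependents from most to least critical, breaking ties by name."""
    return sorted(levels, key=lambda name: (-levels[name], name))


ordered = most_critical_first(dependents)
worst = max(dependents.values())
print(ordered, worst.name)
```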
For the proposed change, evaluate:
Schema Changes (adding/removing/renaming columns):
Data Changes (values, volumes, timing):
Deletion/Deprecation:
Identify who owns downstream assets:
- owners field in DAG definitions

"Changing fct.orders will impact X tables, Y DAGs, and Z dashboards"
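A summary sentence of this shape, together with a per-owner notification list, can be derived from a simple inventory of downstream assets. The sketch below assumes such an inventory exists as a list of dictionaries. The field names, the helper names, and the owner shown for transform_daily_sales are hypothetical.

```python
from collections import Counter, defaultdict

# Illustrative inventory; the owner of transform_daily_sales is an assumption.
inventory = [
    {"name": "agg.daily_sales", "type": "Table", "owner": "data-eng"},
    {"name": "Executive Dashboard", "type": "Dashboard", "owner": "analytics"},
    {"name": "ml.order_features", "type": "Table", "owner": "ml-team"},
    {"name": "transform_daily_sales", "type": "DAG", "owner": "data-eng"},
]


def _count(n: int, noun: str) -> str:
    """Format a count with a naive plural, e.g. '2 tables' or '1 DAG'."""
    return f"{n} {noun}" + ("" if n == 1 else "s")


def summarize(target: str, items: list[dict]) -> str:
    """Build the one-line impact summary from asset types."""
    counts = Counter(item["type"] for item in items)
    return (
        f"Changing {target} will impact {_count(counts['Table'], 'table')}, "
        f"{_count(counts['DAG'], 'DAG')}, and {_count(counts['Dashboard'], 'dashboard')}"
    )


def assets_by_owner(items: list[dict]) -> dict[str, list[str]]:
    """Group downstream asset names by owner so each owner can be notified once."""
    grouped: dict[str, list[str]] = defaultdict(list)
    for item in items:
        grouped[item["owner"]].append(item["name"])
    return dict(grouped)


summary = summarize("fct.orders", inventory)
owners = assets_by_owner(inventory)
print(summary)
print(owners)
```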
                    +--> [agg.daily_sales] --> [Executive Dashboard]
                    |
[fct.orders] -------+--> [rpt.order_details] --> [Ops Team Email]
                    |
                    +--> [ml.features] --> [Demand Model]
| Downstream | Type | Criticality | Owner | Notes |
|---|---|---|---|---|
| agg.daily_sales | Table | Critical | data-eng | Updated hourly |
| Executive Dashboard | Dashboard | Critical | analytics | CEO views daily |
| ml.order_features | Table | High | ml-team | Retraining weekly |

| Change Type | Risk Level | Mitigation |
|---|---|---|
| Add column | Low | No action needed |
| Rename column | High | Update 3 DAGs, 2 dashboards |
| Delete column | Critical | Full migration plan required |
| Change data type | Medium | Test downstream aggregations |
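The risk table can also be expressed as a lookup, so that a proposed change made up of several edits is rated by its riskiest part. This is a minimal sketch under the assumption that the risk levels in the table apply as-is; the change-type keys and the function name `overall_risk` are hypothetical.

```python
# Risk levels from lowest to highest, matching the table above.
RISK_ORDER = ["Low", "Medium", "High", "Critical"]

# Change type -> risk level, copied from the table (keys are hypothetical labels).
RISK_BY_CHANGE = {
    "add_column": "Low",
    "change_data_type": "Medium",
    "rename_column": "High",
    "delete_column": "Critical",
}


def overall_risk(changes: list[str]) -> str:
    """Return the highest risk level among the proposed change types.

    Unknown change types raise KeyError on purpose, so they get reviewed by hand.
    An empty list of changes is rated Low.
    """
    levels = [RISK_BY_CHANGE[change] for change in changes]
    return max(levels, key=RISK_ORDER.index, default="Low")


risk = overall_risk(["add_column", "rename_column"])
print(risk)
```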
Before making changes:
transform_daily_sales