From astronomer-data
Checks database table freshness via SQL queries on timestamp columns and Airflow DAG status. Use when verifying if data is up to date or stale before analysis.
npx claudepluginhub astronomer/agents --plugin astronomer-data

This skill uses the workspace's default tool permissions.
Quickly determine if data is fresh enough to use.
For each table to check:
Look for columns that indicate when data was loaded or updated:
- `_loaded_at`, `_updated_at`, `_created_at` (common ETL patterns)
- `updated_at`, `created_at`, `modified_at` (application timestamps)
- `load_date`, `etl_timestamp`, `ingestion_time`
- `date`, `event_date`, `transaction_date` (business dates)

Query `INFORMATION_SCHEMA.COLUMNS` if you need to see column names.
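Once the column names are known, choosing a candidate can be sketched as below. This is a minimal illustration, not part of the skill: the function name `rank_timestamp_columns` is hypothetical, and the priority order (ETL columns before application and business dates) is an assumption based on the order of the list above.

```python
# Candidate names from the list above. The priority order is an assumption:
# ETL load columns first, then application timestamps, then business dates.
CANDIDATES = [
    "_loaded_at", "_updated_at", "_created_at",
    "load_date", "etl_timestamp", "ingestion_time",
    "updated_at", "created_at", "modified_at",
    "date", "event_date", "transaction_date",
]


def rank_timestamp_columns(column_names):
    """Return the columns that match a known name, best candidate first.

    Matching is case-insensitive because warehouses differ in how they
    report identifier case. The original spelling is preserved in the result.
    """
    by_lower = {name.lower(): name for name in column_names}
    return [by_lower[c] for c in CANDIDATES if c in by_lower]
```

An empty result corresponds to the "Unknown" status: no recognizable timestamp column, so freshness cannot be determined from the table alone.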
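```sql
-- Most recent value of the timestamp column and its age.
-- Snowflake-style functions; other warehouses may name these differently.
SELECT
    MAX(<timestamp_column>) as last_update,
    CURRENT_TIMESTAMP() as current_time,
    TIMESTAMPDIFF('hour', MAX(<timestamp_column>), CURRENT_TIMESTAMP()) as hours_ago,
    TIMESTAMPDIFF('minute', MAX(<timestamp_column>), CURRENT_TIMESTAMP()) as minutes_ago
FROM <table>
```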
For tables with regular updates, check recent activity:
```sql
-- Rows loaded per day over the last 7 days, newest day first.
SELECT
    DATE_TRUNC('day', <timestamp_column>) as day,
    COUNT(*) as row_count
FROM <table>
WHERE <timestamp_column> >= DATEADD('day', -7, CURRENT_DATE())
GROUP BY 1
ORDER BY 1 DESC
```
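Days with no rows never appear in a grouped result, so a gap is easy to miss when reading the output. The sketch below shows one way to make gaps explicit; `missing_days` and its parameters are hypothetical names, and it assumes the query result has already been loaded into a dict mapping each day to its `row_count`.

```python
from datetime import date, timedelta


def missing_days(daily_counts, today, window_days=7):
    """Return the days in the window that have no rows, newest first.

    The window runs from `today` back to `today - window_days` inclusive,
    matching a filter of the form `>= DATEADD('day', -7, CURRENT_DATE())`.
    """
    expected = [today - timedelta(days=n) for n in range(window_days + 1)]
    return [d for d in expected if daily_counts.get(d, 0) == 0]
```

Note that `today` may legitimately be empty if the daily load has not run yet, so a gap on the newest day is weaker evidence than a gap in the middle of the window.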
Report status using this scale:
| Status | Age | Meaning |
|---|---|---|
| Fresh | < 4 hours | Data is current |
| Stale | 4-24 hours | May be outdated, check if expected |
| Very Stale | > 24 hours | Likely a problem unless batch job |
| Unknown | No timestamp | Can't determine freshness |
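The scale above maps directly onto a small function. This is a sketch under one stated assumption: the table does not say which bucket the boundaries belong to, so here exactly 4 hours and exactly 24 hours both count as Stale. The name `classify_freshness` is hypothetical.

```python
def classify_freshness(age_hours):
    """Map a data age in hours to a status from the scale above.

    Pass None when the table has no usable timestamp column.
    Boundary handling (4 and 24 hours count as Stale) is an assumption.
    """
    if age_hours is None:
        return "Unknown"
    if age_hours < 4:
        return "Fresh"
    if age_hours <= 24:
        return "Stale"
    return "Very Stale"
```

For example, the `hours_ago` value from the last-update query can be passed in directly, and a table refreshed by a daily batch job will normally alternate between Fresh and Stale without indicating a problem.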
Check Airflow for the source pipeline:
1. Find the DAG: Which DAG populates this table? Use `af dags list` and look for matching names.
2. Check DAG status:
   - `af dags get <dag_id>`
   - `af dags stats`
3. Diagnose if needed: If the DAG failed, use the debugging-dags skill to investigate.
If you're running on Astro, you can also:
Provide a clear, scannable report:
```
FRESHNESS REPORT
================

TABLE: database.schema.table_name
Last Update: 2024-01-15 14:32:00 UTC
Age: 2 hours 15 minutes
Status: Fresh

TABLE: database.schema.other_table
Last Update: 2024-01-14 03:00:00 UTC
Age: 37 hours
Status: Very Stale
Source DAG: daily_etl_pipeline (FAILED)
Action: Investigate with **debugging-dags** skill
```
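One way to produce an entry in the report layout above is sketched here. The function `format_entry` and its parameters are hypothetical, and it assumes both datetimes are already in UTC. For brevity it always writes "hours" and "minutes" in the plural.

```python
from datetime import datetime, timezone


def format_entry(table, last_update, now, status, source_dag=None, action=None):
    """Render one table's block of the freshness report as text.

    `last_update` and `now` are assumed to be UTC datetimes. Whole hours
    are shown without a minutes part, as in the sample report.
    """
    minutes = int((now - last_update).total_seconds() // 60)
    hours, mins = divmod(minutes, 60)
    age = f"{hours} hours" if mins == 0 else f"{hours} hours {mins} minutes"
    lines = [
        f"TABLE: {table}",
        f"Last Update: {last_update:%Y-%m-%d %H:%M:%S} UTC",
        f"Age: {age}",
        f"Status: {status}",
    ]
    # The DAG and action lines are only included for tables that need follow-up.
    if source_dag:
        lines.append(f"Source DAG: {source_dag}")
    if action:
        lines.append(f"Action: {action}")
    return "\n".join(lines)
```

Keeping the optional lines out of healthy entries keeps the report scannable: only tables that need attention carry a DAG and an action.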
If the user just wants a yes/no answer: