From PostHog

Tunes sync configurations for existing data warehouse schemas by switching sync types, updating incremental fields, primary keys, CDC modes, or frequencies. Use for fixing failing syncs, adjusting sync speed, or handling schema changes.

Install:

```
npx claudepluginhub anthropics/claude-plugins-official --plugin posthog
```

This skill uses the workspace's default tool permissions.
A sync's configuration lives on the ExternalDataSchema and can be changed any time via
external-data-schemas-partial-update. Most changes are non-destructive (take effect on the next sync), but a few
(switching sync_type, changing primary keys) require careful handling to avoid corrupting the synced data.
If the user is setting up a brand-new source, use setting-up-a-data-warehouse-source instead — configuration is
chosen at creation time there.
| Tool | Purpose |
|---|---|
| external-data-schemas-retrieve | Current sync_type, incremental_field, PKs, sync_frequency |
| external-data-schemas-incremental-fields-create | Refresh candidate incremental fields from the live source |
| external-data-schemas-partial-update | Apply the config change |
| external-data-schemas-reload | Trigger a sync with the new config |
| external-data-schemas-resync | Wipe and re-import from scratch when the change invalidates existing data |
| external-data-sources-check-cdc-prerequisites-create | Pre-flight check for Postgres CDC (only when switching to/from CDC) |
| external-data-sources-webhook-info-retrieve | Current webhook state (when switching to/from sync_type=webhook) |
| external-data-sources-create-webhook-create | Register a webhook after switching a schema to sync_type=webhook |
| external-data-sources-update-webhook-inputs-create | Rotate a webhook signing secret |
| external-data-sources-delete-webhook-create | Unregister the webhook when switching schemas off sync_type=webhook |
| external-data-schemas-delete-data | Drop the synced table while keeping the schema entry |
From the partial-update endpoint:
| Field | Values | Notes |
|---|---|---|
| sync_type | full_refresh, incremental, append, cdc, webhook | Source must support the target type — check via incremental-fields |
| incremental_field | Column name from the source | Must appear in the incremental_fields list for the schema |
| incremental_field_type | datetime, date, timestamp, integer, numeric, objectid | Must match the column's real type |
| primary_key_columns | Array of column names | Required for CDC. Used for upsert dedup on incremental |
| cdc_table_mode | consolidated, cdc_only, both | Only meaningful when sync_type=cdc |
| sync_frequency | 1min, 5min, 15min, 30min, 1hour, 6hour, 12hour, 24hour, 7day, 30day, never | Applies to all non-CDC types |
| sync_time_of_day | HH:MM:SS | Applies when sync_frequency is daily/weekly-scale |
| should_sync | true / false | Pause the schema without deleting it |
Always start with external-data-schemas-retrieve({id}). Understanding the current state prevents mistakes like
"fixing" an incremental_field that's actually correct.
Note in particular:

- sync_type, incremental_field, incremental_field_type, primary_key_columns
- status (don't tune a schema that's currently Running — wait or cancel first)
- last_synced_at (so you can tell if the next sync worked)
- latest_error if present (the error often tells you exactly what to change)

Call external-data-schemas-incremental-fields-create({id}). Even though the operation name says "create", it re-reads the source and returns the current candidate fields — use it to confirm the field you want to set actually exists on the source and which sync types are now available for this table.
The response:

```json
{
  "incremental_fields": [{"field": "updated_at", "type": "datetime", ...}, ...],
  "incremental_available": true,
  "append_available": true,
  "cdc_available": true,
  "full_refresh_available": true,
  "detected_primary_keys": ["id"],
  "available_columns": [...]
}
```
If your target incremental_field isn't in the list, tell the user — they need to either pick a different field or
change the source table to add one.
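As an illustration of that check, the availability test can be scripted against the incremental-fields-create response before touching the schema. The helpers below are hypothetical (not part of the PostHog API); they only assume the response shape shown above.

```python
# Hypothetical helpers (not PostHog APIs): validate a desired incremental
# field against the incremental-fields-create response before updating.
def pick_incremental_field(response, wanted):
    """Return the candidate entry for `wanted`, or None if the live
    source no longer exposes it as an incremental field."""
    for candidate in response.get("incremental_fields", []):
        if candidate["field"] == wanted:
            return candidate
    return None

def incremental_update_payload(response, wanted):
    """Build the partial-update body, or raise if the field is gone."""
    entry = pick_incremental_field(response, wanted)
    if entry is None:
        raise ValueError(
            f"{wanted!r} is not an incremental field on this source; "
            "pick another field or fall back to full_refresh"
        )
    return {
        "sync_type": "incremental",
        "incremental_field": entry["field"],
        # The type comes from the source, so it matches the column's real type.
        "incremental_field_type": entry["type"],
    }
```

The raised error mirrors the advice above: surface the problem to the user rather than guessing a field.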
Call external-data-schemas-partial-update({id}, {...changed fields}).
Only send the fields that are actually changing. Partial update means unspecified fields stay as they are.
Examples:

```jsonc
// Switch from full_refresh to incremental
{
  "sync_type": "incremental",
  "incremental_field": "updated_at",
  "incremental_field_type": "datetime"
}

// Change sync frequency to hourly
{"sync_frequency": "1hour"}

// Fix wrong PK on a CDC table
{"primary_key_columns": ["tenant_id", "order_id"]}

// Pause a schema
{"should_sync": false}
```
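One mechanical way to honor the "only send what changes" rule is to diff the desired config against the current schema. This `build_patch` helper is a sketch, not an official client function:

```python
def build_patch(current, desired):
    """Return only the keys whose desired value differs from the current
    ExternalDataSchema config -- a minimal partial-update payload."""
    return {k: v for k, v in desired.items() if current.get(k) != v}
```

For example, if only the frequency changed, the payload collapses to `{"sync_frequency": "1hour"}` and every other field stays untouched on the server.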
This is the step that's easy to get wrong. Some config changes invalidate the synced data; others don't.
Changes that DON'T invalidate existing data:

- sync_frequency, sync_time_of_day — scheduling only
- should_sync — on/off
- cdc_table_mode in most cases — the next sync will start writing to the new shape, but historical consolidated rows stay valid
- Switching between incremental and full_refresh with the same incremental_field — the next sync just re-runs fresh
- sync_type: "webhook" — the synced data stays valid; only the ingestion path changes. Remember to register or unregister the webhook (see sections below) alongside the sync_type change.

Changes that MAY invalidate existing data and need a resync:

- incremental_field to a different column — the high-water mark is from the old column and won't match. Without a resync you'll miss rows that were updated between the two fields' histories.
- primary_key_columns — existing rows may be deduplicated incorrectly against the new PK definition.
- full_refresh to append — the existing rows don't have the version-history shape that append expects.
- append to full_refresh — the opposite problem; you'll end up with duplicate historical versions.
- Switching to or from cdc — the table shape changes fundamentally.

When the change invalidates data, the clean flow is:

1. external-data-schemas-partial-update with the new config
2. external-data-schemas-resync to wipe and re-import under the new config

Or equivalently, external-data-schemas-delete-data → external-data-schemas-reload. delete-data + reload is cleaner when the table is large and the user wants to start from zero.
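The rules above can be sketched as a small classifier. This is an illustration of the decision logic, not library code; it assumes `old` is the current schema config and `patch` is the partial-update body:

```python
# Type switches from the destructive list above.
DESTRUCTIVE_TYPE_SWITCHES = {
    ("full_refresh", "append"),  # rows lack the version-history shape append expects
    ("append", "full_refresh"),  # would leave duplicate historical versions
}

def change_needs_resync(old, patch):
    """Rough classifier: does this partial-update invalidate synced data?"""
    merged_type = patch.get("sync_type", old.get("sync_type"))
    # Re-defining primary keys changes upsert dedup -> resync.
    if "primary_key_columns" in patch and \
            patch["primary_key_columns"] != old.get("primary_key_columns"):
        return True
    # Moving the high-water mark to a different column -> resync.
    if old.get("incremental_field") and \
            patch.get("incremental_field", old["incremental_field"]) != old["incremental_field"]:
        return True
    if merged_type != old.get("sync_type"):
        if "cdc" in (old.get("sync_type"), merged_type):
            return True  # table shape changes fundamentally
        if (old.get("sync_type"), merged_type) in DESTRUCTIVE_TYPE_SWITCHES:
            return True
    # Scheduling, pause, webhook switches, same-field type flips: non-destructive.
    return False
```

When this returns True, follow the partial-update → resync flow described above (or delete-data → reload).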
For non-destructive changes, call external-data-schemas-reload({id}) to pick up the new config immediately rather
than waiting for the schedule.
Wait a moment, then external-data-schemas-retrieve({id}) to confirm status = Running then Completed. Report
last_synced_at and any new latest_error.
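Interpreting that follow-up retrieve response can be mechanical. `summarize_sync` below is a hypothetical helper over the fields named above (status, latest_error, last_synced_at), not part of the API:

```python
def summarize_sync(schema):
    """Turn a retrieve response into a one-line verdict for the user."""
    status = schema.get("status")
    if status == "Running":
        return "sync in progress - check again shortly"
    if schema.get("latest_error"):
        # Surface the error verbatim; it often names the exact fix.
        return f"sync failed: {schema['latest_error']}"
    if status == "Completed":
        return f"sync ok, last_synced_at={schema.get('last_synced_at')}"
    return f"unexpected status: {status}"
```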
Switch full_refresh to incremental:

1. incremental-fields-create to confirm the desired field exists and incremental_available: true.
2. partial-update: {sync_type: "incremental", incremental_field, incremental_field_type}.

Switch to CDC:

1. external-data-sources-check-cdc-prerequisites-create on the parent source. Only proceed if valid: true.
2. incremental-fields-create to confirm cdc_available: true and see detected_primary_keys.
3. partial-update: {sync_type: "cdc", primary_key_columns: [...], cdc_table_mode: "consolidated"}.
4. external-data-schemas-resync after the update. Warn the user this wipes existing data.

Incremental field disappeared — source dropped the updated_at column and the sync has been failing with "column does not exist":

1. incremental-fields-create to see what fields remain.
2. Pick a replacement (or full_refresh if none are suitable).
3. partial-update with the new field + type (or new sync_type).
4. reload to retry.

Fix wrong primary keys:

1. partial-update: {primary_key_columns: [...]}.
2. resync, warn the user.

Change frequency:

1. partial-update: {sync_frequency: "1hour"}.

Switch to sync_type: "webhook" — only works for sources that implement WebhookSource (today: Stripe) and tables where supports_webhooks: true from incremental-fields-create:

1. incremental-fields-create to confirm supports_webhooks: true for the table.
2. partial-update: {sync_type: "webhook"}.
3. If the source has no webhook registered yet (check webhook-info-retrieve), call external-data-sources-create-webhook-create({source_id}) to register it.
4. Keep a sync_frequency set (e.g. 24hour) — it acts as a safety-net reconciliation in case any webhook delivery is missed.

Switch off sync_type: "webhook":

1. partial-update: {sync_type: "incremental"} (or whatever bulk type is appropriate) with the required incremental_field + incremental_field_type.
2. If no other schema on the source still uses sync_type: "webhook", call external-data-sources-delete-webhook-create({source_id}) to unregister. Leaving an orphaned webhook registered on the source side just means events will be received and dropped — not harmful, but messy.

Rotate a webhook signing secret — the source's signing secret (e.g. Stripe's whsec_...) was rotated, and payloads are now failing signature verification:
1. external-data-sources-update-webhook-inputs-create({source_id}, {inputs: {signing_secret: "whsec_..."}}).

Pause / resume:

1. partial-update: {should_sync: false}. The schema stops syncing but stays configured.
2. To resume: partial-update: {should_sync: true}, then reload for an immediate run.

Gotchas:

- partial-update doesn't complain if you set a field to the value it already had, but you might be about to change something you didn't realize was already set.
- The incremental-fields-create response tells you what's available right now, which can be different from what was available at creation (e.g. CDC may have been enabled for the team since).
- Don't set sync_type: "cdc" without running check-cdc-prerequisites-create first. The sync will just fail immediately.
- If a sync is currently Running, call external-data-schemas-cancel before applying the change. Updating config mid-sync can leave the incremental high-water mark inconsistent.
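Those gotchas can be folded into a single pre-flight check run before any partial-update. `preflight_warnings` is an illustrative sketch, assuming a schema dict shaped like the retrieve response and a separate CDC prerequisite result:

```python
def preflight_warnings(schema, patch, cdc_check=None):
    """Return blockers/warnings to resolve before applying a partial-update.

    schema:    current retrieve response for the ExternalDataSchema
    patch:     the intended partial-update body
    cdc_check: result of check-cdc-prerequisites-create, if already run
    """
    warnings = []
    if schema.get("status") == "Running":
        warnings.append("sync is Running: cancel (external-data-schemas-cancel) or wait first")
    if patch.get("sync_type") == "cdc":
        if cdc_check is None:
            warnings.append("run external-data-sources-check-cdc-prerequisites-create first")
        elif not cdc_check.get("valid"):
            warnings.append("CDC prerequisites not met on the source")
    # Fields already set to the requested value: harmless, but worth flagging.
    no_ops = [k for k, v in patch.items() if schema.get(k) == v]
    if no_ops:
        warnings.append(f"no-op fields already set: {no_ops}")
    return warnings
```

An empty list means the update is safe to send as-is.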