Complete end-to-end hybrid ID unification setup - automatically analyzes tables, generates config, creates SQL, and executes workflow for Snowflake and Databricks
Automates end-to-end hybrid ID unification setup for Snowflake and Databricks, analyzing tables, generating configuration and SQL, and executing workflows.
/plugin marketplace add treasure-data/aps_claude_tools
/plugin install treasure-data-cdp-hybrid-idu-plugins-cdp-hybrid-idu@treasure-data/aps_claude_tools
I'll guide you through the complete hybrid ID unification setup process for Snowflake and/or Databricks platforms. This is an automated, end-to-end workflow that will:
Key Features:
Snowflake table format: database.schema.table or schema.table or table
Databricks table format: catalog.schema.table or schema.table or table
Canonical ID name (e.g., td_id, unified_customer_id)
Note: The system will automatically:
For Databricks:
For Snowflake:
For Databricks:
For Snowflake:
I'll use the hybrid-unif-config-creator command to automatically generate your unify.yml file:
Automated Analysis Approach (Recommended):
What I'll do:
Alternative - Manual Configuration:
I'll help you:
For Databricks (if selected): I'll call the databricks-sql-generator agent to:
Run the yaml_unification_to_databricks.py script
Write the generated SQL to databricks_sql/unify/
For Snowflake (if selected): I'll call the snowflake-sql-generator agent to:
Run the yaml_unification_to_snowflake.py script
Write the generated SQL to snowflake_sql/unify/
For Databricks (if execution requested): I'll call the databricks-workflow-executor agent to:
Run the databricks_sql_executor.py script
For Snowflake (if execution requested): I'll call the snowflake-workflow-executor agent to:
Run the snowflake_sql_executor.py script
I'll provide:
This command orchestrates the complete end-to-end flow by calling specialized commands in sequence:
I'll ask you for:
Then I'll:
Run /cdp-hybrid-idu:hybrid-unif-config-creator internally
Generate unify.yml with strict PII detection
I'll ask you:
Then I'll:
Run /cdp-hybrid-idu:hybrid-generate-snowflake (if Snowflake selected)
Run /cdp-hybrid-idu:hybrid-generate-databricks (if Databricks selected)
I'll ask you:
Then I'll:
Run /cdp-hybrid-idu:hybrid-execute-snowflake (if Snowflake selected)
Run /cdp-hybrid-idu:hybrid-execute-databricks (if Databricks selected)
Throughout the process:
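The three phases described in this section (config creation, SQL generation, optional execution) can be sketched as a small planning function. This is only an illustration: `plan_commands` and its parameters are hypothetical names, not part of the plugin, and the real command is driven by interactive prompts rather than function arguments. Only the slash-command names are taken from the text above.

```python
PLUGIN = "cdp-hybrid-idu"
SUPPORTED = ("snowflake", "databricks")  # the two platforms named in this document


def plan_commands(platforms, execute=False):
    """Return the slash commands to run, in order, for the chosen platforms.

    Hypothetical helper: it only mirrors the sequence described in the text.
    """
    selected = {p.strip().lower() for p in platforms}
    unknown = selected - set(SUPPORTED)
    if unknown:
        raise ValueError(f"unsupported platform(s): {sorted(unknown)}")

    # Phase 1: the config creator always runs first and produces unify.yml.
    commands = [f"/{PLUGIN}:hybrid-unif-config-creator"]

    # Phase 2: one SQL generation command per selected platform.
    for platform in SUPPORTED:
        if platform in selected:
            commands.append(f"/{PLUGIN}:hybrid-generate-{platform}")

    # Phase 3: execution happens only when explicitly requested.
    if execute:
        for platform in SUPPORTED:
            if platform in selected:
                commands.append(f"/{PLUGIN}:hybrid-execute-{platform}")

    return commands


if __name__ == "__main__":
    for command in plan_commands(["Snowflake"], execute=True):
        print(command)
```

Keeping execution behind an explicit flag matches the "if execution requested" wording: generating SQL is safe to repeat, while running it against a warehouse is not.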
For Databricks:
databricks_sql/unify/
├── 01_create_graph.sql              # Initialize identity graph
├── 02_extract_merge.sql             # Extract and merge identities
├── 03_source_key_stats.sql          # Source statistics
├── 04_unify_loop_iteration_*.sql    # Iterative unification (N files)
├── 05_canonicalize.sql              # Canonical ID creation
├── 06_result_key_stats.sql          # Result statistics
├── 10_enrich_*.sql                  # Source table enrichment (N files)
├── 20_master_*.sql                  # Master table creation (N files)
├── 30_unification_metadata.sql      # Metadata tables
├── 31_filter_lookup.sql             # Validation rules
└── 32_column_lookup.sql             # Column mappings
For Snowflake:
snowflake_sql/unify/
├── 01_create_graph.sql              # Initialize identity graph
├── 02_extract_merge.sql             # Extract and merge identities
├── 03_source_key_stats.sql          # Source statistics
├── 04_unify_loop_iteration_*.sql    # Iterative unification (N files)
├── 05_canonicalize.sql              # Canonical ID creation
├── 06_result_key_stats.sql          # Result statistics
├── 10_enrich_*.sql                  # Source table enrichment (N files)
├── 20_master_*.sql                  # Master table creation (N files)
├── 30_unification_metadata.sql      # Metadata tables
├── 31_filter_lookup.sql             # Validation rules
└── 32_column_lookup.sql             # Column mappings
Configuration:
unify.yml # YAML configuration (created interactively)
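The numeric prefixes in the listings above suggest an execution order. As a minimal sketch, assuming files are meant to run in prefix order (how the actual executor scripts order them is not stated here), a natural sort keeps loop iterations in sequence even when their suffixes are not zero-padded. The iteration and enrich file names in the example are hypothetical, since the real suffix pattern behind `*` is not given.

```python
import re


def natural_key(name):
    """Split a file name into text and integer runs so that 10 sorts after 2."""
    return [int(tok) if tok.isdigit() else tok for tok in re.split(r"(\d+)", name)]


def execution_order(filenames):
    """Return SQL file names sorted by numeric prefix, then by any later numbers."""
    return sorted(filenames, key=natural_key)


if __name__ == "__main__":
    # Hypothetical file names that follow the patterns in the listing.
    generated = [
        "10_enrich_customers.sql",
        "04_unify_loop_iteration_10.sql",
        "04_unify_loop_iteration_2.sql",
        "01_create_graph.sql",
        "05_canonicalize.sql",
    ]
    for name in execution_order(generated):
        print(name)
```

A plain lexicographic sort would place `04_unify_loop_iteration_10.sql` before `04_unify_loop_iteration_2.sql`, which is why the key converts digit runs to integers.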
All generated files will:
Ready to begin? I'll use the hybrid-unif-config-creator to automatically analyze your tables and generate the YAML configuration.
Please provide:
Platform: Which platform contains your data?
Tables: Which source tables should I analyze?
Snowflake format: database.schema.table or schema.table or table
Databricks format: catalog.schema.table or schema.table or table
Example: customer_db.public.customers, orders, web_events.user_activity
Canonical ID Name: What should I call the unified ID?
Examples: td_id, unified_customer_id, master_id
Default: td_id
Merge Iterations (optional): How many unification loops?
Target Platform(s) for SQL generation:
Example:
I want to set up hybrid ID unification for:
Platform: Snowflake
Tables:
- customer_db.public.customer_profiles
- customer_db.public.orders
- marketing_db.public.campaigns
- event_db.public.web_events
Canonical ID: unified_customer_id
Merge Iterations: 10
Generate SQL for: Snowflake (or both Snowflake and Databricks)
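The table names in a request like this one may have one, two, or three dot-separated parts. A minimal sketch of splitting them is shown below; `TableRef` and `parse_table_ref` are hypothetical names for illustration, not part of the plugin, and the sketch assumes unquoted identifiers (a quoted name containing a dot would need a proper parser).

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class TableRef:
    namespace: Optional[str]  # database (Snowflake) or catalog (Databricks)
    schema: Optional[str]
    table: str


def parse_table_ref(ref: str) -> TableRef:
    """Parse 'table', 'schema.table', or 'database.schema.table'."""
    parts = [part.strip() for part in ref.strip().split(".")]
    if not 1 <= len(parts) <= 3 or any(not part for part in parts):
        raise ValueError(
            f"expected table, schema.table, or database.schema.table, got {ref!r}"
        )
    # Left-pad with None so the table name is always the last field.
    padded = [None] * (3 - len(parts)) + parts
    return TableRef(*padded)


if __name__ == "__main__":
    for ref in ["customer_db.public.customers", "orders", "web_events.user_activity"]:
        print(parse_table_ref(ref))
```

Missing parts are left as `None` rather than guessed, so a caller can fill them from the session's default database and schema.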
What I'll do next:
Let's get started with your hybrid ID unification setup!