Validates all ID unification files against exact templates with zero tolerance for errors.
/plugin marketplace add treasure-data/aps_claude_tools
/plugin install treasure-data-cdp-unification-plugins-cdp-unification@treasure-data/aps_claude_tools
Model: sonnet
Purpose: Perform comprehensive validation of all generated unification files against exact templates.
Exit Policy: FAIL FAST - Stop at first error and provide exact fix instructions.
Check these files exist:
unification/unif_runner.dig
unification/dynmic_prep_creation.dig
unification/id_unification.dig
unification/enrich_runner.dig
unification/config/environment.yml
unification/config/src_prep_params.yml
unification/config/unify.yml
unification/config/stage_enrich.yml
unification/queries/create_schema.sql
unification/queries/loop_on_tables.sql
unification/queries/unif_input_tbl.sql
unification/enrich/queries/generate_join_query.sql
unification/enrich/queries/execute_join_presto.sql
unification/enrich/queries/execute_join_hive.sql
unification/enrich/queries/enrich_tbl_creation.sql
If ANY file missing:
❌ VALIDATION FAILED - Missing Files
Missing: unification/config/stage_enrich.yml
FIX: Re-run the unification-staging-enricher agent
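The existence check above can be sketched in Python. This is a minimal illustration, not the agent's actual implementation: the `find_missing_files` helper and the `REQUIRED_FILES` constant are hypothetical names, and the sketch assumes the validator is run from the directory that contains `unification/`.

```python
from pathlib import Path

# The 15 files listed above, relative to the project root.
REQUIRED_FILES = [
    "unification/unif_runner.dig",
    "unification/dynmic_prep_creation.dig",
    "unification/id_unification.dig",
    "unification/enrich_runner.dig",
    "unification/config/environment.yml",
    "unification/config/src_prep_params.yml",
    "unification/config/unify.yml",
    "unification/config/stage_enrich.yml",
    "unification/queries/create_schema.sql",
    "unification/queries/loop_on_tables.sql",
    "unification/queries/unif_input_tbl.sql",
    "unification/enrich/queries/generate_join_query.sql",
    "unification/enrich/queries/execute_join_presto.sql",
    "unification/enrich/queries/execute_join_hive.sql",
    "unification/enrich/queries/enrich_tbl_creation.sql",
]


def find_missing_files(root="."):
    """Return the required files that do not exist under root, in list order."""
    base = Path(root)
    return [rel for rel in REQUIRED_FILES if not (base / rel).is_file()]


if __name__ == "__main__":
    missing = find_missing_files()
    if missing:
        print("❌ VALIDATION FAILED - Missing Files")
        for rel in missing:
            print(f"Missing: {rel}")
    else:
        print(f"✅ PASS ({len(REQUIRED_FILES)}/{len(REQUIRED_FILES)} files)")
```

Returning the full list of missing paths, rather than stopping at the first one, lets the caller decide whether to report only the first failure (fail fast) or all of them.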
Read: plugins/cdp-unification/prompt.md lines 184-217
Check:
timezone: UTC (exact match)
config/environment.yml AND config/src_prep_params.yml
require>: dynmic_prep_creation (NOT call>)
require>: id_unification (NOT call>)
require>: enrich_runner (NOT call>)
echo> operators anywhere in file
_error: section starting around line 20
# schedule: section
If ANY check fails:
❌ VALIDATION FAILED - unif_runner.dig Template Mismatch
Line 11: Expected "require>: dynmic_prep_creation"
Found "call>: dynmic_prep_creation.dig"
FIX: Update to use require> operator as per prompt.md template
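One way to detect the `call>` versus `require>` mismatch shown above is a line-by-line scan of the workflow text. The sketch below is illustrative only: `check_require_operators` is a hypothetical helper, it covers just the three `require>` checks from the list, and it matches lines with regular expressions rather than parsing Digdag syntax.

```python
import re

# Workflows that unif_runner.dig must reference with require> (from the checklist above).
REQUIRED_WORKFLOWS = ["dynmic_prep_creation", "id_unification", "enrich_runner"]


def check_require_operators(dig_text):
    """Return (line_number, message) tuples; line_number is None if no line matched."""
    errors = []
    lines = dig_text.splitlines()
    for name in REQUIRED_WORKFLOWS:
        require_re = re.compile(r"^\s*require>:\s*" + re.escape(name) + r"\s*$")
        call_re = re.compile(r"^\s*call>:\s*" + re.escape(name) + r"(\.dig)?\s*$")
        if any(require_re.match(line) for line in lines):
            continue  # this workflow is referenced correctly
        for number, line in enumerate(lines, start=1):
            if call_re.match(line):
                errors.append(
                    (number, f'Expected "require>: {name}", found "{line.strip()}"')
                )
                break
        else:
            # Neither require> nor call> was found for this workflow.
            errors.append((None, f'Expected "require>: {name}", but no matching line was found'))
    return errors
```

A regex scan is enough to produce the line number needed for the error message; the remaining checks in the list (timezone, `_error:` section, and so on) would need their own rules.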
Read: unification/config/src_prep_params.yml
Extract:
alias_as values (e.g., email, user_id, phone)
col.name values (e.g., email_address_std, phone_number_std)
src_tbl value (e.g., snowflake_orders)
Read: unification/config/stage_enrich.yml
RULE 1 - Validate unif_input table:
- table: ${globals.unif_input_tbl}
  key_columns:
    - column: <must be alias_as>  # e.g., email
      key: <must be alias_as>     # e.g., email
Both column and key MUST use values from alias_as
RULE 2 - Validate staging tables:
- table: <must be src_tbl>  # e.g., snowflake_orders (NO _prep!)
  key_columns:
    - column: <must be col.name>  # e.g., email_address_std
      key: <must be alias_as>     # e.g., email
column uses col.name, key uses alias_as
If ANY mapping incorrect:
❌ VALIDATION FAILED - stage_enrich.yml Incorrect Mapping
Table: snowflake_orders
Line 23: column: email
Expected: column: email_address_std (from col.name in src_prep_params.yml)
FIX: Apply RULE 2 - staging tables use col.name → alias_as mapping
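RULE 1 and RULE 2 can be expressed as a small check over the parsed YAML. The following is a sketch under stated assumptions: `check_stage_enrich` is a hypothetical helper, `prep_tbls` is assumed to be a list of entries with `src_tbl` and a `columns` list of `name`/`alias_as` pairs, and `enrich_tables` is assumed to be the list of `table`/`key_columns` entries from stage_enrich.yml. The real files may nest these lists under other keys.

```python
UNIF_INPUT_TABLE = "${globals.unif_input_tbl}"


def check_stage_enrich(prep_tbls, enrich_tables):
    """Apply RULE 1 and RULE 2 to already-parsed YAML data; return error strings."""
    # Every alias_as value across all prep tables (used by RULE 1).
    aliases = {col["alias_as"] for tbl in prep_tbls for col in tbl["columns"]}
    # Per source table: col.name -> alias_as (used by RULE 2).
    name_to_alias = {
        tbl["src_tbl"]: {col["name"]: col["alias_as"] for col in tbl["columns"]}
        for tbl in prep_tbls
    }
    errors = []
    for entry in enrich_tables:
        table = entry["table"]
        for pair in entry["key_columns"]:
            column, key = pair["column"], pair["key"]
            if table == UNIF_INPUT_TABLE:
                # RULE 1: both column and key come from alias_as.
                if column not in aliases or key not in aliases:
                    errors.append(f"{table}: column and key must both be alias_as values (RULE 1)")
            elif table not in name_to_alias:
                # Catches names such as snowflake_orders_prep.
                errors.append(f"{table}: not a src_tbl in src_prep_params.yml (RULE 2)")
            elif name_to_alias[table].get(column) != key:
                # RULE 2: column is col.name and key is its alias_as.
                errors.append(
                    f"{table}: column {column} with key {key} is not a col.name -> alias_as pair (RULE 2)"
                )
    return errors
```

For example, an entry for `snowflake_orders` with `column: email` and `key: email` yields one RULE 2 error, because `email` is an `alias_as` value and not a `col.name`.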
Read: plugins/cdp-unification/agents/unification-staging-enricher.md lines 261-299
Check exact match for:
_export: with 3 includes + td.database
+enrich: with _parallel: true
+execute_canonical_id_join: with _parallel: true
td_for_each>: enrich/queries/generate_join_query.sql
if>: ${td.each.engine.toLowerCase() == "presto"}
If mismatch:
❌ VALIDATION FAILED - enrich_runner.dig Template Mismatch
Expected exact template from unification-staging-enricher.md lines 261-299
FIX: Regenerate using unification-staging-enricher agent
Read environment.yml to get:
client_short_name (e.g., client)
src, stg, gld, lkup suffixes
Read unify.yml to get:
unif_name (e.g., customer_360)
Use MCP tools to check:
# Check databases exist
databases_to_check = [
    f"{client_short_name}_{src}",     # e.g., client_src
    f"{client_short_name}_{stg}",     # e.g., client_stg
    f"{client_short_name}_{gld}",     # e.g., client_gld
    f"{client_short_name}_{lkup}",    # e.g., client_config
    f"cdp_unification_{unif_name}"    # e.g., cdp_unification_customer_360
]
for db in databases_to_check:
    result = mcp__demo_treasuredata__list_tables(database=db)
    if error:
        FAIL with message:
            ❌ Database {db} does NOT exist
            FIX: td db:create {db}
Check exclusion_list table:
result = mcp__demo_treasuredata__describe_table(
    table="exclusion_list",
    database=f"{client_short_name}_{lkup}"
)
if error or not exists:
    FAIL with:
        ❌ Table {client_short_name}_{lkup}.exclusion_list does NOT exist
        FIX: td query -d {client_short_name}_{lkup} -t presto -w "CREATE TABLE IF NOT EXISTS exclusion_list (key_value VARCHAR, key_name VARCHAR, tbls ARRAY(VARCHAR), note VARCHAR)"
Read src_prep_params.yml:
prep_tbls:
  - src_tbl: snowflake_orders
    src_db: ${client_short_name}_${stg}
For each prep table:
table_name = prep_tbl["src_tbl"]
database = resolve_vars(prep_tbl["src_db"])  # e.g., client_stg
result = mcp__demo_treasuredata__describe_table(
    table=table_name,
    database=database
)
if error:
    FAIL with:
        ❌ Source table {database}.{table_name} does NOT exist
        FIX: Verify table exists or re-run staging transformation
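The pseudocode above calls `resolve_vars` without defining it. A minimal sketch of what such a helper might look like follows, assuming it only needs to substitute `${name}` placeholders with values read from environment.yml. Unlike the one-argument call above, this hypothetical version takes the variable mapping as an explicit second argument.

```python
import re

# Matches ${name} placeholders such as ${client_short_name}.
_PLACEHOLDER = re.compile(r"\$\{([^}]+)\}")


def resolve_vars(template, variables):
    """Replace each ${name} in template with variables[name].

    Raises KeyError when a placeholder has no value, so an unresolved
    database name is never passed on to a table lookup.
    """
    def substitute(match):
        name = match.group(1).strip()
        if name not in variables:
            raise KeyError(f"Undefined variable: {name}")
        return str(variables[name])

    return _PLACEHOLDER.sub(substitute, template)
```

With `{"client_short_name": "client", "stg": "stg"}` as the mapping, `resolve_vars("${client_short_name}_${stg}", ...)` produces `client_stg`, the example value given in the comment above.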
For each column in prep_tbls.columns:
schema = mcp__demo_treasuredata__describe_table(table=table_name, database=database)
for col in prep_tbl["columns"]:
    col_name = col["name"]  # e.g., email_address_std
    if col_name not in [s.column_name for s in schema]:
        FAIL with:
            ❌ Column {col_name} does NOT exist in {database}.{table_name}
            FIX: Verify column name or update src_prep_params.yml
Read unify.yml merge_by_keys:
merge_by_keys: [email, user_id, phone]
Read src_prep_params.yml alias_as values:
columns:
  - alias_as: email
  - alias_as: user_id
  - alias_as: phone
Check:
merge_keys = set(unify_yml["merge_by_keys"])
# alias_as values live under the columns list of each prep_tbls entry
alias_keys = {
    col["alias_as"]
    for prep_tbl in prep_params["prep_tbls"]
    for col in prep_tbl["columns"]
}
if merge_keys != alias_keys:
    FAIL with:
        ❌ unify.yml merge_by_keys MISMATCH with src_prep_params.yml alias_as
        Expected: {alias_keys}
        Found: {merge_keys}
        FIX: Update unify.yml to match src_prep_params.yml
For each YAML file:
import yaml
yaml_files = [
    "unification/config/environment.yml",
    "unification/config/src_prep_params.yml",
    "unification/config/unify.yml",
    "unification/config/stage_enrich.yml"
]
for file_path in yaml_files:
    try:
        with open(file_path) as f:
            yaml.safe_load(f)
    except yaml.YAMLError as e:
        # problem_mark.line is 0-based, so add 1 to report a human-readable line number
        FAIL with:
            ❌ YAML Syntax Error in {file_path}
            Line {e.problem_mark.line + 1}: {e.problem}
            FIX: Fix YAML syntax error
Check for tabs:
for file_path in yaml_files:
    content = read_file(file_path)
    if '\t' in content:
        FAIL with:
            ❌ YAML file contains TABS: {file_path}
            FIX: Replace all tabs with spaces (2 spaces per indent level)
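The tab check above reports only the file name. If exact fix instructions are wanted, the offending line numbers can be collected as well. A small sketch, using a hypothetical `find_tab_lines` helper that works on the file's text:

```python
def find_tab_lines(text):
    """Return the 1-based numbers of lines that contain a tab character."""
    return [
        number
        for number, line in enumerate(text.splitlines(), start=1)
        if "\t" in line
    ]
```

Like the check above, this flags a tab anywhere on a line, not only in the indentation.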
Success Report:
╔══════════════════════════════════════════════════════════════╗
║ ID UNIFICATION VALIDATION REPORT ║
╚══════════════════════════════════════════════════════════════╝
[1/5] File Existence Check .......... ✅ PASS (15/15 files)
[2/5] Template Compliance Check ..... ✅ PASS (12/12 checks)
[3/5] Database & Table Existence .... ✅ PASS (6/6 resources)
[4/5] Configuration Validation ...... ✅ PASS (8/8 checks)
[5/5] YAML Syntax Check ............. ✅ PASS (4/4 files)
╔══════════════════════════════════════════════════════════════╗
║ VALIDATION SUMMARY ║
╚══════════════════════════════════════════════════════════════╝
Total Checks: 45
Passed: 45 ✅
Failed: 0 ❌
✅ VALIDATION PASSED - READY FOR DEPLOYMENT
Next Steps:
1. Deploy workflows: td wf push unification
2. Execute: td wf start unification unif_runner --session now
3. Monitor: td wf session <session_id>
Failure Report:
╔══════════════════════════════════════════════════════════════╗
║ ID UNIFICATION VALIDATION REPORT ║
╚══════════════════════════════════════════════════════════════╝
[1/5] File Existence Check .......... ✅ PASS (15/15 files)
[2/5] Template Compliance Check ..... ❌ FAIL (2 errors)
❌ unif_runner.dig line 11: Uses call> instead of require>
FIX: Change "call>: dynmic_prep_creation.dig" to "require>: dynmic_prep_creation"
❌ stage_enrich.yml line 23: Incorrect column mapping
Expected: column: email_address_std (from col.name)
Found: column: email
FIX: Apply RULE 2 for staging tables
[3/5] Database & Table Existence .... ❌ FAIL (1 error)
❌ client_config.exclusion_list does NOT exist
FIX: td query -d client_config -t presto -w "CREATE TABLE IF NOT EXISTS exclusion_list (key_value VARCHAR, key_name VARCHAR, tbls ARRAY(VARCHAR), note VARCHAR)"
[4/5] Configuration Validation ...... ✅ PASS (8/8 checks)
[5/5] YAML Syntax Check ............. ✅ PASS (4/4 files)
╔══════════════════════════════════════════════════════════════╗
║ VALIDATION SUMMARY ║
╚══════════════════════════════════════════════════════════════╝
Total Checks: 45
Passed: 42 ✅
Failed: 3 ❌
❌ VALIDATION FAILED - DO NOT DEPLOY
Required Actions:
1. Fix unif_runner.dig line 11 (use require> operator)
2. Fix stage_enrich.yml line 23 (use correct column mapping)
3. Create exclusion_list table
Re-run validation after fixes: /cdp-unification:unify-validate
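The summary figures in both reports follow from the per-step counts. The sketch below shows one way to derive them, assuming each reported error counts as exactly one failed check, which is consistent with the 45/42/3 totals in the failure report. The `summarize` and `verdict` helpers and the tuple layout are hypothetical.

```python
def summarize(steps):
    """steps: list of (step_name, check_count, error_messages) tuples.

    Returns (total, passed, failed), counting one failed check per error.
    """
    total = sum(count for _, count, _ in steps)
    failed = sum(len(errors) for _, _, errors in steps)
    return total, total - failed, failed


def verdict(steps):
    """Return the final report line for the given step results."""
    _, _, failed = summarize(steps)
    if failed == 0:
        return "✅ VALIDATION PASSED - READY FOR DEPLOYMENT"
    return "❌ VALIDATION FAILED - DO NOT DEPLOY"
```

Feeding in the five steps from the failure report (15, 12, 6, 8, and 4 checks, with two errors in step 2 and one in step 3) gives a total of 45, with 42 passed and 3 failed.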
This agent MUST be called:
td wf push command
/unify-setup workflow
VALIDATION IS MANDATORY - NO EXCEPTIONS