Transform a single database table to staging format with data quality improvements, PII handling, and JSON extraction
Transforms database tables to staging format with data quality improvements and PII handling.
/plugin marketplace add treasure-data/aps_claude_tools/plugin install treasure-data-cdp-staging-plugins-cdp-staging@treasure-data/aps_claude_toolsI'll help you transform a database table to staging format using the appropriate staging-transformer sub-agent (Presto/Trino or Hive).
FIRST, use the AskUserQuestion tool to interactively collect all required parameters.
Call AskUserQuestion with these questions:
{
"questions": [
{
"question": "Which table do you want to transform? (Format: database.table_name, e.g., client_src.customers_histunion)",
"header": "Source Table",
"multiSelect": false,
"options": [
{
"label": "client_src.*",
"description": "Table from client_src database - I'll specify the table name"
},
{
"label": "demo_db.*",
"description": "Table from demo_db database - I'll specify the table name"
}
]
},
{
"question": "Which SQL engine should be used for the transformation?",
"header": "SQL Engine",
"multiSelect": false,
"options": [
{
"label": "Presto/Trino",
"description": "Modern SQL engine with better performance (recommended, default)"
},
{
"label": "Hive",
"description": "MapReduce-based engine for batch processing or large datasets"
}
]
},
{
"question": "Staging database name? (Default: client_stg)",
"header": "Staging DB",
"multiSelect": false,
"options": [
{
"label": "client_stg",
"description": "Use default staging database (recommended)"
}
]
},
{
"question": "Config/Lookup database name? (Default: client_config)",
"header": "Config DB",
"multiSelect": false,
"options": [
{
"label": "client_config",
"description": "Use default config database (recommended)"
}
]
}
]
}
After collecting answers:
Then delegate to appropriate staging-transformer agent with all parameters.
I will determine the appropriate sub-agent:
staging-transformer-prestostaging-transformer-hivestaging-transformer-presto (default)I will invoke the appropriate staging-transformer agent with:
The specialized agent will:
staging/init_queries/{source_db}_{table_name}_init.sql (Presto)staging/queries/{source_db}_{table_name}.sql (Presto)staging/queries/{source_db}_{table_name}_upsert.sql (if dedup, Presto)staging_hive/queries/{source_db}_{table_name}.sql (Hive)staging/config/src_params.yml or staging_hive/config/src_params.ymlstaging/staging_transformation.dig or staging_hive/staging_hive.digThe sub-agent ensures complete compliance with all requirements:
✅ Column Limit Management (max 200 columns) ✅ JSON Detection & Extraction (automatic) ✅ Date Processing (4 outputs per date column) ✅ Email/Phone Validation (with hashing) ✅ String Standardization (UPPER, TRIM, NULL handling) ✅ Deduplication Logic (if configured) ✅ Join Processing (if specified) ✅ Incremental Processing (state tracking) ✅ SQL File Creation (init, incremental, upsert) ✅ DIG File Management (conditional creation) ✅ Configuration Update (src_params.yml) ✅ Git Workflow (complete automation) ✅ Treasure Data Compatibility (VARCHAR/BIGINT timestamps)
staging/init_queries/{source_db}_{table_name}_init.sql - Initial load SQLstaging/queries/{source_db}_{table_name}.sql - Incremental SQLstaging/queries/{source_db}_{table_name}_upsert.sql - Upsert SQL (if dedup)staging/config/src_params.yml - Updated configurationstaging/staging_transformation.dig - Workflow (created if not exists)staging_hive/queries/{source_db}_{table_name}.sql - Combined SQLstaging_hive/config/src_params.yml - Updated configurationstaging_hive/staging_hive.dig - Workflow (created if not exists)staging_hive/queries/get_max_time.sql - Template (created if not exists)staging_hive/queries/get_stg_rows_for_delete.sql - Template (created if not exists)User: Transform table client_src.customer_profiles_histunion
→ Engine: Presto (default)
→ Sub-agent: staging-transformer-presto
→ Output: staging/ directory files
User: Transform table client_src.klaviyo_events_histunion using Hive
→ Engine: Hive
→ Sub-agent: staging-transformer-hive
→ Output: staging_hive/ directory files
User: Transform demo_db.orders_histunion
Use demo_db_stg as staging database
Use client_config for lookup
→ Engine: Presto (default)
→ Custom databases applied
Review generated files:
ls -l staging/queries/
ls -l staging/init_queries/
cat staging/config/src_params.yml
Review Pull Request:
Test the workflow:
cd staging
td wf push
td wf run staging_transformation.dig
Monitor execution:
SELECT * FROM client_config.inc_log
WHERE table_name = '{your_table}'
ORDER BY inc_value DESC
LIMIT 1
All transformations will:
Ready to proceed? Please provide the source database and table name, and I'll delegate to the appropriate staging-transformer agent for complete processing.