Execute comprehensive platform migrations to Databricks from legacy systems. Use when migrating from on-premises Hadoop, other cloud platforms, or legacy data warehouses to Databricks. Trigger with phrases like "migrate to databricks", "hadoop migration", "snowflake to databricks", "legacy migration", "data warehouse migration".
Comprehensive migration strategies for moving to Databricks from Hadoop, Snowflake, Redshift, Synapse, or legacy data warehouses.
| Source | Pattern | Complexity | Timeline |
|---|---|---|---|
| On-prem Hadoop | Lift-and-shift + modernize | High | 6-12 months |
| Snowflake | Parallel run + cutover | Medium | 3-6 months |
| AWS Redshift | ETL rewrite + data copy | Medium | 3-6 months |
| Legacy DW (Oracle/Teradata) | Full rebuild | High | 12-18 months |
Inventory all source tables with metadata (size, partitions, dependencies, data classification). Generate a prioritized migration plan with wave assignments.
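The wave-assignment part of planning can be sketched with a dependency-depth heuristic: a table always migrates in a later wave than everything it reads from. The `assign_waves` helper and table names below are hypothetical illustrations, not part of the skill's actual assessment scripts.

```python
from collections import defaultdict

def assign_waves(tables, deps):
    """Assign each table to a migration wave so that a table lands in a
    later wave than all of its upstream dependencies.
    `deps` maps table -> list of upstream tables (hypothetical input)."""
    wave = {}

    def depth(t):
        if t not in wave:
            # A table's wave is one more than its deepest dependency.
            wave[t] = 1 + max((depth(d) for d in deps.get(t, [])), default=0)
        return wave[t]

    for t in tables:
        depth(t)

    plan = defaultdict(list)
    for t in tables:
        plan[wave[t]].append(t)
    return dict(plan)  # {wave_number: [tables]}
```

Tables with no dependencies land in wave 1 and can migrate first; fact tables that join several dimensions naturally fall into later waves.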
Convert source schemas to Delta Lake-compatible types. Handle type conversions (char -> string, tinyint -> int). Enable auto-optimize on target tables.
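A minimal sketch of such a type-conversion mapping is below; the mapping table is illustrative and would need to be extended per source system, it is not the skill's full conversion list.

```python
# Illustrative mapping of legacy warehouse types to Delta Lake types.
TYPE_MAP = {
    "char": "string",
    "varchar": "string",
    "tinyint": "int",
    "smallint": "int",
    "datetime": "timestamp",
}

def convert_type(src_type):
    """Return a Delta-compatible type: drop length qualifiers on mapped
    types (e.g. char(4) -> string) and pass unmapped types such as
    decimal(10,2) through unchanged."""
    base = src_type.split("(")[0].strip().lower()
    return TYPE_MAP.get(base, src_type.lower())
```

Passing unmapped parameterized types through unchanged (rather than stripping their qualifiers) avoids silently losing precision on decimals.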
Batch large tables by partition. Validate that row counts and schemas match after each table migration.
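Partition batching can be sketched as a simple size-capped grouping; the 100 GB cap and the partition sizes in the usage below are arbitrary examples, not Databricks limits.

```python
def batch_partitions(partitions, max_batch_gb=100.0):
    """Group (partition_name, size_gb) pairs into copy batches that stay
    under a size cap, preserving partition order. The cap is an
    assumed tuning knob, not a platform limit."""
    batches, current, size = [], [], 0.0
    for name, gb in partitions:
        # Start a new batch when adding this partition would exceed the cap.
        if current and size + gb > max_batch_gb:
            batches.append(current)
            current, size = [], 0.0
        current.append(name)
        size += gb
    if current:
        batches.append(current)
    return batches
```

Each batch is then copied and count-validated before the next one starts, so a failure affects at most one batch.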
Convert spark-submit/Oozie jobs to Databricks Jobs. Update paths, remove Hive metastore references, and adapt for Unity Catalog.
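One piece of that conversion, rewriting explicit `hive_metastore.db.table` references into Unity Catalog three-level names, can be sketched as a regex transform; the `main` catalog name is an assumption to substitute with your target catalog.

```python
import re

def to_unity_catalog(sql, catalog="main"):
    """Rewrite explicit hive_metastore.db.table references into Unity
    Catalog three-level names (catalog.schema.table). Only handles
    the simple dotted form; quoted identifiers need extra care."""
    return re.sub(r"\bhive_metastore\.(\w+)\.(\w+)\b", rf"{catalog}.\1.\2", sql)
```

A real conversion pass would also rewrite storage paths and job configuration, but the table-reference rewrite is the part that most often breaks queries after cutover.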
Execute the 6-step cutover: validate -> disable source -> final sync -> enable Databricks -> update apps -> monitor. Each step has a rollback procedure.
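The "each step has a rollback procedure" pattern is essentially a rollback stack: on failure, undo completed steps in reverse order. A minimal sketch (not the skill's actual cutover orchestrator):

```python
def run_cutover(steps):
    """Run cutover steps in order; if any step raises, execute the
    rollback procedures of already-completed steps in reverse and
    re-raise. Each step is a (name, action, rollback) triple."""
    completed = []
    try:
        for name, action, rollback in steps:
            action()
            completed.append((name, rollback))
    except Exception:
        # Unwind in reverse: the most recent change is undone first.
        for _, rollback in reversed(completed):
            rollback()
        raise
    return [name for name, _ in completed]
```

In practice each `action`/`rollback` pair would wrap a concrete operation such as disabling source writes or repointing applications, with the monitoring step last.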
See references/implementation.md for assessment scripts, schema conversion, data migration with batching, ETL conversion, and cutover plan generation.
| Error | Cause | Solution |
|---|---|---|
| Schema incompatibility | Unsupported types | Use type conversion mappings |
| Data loss | Truncation during migration | Validate counts at each step |
| Performance issues | Large tables | Use partitioned migration |
| Dependency conflicts | Wrong migration order | Analyze dependencies first |
Quick row-count validation across source and target:

```sql
SELECT 'source' AS system, COUNT(*) AS row_count FROM hive_metastore.db.table
UNION ALL
SELECT 'target' AS system, COUNT(*) AS row_count FROM migrated.db.table;
```
Provides coverage for Databricks platform migrations.