Databricks Job activity and 2025 Azure Data Factory connectors
/plugin marketplace add JosiahSiegel/claude-code-marketplace
/plugin install adf-master@claude-plugin-marketplace

This skill inherits all available tools. When active, it can use any tool Claude has access to.
MANDATORY: Always Use Backslashes on Windows for File Paths
When using Edit or Write tools on Windows, you MUST use backslashes (\) in file paths, NOT forward slashes (/).
Examples:
❌ Wrong: D:/repos/project/file.tsx
✅ Correct: D:\repos\project\file.tsx

This applies to all file paths passed to these tools.
NEVER create new documentation files unless explicitly requested by the user.
🚨 CRITICAL UPDATE (2025): The Databricks Job activity is now the ONLY recommended method for orchestrating Databricks in ADF. Microsoft strongly recommends migrating from legacy Notebook, Python, and JAR activities.
Old Pattern (Notebook Activity - ❌ LEGACY):
{
"name": "RunNotebook",
"type": "DatabricksNotebook", // ❌ DEPRECATED - Migrate to DatabricksJob
"linkedServiceName": { "referenceName": "DatabricksLinkedService" },
"typeProperties": {
"notebookPath": "/Users/user@example.com/MyNotebook",
"baseParameters": { "param1": "value1" }
}
}
New Pattern (Databricks Job Activity - ✅ CURRENT 2025):
{
"name": "RunDatabricksWorkflow",
"type": "DatabricksJob", // ✅ CORRECT activity type (NOT DatabricksSparkJob)
"linkedServiceName": { "referenceName": "DatabricksLinkedService" },
"typeProperties": {
"jobId": "123456", // Reference existing Databricks Workflow Job
"jobParameters": { // Pass parameters to the Job
"param1": "value1",
"runDate": "@pipeline().parameters.ProcessingDate"
}
},
"policy": {
"timeout": "0.12:00:00",
"retry": 2,
"retryIntervalInSeconds": 30
}
}
Key benefits of the Databricks Job activity:
- Serverless execution by default
- Advanced workflow features
- Centralized job management
- Better orchestration
- Improved reliability
- Cost optimization
# In Databricks workspace
# Create Job with tasks
{
"name": "Data Processing Job",
"tasks": [
{
"task_key": "ingest",
"notebook_task": {
"notebook_path": "/Notebooks/Ingest",
"base_parameters": {}
},
"job_cluster_key": "small_cluster"
},
{
"task_key": "transform",
"depends_on": [{ "task_key": "ingest" }],
"notebook_task": {
"notebook_path": "/Notebooks/Transform"
},
"job_cluster_key": "medium_cluster"
},
{
"task_key": "load",
"depends_on": [{ "task_key": "transform" }],
"notebook_task": {
"notebook_path": "/Notebooks/Load"
},
"job_cluster_key": "small_cluster"
}
],
"job_clusters": [
{
"job_cluster_key": "small_cluster",
"new_cluster": {
"spark_version": "13.3.x-scala2.12",
"node_type_id": "Standard_DS3_v2",
"num_workers": 2
}
},
{
"job_cluster_key": "medium_cluster",
"new_cluster": {
"spark_version": "13.3.x-scala2.12",
"node_type_id": "Standard_DS4_v2",
"num_workers": 8
}
}
]
}
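Note: the job definition above pins explicit job clusters. Where serverless jobs are enabled for the workspace, the job_cluster_key references and the job_clusters block can be omitted so tasks run on serverless compute, which is what the ADF integration assumes by default.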
# Get Job ID after creation
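The Job ID is visible in the Databricks UI under Workflows, or can be read from the Jobs API. A sketch of the relevant fragment of a GET /api/2.1/jobs/list response (values are illustrative):

```json
{
  "jobs": [
    {
      "job_id": 123456,
      "settings": { "name": "Data Processing Job" }
    }
  ]
}
```

This job_id is what the jobId property in the ADF pipeline below refers to.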
{
"name": "PL_Databricks_Serverless_Workflow",
"properties": {
"activities": [
{
"name": "ExecuteDatabricksWorkflow",
"type": "DatabricksJob", // ✅ Correct activity type
"dependsOn": [],
"policy": {
"timeout": "0.12:00:00",
"retry": 2,
"retryIntervalInSeconds": 30
},
"typeProperties": {
"jobId": "123456", // Databricks Job ID from workspace
"jobParameters": { // ⚠️ Use jobParameters (not parameters)
"input_path": "/mnt/data/input",
"output_path": "/mnt/data/output",
"run_date": "@pipeline().parameters.runDate",
"environment": "@pipeline().parameters.environment"
}
},
"linkedServiceName": {
"referenceName": "DatabricksLinkedService_Serverless",
"type": "LinkedServiceReference"
}
},
{
"name": "LogJobExecution",
"type": "WebActivity",
"dependsOn": [
{
"activity": "ExecuteDatabricksWorkflow",
"dependencyConditions": ["Succeeded"]
}
],
"typeProperties": {
"url": "@pipeline().parameters.LoggingEndpoint",
"method": "POST",
"body": {
"jobId": "123456",
"runId": "@activity('ExecuteDatabricksWorkflow').output.runId",
"status": "Succeeded",
"duration": "@activity('ExecuteDatabricksWorkflow').output.executionDuration"
}
}
}
],
"parameters": {
"runDate": {
"type": "string",
"defaultValue": "@utcnow()"
},
"environment": {
"type": "string",
"defaultValue": "production"
},
"LoggingEndpoint": {
"type": "string"
}
}
}
}
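Hard-coding the Job ID ties the pipeline to one workspace and environment. A sketch of parameterizing it instead (DatabricksJobId is a hypothetical pipeline parameter):

```json
"typeProperties": {
  "jobId": "@pipeline().parameters.DatabricksJobId",
  "jobParameters": {
    "run_date": "@pipeline().parameters.runDate"
  }
}
```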
✅ RECOMMENDED: Serverless Linked Service (No Cluster Configuration)
{
"name": "DatabricksLinkedService_Serverless",
"type": "Microsoft.DataFactory/factories/linkedservices",
"properties": {
"type": "AzureDatabricks",
"typeProperties": {
"domain": "https://adb-123456789.azuredatabricks.net",
"authentication": "MSI" // ✅ Managed Identity (recommended 2025)
// ⚠️ NO existingClusterId or newClusterNodeType needed for serverless!
// The Databricks Job activity automatically uses serverless compute
}
}
}
Alternative: Access Token Authentication
{
"name": "DatabricksLinkedService_Token",
"type": "Microsoft.DataFactory/factories/linkedservices",
"properties": {
"type": "AzureDatabricks",
"typeProperties": {
"domain": "https://adb-123456789.azuredatabricks.net",
"accessToken": {
"type": "AzureKeyVaultSecret",
"store": {
"referenceName": "AzureKeyVault",
"type": "LinkedServiceReference"
},
"secretName": "databricks-access-token"
}
}
}
}
🚨 CRITICAL: For Databricks Job activity, DO NOT specify cluster properties in the linked service. The job configuration in Databricks workspace controls compute resources.
🚨 CRITICAL: The ServiceNow V1 connector has reached End of Support. Migrate to V2 immediately!
Key Features of V2:
- Native sysparm_query support for server-side filtering (see the example below)
- OAuth2 authentication in addition to Basic
- Improved performance and reliability over V1
Copy Activity Example:
{
"name": "CopyFromServiceNowV2",
"type": "Copy",
"inputs": [
{
"referenceName": "ServiceNowV2Source",
"type": "DatasetReference"
}
],
"outputs": [
{
"referenceName": "AzureSqlSink",
"type": "DatasetReference"
}
],
"typeProperties": {
"source": {
"type": "ServiceNowV2Source",
"query": "sysparm_query=active=true^priority=1^sys_created_on>=javascript:gs.dateGenerate('2025-01-01')",
"httpRequestTimeout": "00:01:40" // 100 seconds
},
"sink": {
"type": "AzureSqlSink",
"writeBehavior": "upsert",
"upsertSettings": {
"useTempDB": true,
"keys": ["sys_id"]
}
},
"enableStaging": true,
"stagingSettings": {
"linkedServiceName": {
"referenceName": "AzureBlobStorage",
"type": "LinkedServiceReference"
}
}
}
}
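The ServiceNowV2Source dataset referenced above points at a single ServiceNow table. A sketch of what it might look like (the incident table is illustrative):

```json
{
  "name": "ServiceNowV2Source",
  "properties": {
    "type": "ServiceNowV2Object",
    "linkedServiceName": {
      "referenceName": "ServiceNowV2LinkedService",
      "type": "LinkedServiceReference"
    },
    "typeProperties": {
      "tableName": "incident"
    }
  }
}
```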
Linked Service (OAuth2 - Recommended):
{
"name": "ServiceNowV2LinkedService",
"type": "Microsoft.DataFactory/factories/linkedservices",
"properties": {
"type": "ServiceNowV2",
"typeProperties": {
"endpoint": "https://dev12345.service-now.com",
"authenticationType": "OAuth2",
"clientId": "your-oauth-client-id",
"clientSecret": {
"type": "AzureKeyVaultSecret",
"store": {
"referenceName": "AzureKeyVault",
"type": "LinkedServiceReference"
},
"secretName": "servicenow-client-secret"
},
"username": "service-account@company.com",
"password": {
"type": "AzureKeyVaultSecret",
"store": {
"referenceName": "AzureKeyVault",
"type": "LinkedServiceReference"
},
"secretName": "servicenow-password"
},
"grantType": "password"
}
}
}
Linked Service (Basic Authentication - Legacy):
{
"name": "ServiceNowV2LinkedService_Basic",
"type": "Microsoft.DataFactory/factories/linkedservices",
"properties": {
"type": "ServiceNowV2",
"typeProperties": {
"endpoint": "https://dev12345.service-now.com",
"authenticationType": "Basic",
"username": "admin",
"password": {
"type": "AzureKeyVaultSecret",
"store": {
"referenceName": "AzureKeyVault",
"type": "LinkedServiceReference"
},
"secretName": "servicenow-password"
}
}
}
}
Migration from V1 to V2:
- Change the linked service type from ServiceNow to ServiceNowV2
- Change the copy source type from ServiceNowSource to ServiceNowV2Source
PostgreSQL connector - improved performance and features:
{
"name": "PostgreSQLLinkedService",
"type": "PostgreSql",
"typeProperties": {
"connectionString": "host=myserver.postgres.database.azure.com;port=5432;database=mydb;uid=myuser",
"password": {
"type": "AzureKeyVaultSecret",
"store": { "referenceName": "KeyVault" },
"secretName": "postgres-password"
},
// 2025 enhancement
"enableSsl": true,
"sslMode": "Require"
}
}
🆕 Native support for Microsoft Fabric Warehouse (Q3 2024+)
Supported Activities:
- Copy activity (source and sink)
- Lookup activity
- GetMetadata activity
- Script activity (see the sketch after the copy example below)
- Stored procedure activity
Linked Service Configuration:
{
"name": "FabricWarehouseLinkedService",
"type": "Microsoft.DataFactory/factories/linkedservices",
"properties": {
"type": "Warehouse", // ✅ NEW dedicated Fabric Warehouse type
"typeProperties": {
"endpoint": "myworkspace.datawarehouse.fabric.microsoft.com",
"warehouse": "MyWarehouse",
"authenticationType": "ServicePrincipal", // Recommended
"servicePrincipalId": "<app-registration-id>",
"servicePrincipalKey": {
"type": "AzureKeyVaultSecret",
"store": {
"referenceName": "AzureKeyVault",
"type": "LinkedServiceReference"
},
"secretName": "fabric-warehouse-sp-key"
},
"tenant": "<tenant-id>"
}
}
}
Alternative: Managed Identity Authentication (Preferred)
{
"name": "FabricWarehouseLinkedService_ManagedIdentity",
"type": "Microsoft.DataFactory/factories/linkedservices",
"properties": {
"type": "Warehouse",
"typeProperties": {
"endpoint": "myworkspace.datawarehouse.fabric.microsoft.com",
"warehouse": "MyWarehouse",
"authenticationType": "SystemAssignedManagedIdentity"
}
}
}
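Datasets over either linked service use the dedicated Warehouse table type. A sketch (schema and table names are illustrative):

```json
{
  "name": "FabricWarehouseSink",
  "properties": {
    "type": "WarehouseTable",
    "linkedServiceName": {
      "referenceName": "FabricWarehouseLinkedService",
      "type": "LinkedServiceReference"
    },
    "typeProperties": {
      "schema": "dbo",
      "table": "customers"
    }
  }
}
```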
Copy Activity Example:
{
"name": "CopyToFabricWarehouse",
"type": "Copy",
"inputs": [
{
"referenceName": "AzureSqlSource",
"type": "DatasetReference"
}
],
"outputs": [
{
"referenceName": "FabricWarehouseSink",
"type": "DatasetReference"
}
],
"typeProperties": {
"source": {
"type": "AzureSqlSource"
},
"sink": {
"type": "WarehouseSink",
"writeBehavior": "insert", // or "upsert"
"writeBatchSize": 10000,
"tableOption": "autoCreate" // Auto-create table if not exists
},
"enableStaging": true, // Recommended for large data
"stagingSettings": {
"linkedServiceName": {
"referenceName": "AzureBlobStorage",
"type": "LinkedServiceReference"
},
"path": "staging/fabric-warehouse"
},
"translator": {
"type": "TabularTranslator",
"mappings": [
{
"source": { "name": "CustomerID" },
"sink": { "name": "customer_id" }
}
]
}
}
}
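Copy is not the only option: the same linked service can back a Script activity that runs T-SQL directly in the Warehouse. A sketch (the statement is illustrative):

```json
{
  "name": "RunWarehouseScript",
  "type": "Script",
  "linkedServiceName": {
    "referenceName": "FabricWarehouseLinkedService",
    "type": "LinkedServiceReference"
  },
  "typeProperties": {
    "scripts": [
      {
        "type": "NonQuery",
        "text": "TRUNCATE TABLE dbo.staging_customers"
      }
    ]
  }
}
```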
Best Practices for Fabric Warehouse:
- Enable staging for large copies (as in the example above)
- Use tableOption: autoCreate for dynamic schema creation

Snowflake connector - improved performance:
{
"name": "SnowflakeLinkedService",
"type": "Snowflake",
"typeProperties": {
"connectionString": "jdbc:snowflake://myaccount.snowflakecomputing.com",
"database": "mydb",
"warehouse": "mywarehouse",
"authenticationType": "KeyPair",
"username": "myuser",
"privateKey": {
"type": "AzureKeyVaultSecret",
"store": { "referenceName": "KeyVault" },
"secretName": "snowflake-private-key"
},
"privateKeyPassphrase": {
"type": "AzureKeyVaultSecret",
"store": { "referenceName": "KeyVault" },
"secretName": "snowflake-passphrase"
}
}
}
Azure Table Storage now supports system-assigned and user-assigned managed identities:
{
"name": "AzureTableStorageLinkedService",
"type": "AzureTableStorage",
"typeProperties": {
"serviceEndpoint": "https://mystorageaccount.table.core.windows.net",
"authenticationType": "ManagedIdentity" // New in 2025
// Or user-assigned:
// "credential": {
// "referenceName": "UserAssignedManagedIdentity"
// }
}
}
Azure Files now supports managed identity authentication:
{
"name": "AzureFilesLinkedService",
"type": "AzureFileStorage",
"typeProperties": {
"fileShare": "myshare",
"accountName": "mystorageaccount",
"authenticationType": "ManagedIdentity" // New in 2025
}
}
Spark 3.3 now powers Mapping Data Flows, bringing performance improvements and new features:
{
"name": "DataFlow1",
"type": "MappingDataFlow",
"typeProperties": {
"sources": [
{
"dataset": { "referenceName": "SourceDataset" }
}
],
"transformations": [
{
"name": "Transform1"
}
],
"sinks": [
{
"dataset": { "referenceName": "SinkDataset" }
}
]
}
}
Git integration now supports on-premises Azure DevOps Server 2022:
{
"name": "DataFactory",
"properties": {
"repoConfiguration": {
"type": "AzureDevOpsGit",
"accountName": "on-prem-ado-server",
"projectName": "MyProject",
"repositoryName": "adf-repo",
"collaborationBranch": "main",
"rootFolder": "/",
"hostName": "https://ado-server.company.com" // On-premises server
}
}
}
System-Assigned Managed Identity:
{
"type": "AzureBlobStorage",
"typeProperties": {
"serviceEndpoint": "https://mystorageaccount.blob.core.windows.net",
"accountKind": "StorageV2"
// ✅ Uses Data Factory's system-assigned identity automatically
}
}
User-Assigned Managed Identity (NEW 2025):
{
"type": "AzureBlobStorage",
"typeProperties": {
"serviceEndpoint": "https://mystorageaccount.blob.core.windows.net",
"accountKind": "StorageV2",
"credential": {
"referenceName": "UserAssignedManagedIdentityCredential",
"type": "CredentialReference"
}
}
}
When to Use User-Assigned:
- Sharing one identity (and its role assignments) across multiple factories or services
- Keeping the identity's lifecycle independent of the factory, so permissions survive factory recreation
- Pre-provisioning access before the factory exists
Credential Consolidation (NEW 2025):
ADF now supports a centralized Credentials feature:
{
"name": "ManagedIdentityCredential",
"type": "Microsoft.DataFactory/factories/credentials",
"properties": {
"type": "ManagedIdentity",
"typeProperties": {
"resourceId": "/subscriptions/{sub}/resourceGroups/{rg}/providers/Microsoft.ManagedIdentity/userAssignedIdentities/{identity-name}"
}
}
}
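Linked services then reference the credential by name instead of embedding identity details. A sketch, mirroring the blob example above:

```json
{
  "type": "AzureBlobStorage",
  "typeProperties": {
    "serviceEndpoint": "https://mystorageaccount.blob.core.windows.net",
    "credential": {
      "referenceName": "ManagedIdentityCredential",
      "type": "CredentialReference"
    }
  }
}
```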
Benefits:
- Define a credential once and reuse it across many linked services
- Rotate or re-point the identity in one place
- Cleaner separation between identity and connector configuration
🚨 IMPORTANT: Azure requires MFA for all users by October 2025
Impact on ADF: linked services that sign in with a user account and password cannot satisfy interactive MFA prompts, so they will fail once enforcement begins. Move these connections to managed identity or service principal authentication.
Best Practice:
{
"type": "AzureSqlDatabase",
"typeProperties": {
"server": "myserver.database.windows.net",
"database": "mydb",
"authenticationType": "SystemAssignedManagedIdentity"
// ✅ No MFA needed, no secret rotation, passwordless
}
}
Storage Blob Data Roles:
- Storage Blob Data Reader - read-only access (source)
- Storage Blob Data Contributor - read/write access (sink)
- Avoid Storage Blob Data Owner unless needed
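Granting one of these roles to the factory's managed identity can be captured as an ARM role-assignment resource. A sketch (scope and principalId are placeholders; the roleDefinitionId GUID shown is the built-in Storage Blob Data Reader role):

```json
{
  "type": "Microsoft.Authorization/roleAssignments",
  "apiVersion": "2022-04-01",
  "name": "[guid(resourceGroup().id, 'adf-blob-reader')]",
  "properties": {
    "roleDefinitionId": "[subscriptionResourceId('Microsoft.Authorization/roleDefinitions', '2a2b9908-6ea1-4ae2-8e65-a410df84e7d1')]",
    "principalId": "<data-factory-managed-identity-object-id>",
    "principalType": "ServicePrincipal"
  }
}
```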
SQL Database Roles:
-- Create contained database user for managed identity
CREATE USER [datafactory-name] FROM EXTERNAL PROVIDER;
-- Grant minimal required permissions
ALTER ROLE db_datareader ADD MEMBER [datafactory-name];
ALTER ROLE db_datawriter ADD MEMBER [datafactory-name];
-- ❌ Avoid db_owner unless truly needed
Key Vault Access Policies:
{
"permissions": {
"secrets": ["Get"] // ✅ Only Get permission needed
// ❌ Don't grant List, Set, Delete unless required
}
}
2025 best-practice checklist:
- Use the Databricks Job activity (mandatory)
- Use managed identity authentication (mandatory for 2025)
- Monitor job execution via the activity output (runId, status, duration)
- Optimize Mapping Data Flows for Spark 3.3