Executes Python, Scala, SQL, and R code on Databricks via Databricks Connect, serverless jobs, or interactive clusters. Manages clusters and SQL warehouses: create, resize, start, terminate, delete.
npx claudepluginhub databricks-solutions/ai-dev-kit --plugin databricks-ai-dev-kit
How this skill is triggered (by the user, by Claude, or both):
Slash command: /databricks-ai-dev-kit:databricks-execution-compute
The summary Claude sees in its skill listing (used to decide when to auto-load this skill):
Run code on Databricks. Three execution modes—choose based on workload.
The three execution modes compare as follows:
| Aspect | Databricks Connect ⭐ | Serverless Job | Interactive Cluster |
|---|---|---|---|
| Use for | Spark code (ETL, data gen) | Heavy processing (ML) | State across tool calls, Scala/R |
| Startup | Instant | ~25-50 s cold start | ~5 min if stopped |
| State | Within Python process | None | Via context_id |
| Languages | Python (PySpark) | Python, SQL | Python, Scala, SQL, R |
| Dependencies | withDependencies() | CLI with environments spec | Install on cluster |
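The `withDependencies()` entry in the Dependencies row could look roughly like the sketch below. It assumes a recent `databricks-connect` release that exposes `DatabricksEnv` and `DatabricksSession.builder.withEnvironment`. The package pin and the `make_session` helper are illustrative names, not part of this skill.

```python
# Illustrative pin only; replace it with a package your UDFs actually import.
UDF_PACKAGE = "faker==25.0.0"


def make_session(package: str = UDF_PACKAGE):
    """Create a Databricks Connect session whose UDFs can import `package`.

    Hypothetical helper: assumes a databricks-connect version that
    provides DatabricksEnv and builder.withEnvironment().
    """
    # Imported lazily so this module still loads on machines where
    # databricks-connect is not installed yet.
    from databricks.connect import DatabricksEnv, DatabricksSession

    # Declare the dependency, then attach the environment to the session.
    env = DatabricksEnv().withDependencies(package)
    return DatabricksSession.builder.withEnvironment(env).getOrCreate()
```

Calling `make_session()` needs working Databricks authentication, so treat this as a starting point and check the reference file for the chosen mode before relying on it.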
Spark-based code? → Databricks Connect (fastest)
  └─ Python 3.12 missing? → Install it + databricks-connect
  └─ Install fails? → Ask user (don't auto-switch modes)
Heavy/long-running (ML)? → Serverless Job (independent)
Need state across calls? → Interactive Cluster (list and ask which one to use)
Scala/R? → Interactive Cluster (list and ask which one to use)
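The decision tree above can be sketched as a small function. This is one possible reading, not the skill's own logic: the `Workload` fields and the mode names are hypothetical, and checking the language first (because the comparison table lists Scala and R only under Interactive Cluster) is an assumption about precedence.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Workload:
    """Hypothetical description of the code to run."""
    uses_spark: bool = False
    heavy_or_long_running: bool = False
    needs_state_across_calls: bool = False
    language: str = "python"


def choose_mode(w: Workload) -> Optional[str]:
    """Return an execution mode, or None when the tree gives no answer."""
    # Assumption: Scala/R is treated as a hard constraint and checked first.
    if w.language.lower() in {"scala", "r"}:
        return "interactive_cluster"
    # The remaining checks follow the order of the tree above.
    if w.uses_spark:
        return "databricks_connect"
    if w.heavy_or_long_running:
        return "serverless_job"
    if w.needs_state_across_calls:
        return "interactive_cluster"
    return None  # No rule matched: ask the user rather than guess.


print(choose_mode(Workload(uses_spark=True)))
print(choose_mode(Workload(language="scala", uses_spark=True)))
```

Returning `None` instead of a default mirrors the "ask user, don't auto-switch modes" rule in the tree.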
Read the reference file for your chosen mode before proceeding.
Databricks Connect (run the script locally):
python my_spark_script.py
Serverless Job:
execute_code(file_path="/path/to/script.py")
Interactive Cluster:
# Check for running clusters first (or use the one the user specified)
list_compute(resource="clusters")
# Ask the user which cluster to use
# Run the code, then reuse context_id in follow-up MCP calls to keep state
result = execute_code(code="...", compute_type="cluster", cluster_id="...")
execute_code(code="...", context_id=result["context_id"], cluster_id=result["cluster_id"])
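The context reuse shown above can be wrapped in a small helper so follow-up calls pick up the `context_id` automatically. This is a sketch under assumptions: `ClusterSession` and `fake_execute_code` are hypothetical names, and the fake stands in for the real `execute_code` tool, whose response is assumed to carry `context_id` and `cluster_id` keys as in the snippet above.

```python
from typing import Any, Callable, Dict, Optional


class ClusterSession:
    """Threads context_id through successive execute_code calls."""

    def __init__(self, execute_code: Callable[..., Dict[str, Any]], cluster_id: str):
        self._execute_code = execute_code
        self.cluster_id = cluster_id
        self.context_id: Optional[str] = None

    def run(self, code: str) -> Dict[str, Any]:
        kwargs: Dict[str, Any] = {"code": code, "cluster_id": self.cluster_id}
        if self.context_id is None:
            # First call: target the cluster and let it create a context.
            kwargs["compute_type"] = "cluster"
        else:
            # Follow-up call: reuse the context so earlier state is visible.
            kwargs["context_id"] = self.context_id
        result = self._execute_code(**kwargs)
        self.context_id = result["context_id"]
        self.cluster_id = result["cluster_id"]
        return result


def fake_execute_code(**kwargs: Any) -> Dict[str, Any]:
    """Stand-in for the real tool; echoes its arguments for inspection."""
    return {
        "context_id": kwargs.get("context_id", "ctx-1"),
        "cluster_id": kwargs["cluster_id"],
        "received": kwargs,
    }


session = ClusterSession(fake_execute_code, cluster_id="cluster-abc")
first = session.run("x = 1")
second = session.run("print(x)")
print(second["received"]["context_id"])
```

Passing the tool in as a callable keeps the helper independent of how `execute_code` is actually invoked.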
| Tool | For | Purpose |
|---|---|---|
| execute_code | Serverless, Interactive | Run code remotely |
| list_compute | Interactive | List clusters, check status, auto-select running cluster |
| manage_cluster | Interactive | Create, start, terminate, delete. COSTLY: start takes 3-8 min, so ask the user first |
| manage_sql_warehouse | SQL | Create, modify, delete SQL warehouses |
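The "ask user" rule for costly operations in the table above could be enforced with a small gate like the one below. It is only a sketch: the `action` argument, the `COSTLY_ACTIONS` set, and `guarded_call` are hypothetical, and only the `manage_cluster` start operation is listed because that is the one the table flags.

```python
from typing import Any, Callable, Optional

# Assumption: only cluster start is flagged as costly in the tool table.
COSTLY_ACTIONS = {("manage_cluster", "start")}


def requires_user_confirmation(tool: str, action: str) -> bool:
    """True when the tool/action pair should be confirmed by the user."""
    return (tool, action) in COSTLY_ACTIONS


def guarded_call(
    tool: str,
    action: str,
    call: Callable[[], Any],
    confirm: Callable[[str], bool],
) -> Optional[Any]:
    """Run `call` unless the action is costly and the user declines."""
    if requires_user_confirmation(tool, action):
        if not confirm(f"{tool} {action} can take several minutes. Proceed?"):
            return None  # User declined: do nothing.
    return call()


# Example with stand-in callables instead of real tool invocations.
outcome = guarded_call(
    "manage_cluster", "start",
    call=lambda: "started",
    confirm=lambda message: True,
)
print(outcome)
```

Cheap calls such as `list_compute` pass straight through, so the gate adds friction only where the table asks for it.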