Create a minimal working Databricks example with cluster and notebook. Use when starting a new Databricks project, testing your setup, or learning basic Databricks patterns. Trigger with phrases like "databricks hello world", "databricks example", "databricks quick start", "first databricks notebook", "create cluster".
From databricks-pack. Install with: npx claudepluginhub nickloveinvesting/nick-love-plugins --plugin databricks-pack
Create your first Databricks cluster and notebook to verify setup.
Prerequisite: complete the databricks-install-auth setup first.

# Create a small development cluster via CLI
databricks clusters create --json '{
  "cluster_name": "hello-world-cluster",
  "spark_version": "14.3.x-scala2.12",
  "node_type_id": "Standard_DS3_v2",
  "autotermination_minutes": 30,
  "num_workers": 0,
  "spark_conf": {
    "spark.databricks.cluster.profile": "singleNode",
    "spark.master": "local[*]"
  },
  "custom_tags": {
    "ResourceClass": "SingleNode"
  }
}'
# hello_world.py - upload as notebook
import base64

from databricks.sdk import WorkspaceClient
from databricks.sdk.service.workspace import ImportFormat, Language

w = WorkspaceClient()
# Create notebook content
notebook_content = """# Databricks notebook source
# Databricks Hello World

# COMMAND ----------
# Simple DataFrame operations
data = [("Alice", 28), ("Bob", 35), ("Charlie", 42)]
df = spark.createDataFrame(data, ["name", "age"])
display(df)
# COMMAND ----------
# Delta Lake example
df.write.format("delta").mode("overwrite").save("/tmp/hello_world_delta")
# COMMAND ----------
# Read it back
df_read = spark.read.format("delta").load("/tmp/hello_world_delta")
display(df_read)
# COMMAND ----------
print("Hello from Databricks!")
"""
# Import the source as a Python notebook (content must be base64-encoded)
w.workspace.import_(
    path="/Users/your-email/hello_world",
    format=ImportFormat.SOURCE,
    language=Language.PYTHON,
    content=base64.b64encode(notebook_content.encode()).decode(),
    overwrite=True,
)
print("Notebook created!")
from databricks.sdk import WorkspaceClient
from databricks.sdk.service.jobs import NotebookTask, SubmitTask

w = WorkspaceClient()
# Submit a one-time run and wait for it to finish
run = w.jobs.submit(
    run_name="hello-world-run",
    tasks=[
        SubmitTask(
            task_key="hello",
            existing_cluster_id="your-cluster-id",
            notebook_task=NotebookTask(
                notebook_path="/Users/your-email/hello_world"
            ),
        )
    ],
).result()  # blocks until the run reaches a terminal state
print(f"Run completed with state: {run.state.result_state}")
# List clusters
databricks clusters list
# Get cluster status
databricks clusters get --cluster-id your-cluster-id
# List workspace contents
databricks workspace list /Users/your-email/
# Get run output
databricks runs get-output --run-id your-run-id
/tmp/hello_world_delta

| Error | Cause | Solution |
|---|---|---|
| Cluster quota exceeded | Workspace limits | Terminate unused clusters |
| Invalid node type | Wrong instance type | Check available node types |
| Notebook path exists | Duplicate path | Use overwrite=True |
| Cluster pending | Startup in progress | Wait for RUNNING state |
| Permission denied | Insufficient privileges | Request workspace admin access |
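For the "Cluster pending" case, a short poll with the SDK avoids submitting work before the cluster is ready. A minimal sketch, assuming the cluster id from the create step; the 15-second interval is an arbitrary choice.

import time

from databricks.sdk import WorkspaceClient
from databricks.sdk.service.compute import State

w = WorkspaceClient()

# Poll until the cluster leaves PENDING (replace with your cluster id)
cluster_id = "your-cluster-id"
while w.clusters.get(cluster_id).state in (State.PENDING, State.RESTARTING):
    time.sleep(15)
print(f"Cluster state: {w.clusters.get(cluster_id).state}")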
from databricks.sdk import WorkspaceClient

w = WorkspaceClient()
# Create single-node cluster for development
cluster = w.clusters.create_and_wait(
    cluster_name="dev-cluster",
    spark_version="14.3.x-scala2.12",
    node_type_id="Standard_DS3_v2",
    num_workers=0,
    autotermination_minutes=30,
    spark_conf={
        "spark.databricks.cluster.profile": "singleNode",
        "spark.master": "local[*]",
    },
    custom_tags={"ResourceClass": "SingleNode"},  # required alongside the singleNode profile
)
print(f"Cluster created: {cluster.cluster_id}")
from databricks.sdk import WorkspaceClient
from databricks.sdk.service.sql import CreateWarehouseRequestWarehouseType

w = WorkspaceClient()

# Create SQL warehouse for queries
warehouse = w.warehouses.create_and_wait(
    name="hello-warehouse",
    cluster_size="2X-Small",
    auto_stop_mins=15,
    warehouse_type=CreateWarehouseRequestWarehouseType.PRO,
    enable_serverless_compute=True,
)
print(f"Warehouse created: {warehouse.id}")
# Run in notebook or Databricks Connect
from pyspark.sql import SparkSession
spark = SparkSession.builder.getOrCreate()
# Create sample data
df = spark.range(1000).toDF("id")  # 1,000 rows with a single id column
df = df.withColumn("value", df.id * 2)
# Show results
df.show(5)
print(f"Row count: {df.count()}")
Proceed to databricks-local-dev-loop for local development setup.