Configure Databricks local development with dbx, Databricks Connect, and your IDE. Use when setting up a local dev environment, configuring test workflows, or establishing a fast iteration cycle with Databricks. Trigger with phrases like "databricks dev setup", "databricks local", "databricks IDE", "develop with databricks", "databricks connect".
Install from databricks-pack: npx claudepluginhub nickloveinvesting/nick-love-plugins --plugin databricks-pack
Set up a fast, reproducible local development workflow for Databricks.
See databricks-install-auth for installation and authentication setup.

my-databricks-project/
├── src/
│   ├── __init__.py
│   ├── pipelines/
│   │   ├── __init__.py
│   │   ├── bronze.py            # Raw data ingestion
│   │   ├── silver.py            # Data cleansing
│   │   └── gold.py              # Business aggregations
│   └── utils/
│       ├── __init__.py
│       └── helpers.py
├── tests/
│   ├── __init__.py
│   ├── unit/
│   │   └── test_helpers.py
│   └── integration/
│       └── test_pipelines.py
├── notebooks/                   # Databricks notebooks
│   └── exploration.py
├── resources/                   # Asset Bundle configs
│   └── jobs.yml
├── databricks.yml               # Asset Bundle project config
├── .env.local                   # Local secrets (git-ignored)
├── .env.example                 # Template for team
├── pyproject.toml
└── requirements.txt
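.env.example documents the variables the project needs without committing secrets; the values below are placeholders:

# .env.example -- copy to .env.local and fill in real values
DATABRICKS_HOST=https://adb-1234567890.1.azuredatabricks.net
DATABRICKS_TOKEN=dapi-your-token-here
DATABRICKS_CLUSTER_ID=1234-567890-abcde123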
set -euo pipefail
# Install the Databricks SDK and CLI
# (note: pip installs the legacy CLI; the "databricks bundle" commands below
# need the new Databricks CLI v0.205+, installed separately)
pip install databricks-sdk databricks-cli
# Install dbx for deployment (legacy tool; Asset Bundles are its successor)
pip install dbx
# Install Databricks Connect v2 for local Spark
# (pin the version to match your cluster's Databricks Runtime)
pip install databricks-connect==14.3.*
# Install testing tools
pip install pytest pytest-cov
# Configure authentication (Databricks Connect v2 uses unified auth via a
# CLI profile or environment variables; the legacy "databricks-connect configure"
# command no longer applies in v2)
databricks configure
# Or set environment variables
export DATABRICKS_HOST="https://adb-1234567890.1.azuredatabricks.net"
export DATABRICKS_TOKEN="dapi..."
export DATABRICKS_CLUSTER_ID="1234-567890-abcde123"  # example cluster ID
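Databricks Connect can also take the connection details directly on its builder instead of reading environment variables; a minimal smoke-test sketch, with placeholder credentials:

# connect_check.py -- verify the cluster is reachable (host/token/cluster ID are placeholders)
from databricks.connect import DatabricksSession

spark = DatabricksSession.builder.remote(
    host="https://adb-1234567890.1.azuredatabricks.net",
    token="dapi...",
    cluster_id="1234-567890-abcde123",
).getOrCreate()

print(spark.range(5).count())  # prints 5 if the connection works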
# databricks.yml
bundle:
  name: my-databricks-project

workspace:
  host: ${DATABRICKS_HOST}

variables:
  catalog:
    description: Unity Catalog name
    default: main
  schema:
    description: Schema name
    default: default

targets:
  dev:
    default: true
    mode: development
    workspace:
      root_path: /Users/${workspace.current_user.userName}/.bundle/${bundle.name}/dev
  staging:
    mode: development
    workspace:
      root_path: /Shared/.bundle/${bundle.name}/staging
  prod:
    mode: production
    workspace:
      root_path: /Shared/.bundle/${bundle.name}/prod
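The resources/jobs.yml listed in the project tree holds the job definitions the bundle deploys. A minimal sketch; the job name, task, and cluster spec are illustrative:

# resources/jobs.yml
resources:
  jobs:
    my_job:
      name: my-job-${bundle.target}
      tasks:
        - task_key: bronze_ingest
          spark_python_task:
            python_file: ../src/pipelines/bronze.py
          new_cluster:
            spark_version: 14.3.x-scala2.12
            node_type_id: Standard_DS3_v2
            num_workers: 1

Whatever resource key you choose here (my_job above) is the name you pass to databricks bundle run.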
# tests/conftest.py
import pytest
from pyspark.sql import SparkSession


@pytest.fixture(scope="session")
def spark():
    """Create a local SparkSession for unit tests.

    The Delta configs below assume the delta-spark package is installed
    (see its configure_spark_with_delta_pip helper).
    """
    return (
        SparkSession.builder
        .master("local[*]")
        .appName("unit-tests")
        .config("spark.sql.extensions", "io.delta.sql.DeltaSparkSessionExtension")
        .config("spark.sql.catalog.spark_catalog", "org.apache.spark.sql.delta.catalog.DeltaCatalog")
        .getOrCreate()
    )


@pytest.fixture(scope="session")
def dbx_spark():
    """Connect to a Databricks cluster for integration tests."""
    from databricks.connect import DatabricksSession

    return DatabricksSession.builder.getOrCreate()
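Unit tests then receive the fixture by parameter name; a small sketch using throwaway in-memory data, no cluster needed:

# tests/unit/test_helpers.py
def test_drop_duplicates_removes_exact_copies(spark):
    # tiny in-memory DataFrame; runs entirely on local Spark
    df = spark.createDataFrame([("a", 1), ("a", 1), ("b", 2)], ["id", "value"])
    assert df.dropDuplicates().count() == 2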
// .vscode/settings.json
{
  "python.defaultInterpreterPath": "${workspaceFolder}/.venv/bin/python",
  "python.testing.pytestEnabled": true,
  "python.testing.pytestArgs": ["tests"],
  "python.linting.enabled": true,
  "python.linting.pylintEnabled": true,
  "editor.formatOnSave": true,
  "[python]": {
    "editor.defaultFormatter": "ms-python.black-formatter"
  },
  "databricks.python.envFile": "${workspaceFolder}/.env.local"
}
// .vscode/launch.json
{
  "version": "0.2.0",
  "configurations": [
    {
      "name": "Python: Current File (Databricks Connect)",
      "type": "python",
      "request": "launch",
      "program": "${file}",
      "console": "integratedTerminal",
      "env": {
        "DATABRICKS_HOST": "${env:DATABRICKS_HOST}",
        "DATABRICKS_TOKEN": "${env:DATABRICKS_TOKEN}",
        "DATABRICKS_CLUSTER_ID": "${env:DATABRICKS_CLUSTER_ID}"
      }
    }
  ]
}
| Error | Cause | Solution |
|---|---|---|
| Cluster not running | Auto-terminated | Start the cluster first |
| Version mismatch | DBR vs Connect version | Match the databricks-connect version to the cluster's DBR |
| Module not found | Missing local install | Run pip install -e . |
| Connection timeout | Network/firewall | Check VPN and firewall rules |
| SparkSession already exists | Multiple sessions | Use the getOrCreate() pattern |
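For the version-mismatch row, a quick check of the client side (compare the major.minor against the cluster's Databricks Runtime shown in the cluster UI):

# Show the installed Databricks Connect version
pip show databricks-connect | grep -i '^version'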
# Unit tests (local Spark)
pytest tests/unit/ -v
# Integration tests (Databricks Connect)
pytest tests/integration/ -v --tb=short
# With coverage
pytest tests/ --cov=src --cov-report=html
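pytest picks up its defaults from pyproject.toml (already in the project tree); a minimal sketch of that section, with illustrative values:

# pyproject.toml (excerpt)
[tool.pytest.ini_options]
testpaths = ["tests"]
markers = [
    "integration: requires a live Databricks cluster via Databricks Connect",
]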
# Validate bundle
databricks bundle validate
# Deploy to dev
databricks bundle deploy -t dev
# Run a job by its resource key (defined in resources/jobs.yml)
databricks bundle run -t dev my-job
# src/pipelines/bronze.py
from pyspark.sql import SparkSession, DataFrame


def ingest_raw_data(spark: SparkSession, source_path: str) -> DataFrame:
    """Ingest raw data from the source path."""
    return spark.read.format("json").load(source_path)


if __name__ == "__main__":
    # Works locally with Databricks Connect
    from databricks.connect import DatabricksSession

    spark = DatabricksSession.builder.getOrCreate()
    df = ingest_raw_data(spark, "/mnt/raw/events")
    df.show()
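silver.py can follow the same pattern: pure functions over DataFrames that unit tests can feed with local data. A minimal sketch; the column names are illustrative:

# src/pipelines/silver.py
from pyspark.sql import DataFrame
from pyspark.sql import functions as F


def cleanse_events(df: DataFrame) -> DataFrame:
    """Drop null keys and exact duplicates from the bronze layer."""
    return (
        df.where(F.col("event_id").isNotNull())
        .dropDuplicates(["event_id"])
    )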
# Watch for changes and sync (dbx is legacy)
dbx sync --watch
# Or use Asset Bundles (preferred)
databricks bundle sync -t dev --watch
See databricks-sdk-patterns for production-ready code patterns.