Patterns and best practices for using Lakebase Provisioned (Databricks managed PostgreSQL) for OLTP workloads.
Reference files: `connection-patterns.md`, `reverse-etl.md`
Use this skill when creating or connecting to Lakebase Provisioned instances, generating OAuth database credentials, setting up reverse ETL syncs, or integrating Lakebase with Databricks Apps.
Lakebase Provisioned is Databricks' managed PostgreSQL database service for OLTP (Online Transaction Processing) workloads. It provides a fully managed PostgreSQL-compatible database that integrates with Unity Catalog and supports OAuth token-based authentication.
| Feature | Description |
|---|---|
| Managed PostgreSQL | Fully managed instances with automatic provisioning |
| OAuth Authentication | Token-based auth via Databricks SDK (1-hour expiry) |
| Unity Catalog | Register databases for governance |
| Reverse ETL | Sync data from Delta tables to PostgreSQL |
| Apps Integration | First-class support in Databricks Apps |
Available Regions (AWS): us-east-1, us-east-2, us-west-2, eu-central-1, eu-west-1, ap-south-1, ap-southeast-1, ap-southeast-2
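The region list above can be encoded as a fail-fast guard before provisioning. The set below reflects the list at the time of writing (verify against current Databricks docs), and the helper name is illustrative:

```python
# AWS regions where Lakebase Provisioned is available (per the list above)
SUPPORTED_AWS_REGIONS = {
    "us-east-1", "us-east-2", "us-west-2",
    "eu-central-1", "eu-west-1",
    "ap-south-1", "ap-southeast-1", "ap-southeast-2",
}

def assert_region_supported(region: str) -> None:
    """Fail fast before attempting to provision in an unsupported region."""
    if region not in SUPPORTED_AWS_REGIONS:
        raise ValueError(
            f"Lakebase Provisioned is not available in {region}; "
            f"choose one of {sorted(SUPPORTED_AWS_REGIONS)}"
        )
```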
Create and connect to a Lakebase Provisioned instance:
```python
from databricks.sdk import WorkspaceClient

# Initialize client
w = WorkspaceClient()

# Create a database instance
instance = w.database.create_database_instance(
    name="my-lakebase-instance",
    capacity="CU_1",  # CU_1, CU_2, CU_4, CU_8
    stopped=False
)

print(f"Instance created: {instance.name}")
print(f"DNS endpoint: {instance.read_write_dns}")
```
```python
from databricks.sdk import WorkspaceClient
import uuid

w = WorkspaceClient()

# Generate an OAuth token for database connections (expires after 1 hour)
cred = w.database.generate_database_credential(
    request_id=str(uuid.uuid4()),
    instance_names=["my-lakebase-instance"]
)
token = cred.token  # Use this as the password in the connection string
```
```python
import psycopg
from databricks.sdk import WorkspaceClient
import uuid

# Get instance details
w = WorkspaceClient()
instance = w.database.get_database_instance(name="my-lakebase-instance")

# Generate token
cred = w.database.generate_database_credential(
    request_id=str(uuid.uuid4()),
    instance_names=["my-lakebase-instance"]
)

# Connect using psycopg3
conn_string = (
    f"host={instance.read_write_dns} dbname=postgres "
    f"user={w.current_user.me().user_name} password={cred.token} sslmode=require"
)
with psycopg.connect(conn_string) as conn:
    with conn.cursor() as cur:
        cur.execute("SELECT version()")
        print(cur.fetchone())
```
For long-running applications, tokens must be refreshed periodically (they expire after 1 hour):
```python
import asyncio
import uuid

from sqlalchemy import event
from sqlalchemy.ext.asyncio import AsyncEngine, create_async_engine
from databricks.sdk import WorkspaceClient

# Token refresh state
_current_token = None
_token_refresh_task = None
TOKEN_REFRESH_INTERVAL = 50 * 60  # 50 minutes (before the 1-hour expiry)

def _generate_token(instance_name: str) -> str:
    """Generate a fresh OAuth token."""
    w = WorkspaceClient()
    cred = w.database.generate_database_credential(
        request_id=str(uuid.uuid4()),
        instance_names=[instance_name]
    )
    return cred.token

async def _token_refresh_loop(instance_name: str):
    """Background task to refresh the token every 50 minutes."""
    global _current_token
    while True:
        await asyncio.sleep(TOKEN_REFRESH_INTERVAL)
        _current_token = await asyncio.to_thread(_generate_token, instance_name)

def init_database(instance_name: str, database_name: str, username: str) -> AsyncEngine:
    """Initialize the database engine with OAuth token injection."""
    global _current_token, _token_refresh_task
    w = WorkspaceClient()
    instance = w.database.get_database_instance(name=instance_name)

    # Generate the initial token
    _current_token = _generate_token(instance_name)

    # Build URL (password injected via do_connect)
    url = f"postgresql+psycopg://{username}@{instance.read_write_dns}:5432/{database_name}"

    engine = create_async_engine(
        url,
        pool_size=5,
        max_overflow=10,
        pool_recycle=3600,
        connect_args={"sslmode": "require"}
    )

    # Inject the current token on each new connection
    @event.listens_for(engine.sync_engine, "do_connect")
    def provide_token(dialect, conn_rec, cargs, cparams):
        cparams["password"] = _current_token

    # Start the background refresh loop. Call init_database from an async
    # context (e.g. a FastAPI startup hook) so a running event loop exists.
    _token_refresh_task = asyncio.get_running_loop().create_task(
        _token_refresh_loop(instance_name)
    )

    return engine
```
For Databricks Apps, use environment variables for configuration:
```python
# Environment variables set by Databricks Apps:
# - LAKEBASE_INSTANCE_NAME: Instance name
# - LAKEBASE_DATABASE_NAME: Database name
# - LAKEBASE_USERNAME: Username (optional, defaults to the service principal)
# - LAKEBASE_PG_URL: Full PostgreSQL connection URL (alternative to the above)
import os

def is_lakebase_configured() -> bool:
    """Check if Lakebase is configured for this app."""
    return bool(
        os.environ.get("LAKEBASE_PG_URL") or
        (os.environ.get("LAKEBASE_INSTANCE_NAME") and
         os.environ.get("LAKEBASE_DATABASE_NAME"))
    )
```
Add Lakebase as an app resource via CLI:
```bash
databricks apps add-resource $APP_NAME \
  --resource-type database \
  --resource-name lakebase \
  --database-instance my-lakebase-instance
```
```python
from databricks.sdk import WorkspaceClient

w = WorkspaceClient()

# Register the database in Unity Catalog
w.database.register_database_instance(
    name="my-lakebase-instance",
    catalog="my_catalog",
    schema="my_schema"
)
```
Declare Lakebase as a model resource for automatic credential provisioning:
```python
import mlflow
from mlflow.models.resources import DatabricksLakebase

resources = [
    DatabricksLakebase(database_instance_name="my-lakebase-instance"),
]

# When logging the model
mlflow.langchain.log_model(
    model,
    artifact_path="model",
    resources=resources,
    pip_requirements=["databricks-langchain[memory]"]
)
```
The following MCP tools are available for managing Lakebase infrastructure. Use type="provisioned" for Lakebase Provisioned.
| Tool | Description |
|---|---|
| `create_or_update_lakebase_database` | Create or update a database. Finds by name, creates if new, updates if existing. Use `type="provisioned"`, `capacity` (CU_1–CU_8), and `stopped` params. |
| `get_lakebase_database` | Get database details or list all. Pass `name` to get one, omit to list all. Use `type="provisioned"` to filter. |
| `delete_lakebase_database` | Delete a database and its resources. Use `type="provisioned"`; `force=True` to cascade. |
| `generate_lakebase_credential` | Generate an OAuth token for PostgreSQL connections (1-hour expiry). Pass `instance_names` for provisioned. |
| Tool | Description |
|---|---|
| `create_or_update_lakebase_sync` | Set up reverse ETL: ensures the UC catalog registration exists, then creates a synced table from Delta to Lakebase. Params: `instance_name`, `source_table_name`, `target_table_name`, `scheduling_policy` (`"TRIGGERED"`/`"SNAPSHOT"`/`"CONTINUOUS"`). |
| `delete_lakebase_sync` | Remove a synced table and optionally its UC catalog registration. |
```bash
# Create instance
databricks database create-database-instance \
  --name my-lakebase-instance \
  --capacity CU_1

# Get instance details
databricks database get-database-instance --name my-lakebase-instance

# Generate credentials
databricks database generate-database-credential \
  --request-id $(uuidgen) \
  --json '{"instance_names": ["my-lakebase-instance"]}'

# List instances
databricks database list-database-instances

# Stop instance (saves cost)
databricks database stop-database-instance --name my-lakebase-instance

# Start instance
databricks database start-database-instance --name my-lakebase-instance
```
| Issue | Solution |
|---|---|
| Token expired during long query | Implement token refresh loop (see SQLAlchemy with Token Refresh section); tokens expire after 1 hour |
| DNS resolution fails on macOS | Resolve the hostname (e.g. with `dig`) and pass `hostaddr` to psycopg |
| Connection refused | Ensure the instance is not stopped; check `instance.state` |
| Permission denied | The user must be granted access to the Lakebase instance |
| SSL required error | Always use `sslmode=require` in the connection string |
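For the macOS DNS issue in the table above, the hostname can be resolved in Python instead of shelling out to `dig`. The connection sketch below is illustrative, assuming psycopg3 and the instance/credential objects from the Quickstart:

```python
import socket

def resolve_hostaddr(host: str) -> str:
    """Resolve a hostname to an IPv4 address (equivalent to `dig +short`)."""
    return socket.gethostbyname(host)

# Usage sketch with psycopg3: keep `host` so TLS certificate verification
# still sees the hostname, and pass the pre-resolved IP as `hostaddr`
# (a libpq parameter) to skip the failing DNS lookup.
# conn = psycopg.connect(
#     host=instance.read_write_dns,
#     hostaddr=resolve_hostaddr(instance.read_write_dns),
#     dbname="postgres",
#     user=username,
#     password=cred.token,
#     sslmode="require",
# )
```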
Notes:
- If DNS resolution fails, pass the `hostaddr` parameter to psycopg as a workaround (see Troubleshooting).
- SQLAlchemy connections use the `postgresql+psycopg` driver.

Install dependencies:

```
%pip install -U "databricks-sdk>=0.81.0" "psycopg[binary]>=3.0" sqlalchemy
```
Capacity options: `CU_1`, `CU_2`, `CU_4`, `CU_8`. For agent memory, use `databricks-langchain[memory]`, which includes Lakebase support.