Skill

databricks-app-python

Builds Python Databricks apps with Streamlit, Dash, Gradio, Flask, FastAPI, or Reflex, handling OAuth, SQL/Lakebase connectivity, model serving, LLMs, and deployment.

Python

Flask

FastAPI

data-engineering

backend

Install

npx claudepluginhub databricks-solutions/ai-dev-kit --plugin databricks-ai-dev-kit

Tool Access

This skill uses the workspace's default tool permissions.

Supporting Assets

- 1-authorization.md
- 2-app-resources.md
- 3-frameworks.md
- 4-deployment.md
- 5-lakebase.md
- 6-mcp-approach.md
- examples/fm-minimal-chat.py
- examples/fm-parallel-calls.py
- examples/fm-structured-outputs.py
- examples/llm_config.py

SKILL.md

Stats

Stars: 1254

Forks: 262

Last Commit: Mar 17, 2026

Databricks Python Application

Build Python-based Databricks applications. For full examples and recipes, see the **[Databricks Apps Cookbook](https://apps-cookbook.dev/)**.

Critical Rules (always follow)

MUST confirm framework choice or use Framework Selection below
MUST use SDK Config() for authentication (never hardcode tokens)
MUST use app.yaml valueFrom for resources (never hardcode resource IDs)
MUST use dash-bootstrap-components for Dash app layout and styling
MUST use @st.cache_resource for Streamlit database connections
MUST deploy Flask with Gunicorn, FastAPI with uvicorn (not dev servers)

Required Steps

Copy this checklist and verify each item:

- [ ] Framework selected
- [ ] Auth strategy decided: app auth, user auth, or both
- [ ] App resources identified (SQL warehouse, Lakebase, serving endpoint, etc.)
- [ ] Backend data strategy decided (SQL warehouse, Lakebase, or SDK)
- [ ] Deployment method: CLI or DABs

Framework Selection

| Framework | Best For | app.yaml Command |
| --- | --- | --- |
| Dash | Production dashboards, BI tools, complex interactivity | `["python", "app.py"]` |
| Streamlit | Rapid prototyping, data science apps, internal tools | `["streamlit", "run", "app.py"]` |
| Gradio | ML demos, model interfaces, chat UIs | `["python", "app.py"]` |
| Flask | Custom REST APIs, lightweight apps, webhooks | `["gunicorn", "app:app", "-w", "4", "-b", "0.0.0.0:8000"]` |
| FastAPI | Async APIs, auto-generated OpenAPI docs | `["uvicorn", "app:app", "--host", "0.0.0.0", "--port", "8000"]` |
| Reflex | Full-stack Python apps without JavaScript | `["reflex", "run", "--env", "prod"]` |

Default: Recommend Streamlit for prototypes, Dash for production dashboards, FastAPI for APIs, Gradio for ML demos.
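
The commands in the table are the values that go in the `command` field of app.yaml. As a rough sketch of how a command and a `valueFrom` resource reference fit together, the helper below renders an app.yaml body as text. The function name `render_app_yaml` and the resource key `sql-warehouse` are hypothetical; the key must match whatever resource key is configured for the app.

```python
import json

# Start commands copied from the Framework Selection table.
COMMANDS = {
    "dash": ["python", "app.py"],
    "streamlit": ["streamlit", "run", "app.py"],
    "gradio": ["python", "app.py"],
    "flask": ["gunicorn", "app:app", "-w", "4", "-b", "0.0.0.0:8000"],
    "fastapi": ["uvicorn", "app:app", "--host", "0.0.0.0", "--port", "8000"],
    "reflex": ["reflex", "run", "--env", "prod"],
}


def render_app_yaml(framework: str, env_name: str, resource_key: str) -> str:
    """Return app.yaml text: a start command plus one valueFrom-backed env var."""
    command = COMMANDS[framework.lower()]  # raises KeyError for unknown frameworks
    # A JSON list is also a valid YAML flow sequence, so json.dumps is enough here.
    return (
        f"command: {json.dumps(command)}\n"
        "env:\n"
        f"  - name: {env_name}\n"
        f"    valueFrom: {resource_key}\n"
    )


# "sql-warehouse" is a hypothetical resource key used only for illustration.
print(render_app_yaml("fastapi", "DATABRICKS_WAREHOUSE_ID", "sql-warehouse"))
```

Keeping the resource behind `valueFrom` means the warehouse ID never appears in the repository, which is the point of the "never hardcode resource IDs" rule.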

Quick Reference

| Concept | Details |
| --- | --- |
| Runtime | Python 3.11, Ubuntu 22.04, 2 vCPU, 6 GB RAM |
| Pre-installed | Dash 2.18.1, Streamlit 1.38.0, Gradio 4.44.0, Flask 3.0.3, FastAPI 0.115.0 |
| Auth (app) | Service principal via `Config()`, using auto-injected `DATABRICKS_CLIENT_ID`/`DATABRICKS_CLIENT_SECRET` |
| Auth (user) | `x-forwarded-access-token` header; see 1-authorization.md |
| Resources | `valueFrom` in app.yaml; see 2-app-resources.md |
| Cookbook | https://apps-cookbook.dev/ |
| Docs | https://docs.databricks.com/aws/en/dev-tools/databricks-apps/ |
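
For user auth, the token arrives in the `x-forwarded-access-token` request header. A minimal, framework-agnostic sketch of reading it is shown below. The helper name `get_user_token` is hypothetical, and it assumes the framework exposes request headers as a mapping; 1-authorization.md has the per-framework versions.

```python
from typing import Mapping, Optional

USER_TOKEN_HEADER = "x-forwarded-access-token"


def get_user_token(headers: Mapping[str, str]) -> Optional[str]:
    """Return the forwarded user token, or None when the header is absent."""
    # HTTP header names are case-insensitive, so normalize before the lookup.
    lowered = {name.lower(): value for name, value in headers.items()}
    token = lowered.get(USER_TOKEN_HEADER, "").strip()
    return token or None


# Deployed: the platform forwards the header. Local run: it is missing.
print(get_user_token({"X-Forwarded-Access-Token": "example-token"}))  # example-token
print(get_user_token({"Accept": "text/html"}))  # None
```

Returning `None` instead of raising lets a locally run app fall back to app auth or a mock backend.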

Detailed Guides

Authorization: Use 1-authorization.md when configuring app or user authorization — covers service principal auth, on-behalf-of user tokens, OAuth scopes, and per-framework code examples. (Keywords: OAuth, service principal, user auth, on-behalf-of, access token, scopes)

App resources: Use 2-app-resources.md when connecting your app to Databricks resources — covers SQL warehouses, Lakebase, model serving, secrets, volumes, and the valueFrom pattern. (Keywords: resources, valueFrom, SQL warehouse, model serving, secrets, volumes, connections)

Frameworks: See 3-frameworks.md for Databricks-specific patterns per framework — covers Dash, Streamlit, Gradio, Flask, FastAPI, and Reflex with auth integration, deployment commands, and Cookbook links. (Keywords: Dash, Streamlit, Gradio, Flask, FastAPI, Reflex, framework selection)

Deployment: Use 4-deployment.md when deploying your app — covers Databricks CLI, Asset Bundles (DABs), app.yaml configuration, and post-deployment verification. (Keywords: deploy, CLI, DABs, asset bundles, app.yaml, logs)

Lakebase: Use 5-lakebase.md when using Lakebase (PostgreSQL) as your app's data layer — covers auto-injected env vars, psycopg2/asyncpg patterns, and when to choose Lakebase vs SQL warehouse. (Keywords: Lakebase, PostgreSQL, psycopg2, asyncpg, transactional, PGHOST)

MCP tools: Use 6-mcp-approach.md for managing app lifecycle via MCP tools — covers creating, deploying, monitoring, and deleting apps programmatically. (Keywords: MCP, create app, deploy app, app logs)

Foundation Models: See examples/llm_config.py for calling Databricks foundation model APIs — covers OAuth M2M auth, OpenAI-compatible client wiring, and token caching. (Keywords: foundation model, LLM, OpenAI client, chat completions)

Workflow

1. Determine the task type:
   - New app from scratch? → Use Framework Selection, then read 3-frameworks.md
   - Setting up authorization? → Read 1-authorization.md
   - Connecting to data/resources? → Read 2-app-resources.md
   - Using Lakebase (PostgreSQL)? → Read 5-lakebase.md
   - Deploying to Databricks? → Read 4-deployment.md
   - Using MCP tools? → Read 6-mcp-approach.md
   - Calling foundation model/LLM APIs? → See examples/llm_config.py
2. Follow the instructions in the relevant guide
3. For full code examples, browse https://apps-cookbook.dev/

Core Architecture

All Python Databricks apps follow this pattern:

app-directory/
├── app.py                 # Main application (or framework-specific name)
├── models.py              # Pydantic data models
├── backend.py             # Data access layer
├── requirements.txt       # Additional Python dependencies
├── app.yaml               # Databricks Apps configuration
└── README.md

Backend Toggle Pattern

import os

# Default to the mock backend so the app can run without a live data source
USE_MOCK = os.getenv("USE_MOCK_BACKEND", "true").lower() == "true"

if USE_MOCK:
    from backend_mock import MockBackend as Backend
else:
    from backend_real import RealBackend as Backend

# Both classes are expected to expose the same methods
backend = Backend()
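
The toggle relies on `MockBackend` and `RealBackend` exposing the same interface, but neither class is shown here. One possible shape, kept in a single file for brevity, is sketched below. The `list_entities` method, the `Backend` protocol, and the `make_backend` factory are illustrative assumptions, not part of the skill.

```python
import os
from typing import Mapping, Protocol


class Backend(Protocol):
    """Interface that both backends are assumed to share."""

    def list_entities(self) -> list[dict]: ...


class MockBackend:
    """In-memory stand-in for local development."""

    def list_entities(self) -> list[dict]:
        return [{"id": "1", "name": "sample", "status": "active"}]


class RealBackend:
    """Placeholder for a class that would query a SQL warehouse or Lakebase."""

    def list_entities(self) -> list[dict]:
        raise NotImplementedError("wire this to your data source")


def make_backend(env: Mapping[str, str] = os.environ) -> Backend:
    """Pick a backend using the same USE_MOCK_BACKEND check as the toggle."""
    use_mock = env.get("USE_MOCK_BACKEND", "true").lower() == "true"
    return MockBackend() if use_mock else RealBackend()


backend = make_backend({"USE_MOCK_BACKEND": "true"})
print(backend.list_entities())
```

Passing the environment as a parameter keeps the selection logic testable without mutating `os.environ`.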

SQL Warehouse Connection (shared across all frameworks)

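import os

from databricks import sql
from databricks.sdk.core import Config

cfg = Config()  # Auto-detects credentials from the environment
conn = sql.connect(
    server_hostname=cfg.host,
    # DATABRICKS_WAREHOUSE_ID is expected to be set via valueFrom in app.yaml
    http_path=f"/sql/1.0/warehouses/{os.getenv('DATABRICKS_WAREHOUSE_ID')}",
    # Hand the SDK's authenticate callable to the SQL connector
    credentials_provider=lambda: cfg.authenticate,
)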
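
If `DATABRICKS_WAREHOUSE_ID` is unset, `os.getenv` returns `None` and the f-string silently produces the path `/sql/1.0/warehouses/None`. A small guard, sketched here with a hypothetical helper name `warehouse_http_path`, fails early with a clearer message instead.

```python
import os
from typing import Mapping


def warehouse_http_path(env: Mapping[str, str] = os.environ) -> str:
    """Build the SQL warehouse HTTP path, failing fast if the ID is missing."""
    warehouse_id = env.get("DATABRICKS_WAREHOUSE_ID", "").strip()
    if not warehouse_id:
        raise RuntimeError(
            "DATABRICKS_WAREHOUSE_ID is not set; add the SQL warehouse as an "
            "app resource and expose it with valueFrom in app.yaml"
        )
    return f"/sql/1.0/warehouses/{warehouse_id}"


# "abc123" is a made-up warehouse ID used only for illustration.
print(warehouse_http_path({"DATABRICKS_WAREHOUSE_ID": "abc123"}))  # /sql/1.0/warehouses/abc123
```

The result can be passed as `http_path` to `sql.connect` in place of the inline f-string.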

Pydantic Models

from pydantic import BaseModel, Field
from datetime import datetime
from enum import Enum

class Status(str, Enum):
    ACTIVE = "active"
    PENDING = "pending"

class EntityOut(BaseModel):
    id: str
    name: str
    status: Status
    created_at: datetime

class EntityIn(BaseModel):
    name: str = Field(..., min_length=1)
    status: Status = Status.PENDING

Common Issues

| Issue | Solution |
| --- | --- |
| Connection exhausted | Use `@st.cache_resource` (Streamlit) or connection pooling |
| Auth token not found | Check the `x-forwarded-access-token` header; it is only available when deployed, not locally |
| App won't start | Check that the `app.yaml` command matches the framework; check `databricks apps logs <name>` |
| Resource not accessible | Add the resource via the UI, verify the SP has permissions, use `valueFrom` in app.yaml |
| Import error on deploy | Add missing packages to `requirements.txt` (pre-installed packages don't need listing) |
| Lakebase app crashes on start | `psycopg2`/`asyncpg` are NOT pre-installed; MUST add to `requirements.txt` |
| Port conflict | Apps must bind to the `DATABRICKS_APP_PORT` env var (defaults to 8000). Never use 8080. Streamlit is auto-configured; for others, read the env var in code or use 8000 in the app.yaml command |
| Streamlit: set_page_config error | `st.set_page_config()` must be the first Streamlit command |
| Dash: unstyled layout | Add `dash-bootstrap-components`; use `dbc.themes.BOOTSTRAP` |
| Slow queries | Use Lakebase for transactional/low-latency; SQL warehouse for analytical queries |
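
For the port conflict row, reading the env var in code could look like the sketch below. The helper name `resolve_port` is hypothetical, and the 8000 fallback mirrors the default stated in the table.

```python
import os
from typing import Mapping

DEFAULT_PORT = 8000  # default stated for DATABRICKS_APP_PORT


def resolve_port(env: Mapping[str, str] = os.environ) -> int:
    """Return the port the app should bind to."""
    raw = env.get("DATABRICKS_APP_PORT", "").strip()
    # int() raises ValueError on a non-numeric value, which surfaces misconfiguration early.
    return int(raw) if raw else DEFAULT_PORT


print(resolve_port({}))  # 8000
print(resolve_port({"DATABRICKS_APP_PORT": "9000"}))  # 9000
```

A framework that starts its own server from Python can take the result as its port argument; for Gunicorn or uvicorn started from the app.yaml command, the port in the command line is what matters.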

Platform Constraints

| Constraint | Details |
| --- | --- |
| Runtime | Python 3.11, Ubuntu 22.04 LTS |
| Compute | 2 vCPUs, 6 GB memory (default) |
| Pre-installed frameworks | Dash, Streamlit, Gradio, Flask, FastAPI, Shiny |
| Custom packages | Add to `requirements.txt` in app root |
| Network | Apps can reach Databricks APIs; external access depends on workspace config |
| User auth | Public Preview; a workspace admin must enable it before adding scopes |

Official Documentation

Databricks Apps Overview — main docs hub
Apps Cookbook — ready-to-use code snippets (Streamlit, Dash, Reflex, FastAPI)
Authorization — app auth and user auth
Resources — SQL warehouse, Lakebase, serving, secrets
app.yaml Reference — command and env config
System Environment — pre-installed packages, runtime details

Related Skills

databricks-app-apx - full-stack apps with FastAPI + React
databricks-bundles - deploying apps via DABs
databricks-python-sdk - backend SDK integration
databricks-lakebase-provisioned - adding persistent PostgreSQL state
databricks-model-serving - serving ML models for app integration