Builds Python Databricks apps with Streamlit, Dash, Gradio, Flask, FastAPI, or Reflex, handling OAuth, SQL/Lakebase connectivity, model serving, LLMs, and deployment.
Build Python-based Databricks applications. For full examples and recipes, see the **[Databricks Apps Cookbook](https://apps-cookbook.dev/)**.
- Use Config() for authentication (never hardcode tokens)
- Use app.yaml valueFrom for resources (never hardcode resource IDs)
- Use dash-bootstrap-components for Dash app layout and styling
- Use @st.cache_resource for Streamlit database connections

Copy this checklist and verify each item:
- [ ] Framework selected
- [ ] Auth strategy decided: app auth, user auth, or both
- [ ] App resources identified (SQL warehouse, Lakebase, serving endpoint, etc.)
- [ ] Backend data strategy decided (SQL warehouse, Lakebase, or SDK)
- [ ] Deployment method: CLI or DABs
| Framework | Best For | app.yaml Command |
|---|---|---|
| Dash | Production dashboards, BI tools, complex interactivity | ["python", "app.py"] |
| Streamlit | Rapid prototyping, data science apps, internal tools | ["streamlit", "run", "app.py"] |
| Gradio | ML demos, model interfaces, chat UIs | ["python", "app.py"] |
| Flask | Custom REST APIs, lightweight apps, webhooks | ["gunicorn", "app:app", "-w", "4", "-b", "0.0.0.0:8000"] |
| FastAPI | Async APIs, auto-generated OpenAPI docs | ["uvicorn", "app:app", "--host", "0.0.0.0", "--port", "8000"] |
| Reflex | Full-stack Python apps without JavaScript | ["reflex", "run", "--env", "prod"] |
Default: Recommend Streamlit for prototypes, Dash for production dashboards, FastAPI for APIs, Gradio for ML demos.
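
The table above can be captured as a small lookup, for example when a script scaffolds the command entry of app.yaml. This is a minimal sketch: `APP_COMMANDS` and `command_for` are hypothetical names introduced here, and the commands are copied verbatim from the table.

```python
# Hypothetical lookup: framework name -> app.yaml command, copied from the table above.
APP_COMMANDS = {
    "dash": ["python", "app.py"],
    "streamlit": ["streamlit", "run", "app.py"],
    "gradio": ["python", "app.py"],
    "flask": ["gunicorn", "app:app", "-w", "4", "-b", "0.0.0.0:8000"],
    "fastapi": ["uvicorn", "app:app", "--host", "0.0.0.0", "--port", "8000"],
    "reflex": ["reflex", "run", "--env", "prod"],
}


def command_for(framework: str) -> list[str]:
    """Return a copy of the app.yaml command for a framework (case-insensitive)."""
    key = framework.strip().lower()
    if key not in APP_COMMANDS:
        raise ValueError(f"Unknown framework: {framework!r}")
    # Return a copy so callers cannot mutate the shared table.
    return list(APP_COMMANDS[key])
```

Calling `command_for("Streamlit")` returns the Streamlit row's command; an unknown name raises `ValueError` rather than silently falling back to a default.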
| Concept | Details |
|---|---|
| Runtime | Python 3.11, Ubuntu 22.04, 2 vCPU, 6 GB RAM |
| Pre-installed | Dash 2.18.1, Streamlit 1.38.0, Gradio 4.44.0, Flask 3.0.3, FastAPI 0.115.0 |
| Auth (app) | Service principal via Config() — auto-injected DATABRICKS_CLIENT_ID/DATABRICKS_CLIENT_SECRET |
| Auth (user) | x-forwarded-access-token header — see 1-authorization.md |
| Resources | valueFrom in app.yaml — see 2-app-resources.md |
| Cookbook | https://apps-cookbook.dev/ |
| Docs | https://docs.databricks.com/aws/en/dev-tools/databricks-apps/ |
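
To make the user-auth row concrete, here is a hedged sketch of reading the forwarded token from request headers. `get_user_token` is a hypothetical helper, not part of any Databricks SDK; it accepts a plain mapping of header names to values because each framework exposes request headers differently.

```python
from typing import Mapping, Optional

# Header named in the table above; it carries the user's access token.
USER_TOKEN_HEADER = "x-forwarded-access-token"


def get_user_token(headers: Mapping[str, str]) -> Optional[str]:
    """Return the forwarded user token, or None when the header is absent or empty."""
    for name, value in headers.items():
        # HTTP header names are case-insensitive, so compare in lowercase.
        if name.lower() == USER_TOKEN_HEADER:
            return value or None
    return None
```

Returning `None` instead of raising lets the caller decide whether to fall back to app (service principal) auth or to reject the request.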
Authorization: Use 1-authorization.md when configuring app or user authorization — covers service principal auth, on-behalf-of user tokens, OAuth scopes, and per-framework code examples. (Keywords: OAuth, service principal, user auth, on-behalf-of, access token, scopes)
App resources: Use 2-app-resources.md when connecting your app to Databricks resources — covers SQL warehouses, Lakebase, model serving, secrets, volumes, and the valueFrom pattern. (Keywords: resources, valueFrom, SQL warehouse, model serving, secrets, volumes, connections)
Frameworks: See 3-frameworks.md for Databricks-specific patterns per framework — covers Dash, Streamlit, Gradio, Flask, FastAPI, and Reflex with auth integration, deployment commands, and Cookbook links. (Keywords: Dash, Streamlit, Gradio, Flask, FastAPI, Reflex, framework selection)
Deployment: Use 4-deployment.md when deploying your app — covers Databricks CLI, Asset Bundles (DABs), app.yaml configuration, and post-deployment verification. (Keywords: deploy, CLI, DABs, asset bundles, app.yaml, logs)
Lakebase: Use 5-lakebase.md when using Lakebase (PostgreSQL) as your app's data layer — covers auto-injected env vars, psycopg2/asyncpg patterns, and when to choose Lakebase vs SQL warehouse. (Keywords: Lakebase, PostgreSQL, psycopg2, asyncpg, transactional, PGHOST)
MCP tools: Use 6-mcp-approach.md for managing app lifecycle via MCP tools — covers creating, deploying, monitoring, and deleting apps programmatically. (Keywords: MCP, create app, deploy app, app logs)
Foundation Models: See examples/llm_config.py for calling Databricks foundation model APIs — covers OAuth M2M auth, OpenAI-compatible client wiring, and token caching. (Keywords: foundation model, LLM, OpenAI client, chat completions)
1. Determine the task type:
   - New app from scratch? → Use the framework selection table above, then read 3-frameworks.md
   - Setting up authorization? → Read 1-authorization.md
   - Connecting to data/resources? → Read 2-app-resources.md
   - Using Lakebase (PostgreSQL)? → Read 5-lakebase.md
   - Deploying to Databricks? → Read 4-deployment.md
   - Using MCP tools? → Read 6-mcp-approach.md
   - Calling foundation model/LLM APIs? → See examples/llm_config.py
2. Follow the instructions in the relevant guide.
3. For full code examples, browse https://apps-cookbook.dev/
All Python Databricks apps follow this pattern:
```
app-directory/
├── app.py             # Main application (or framework-specific name)
├── models.py          # Pydantic data models
├── backend.py         # Data access layer
├── requirements.txt   # Additional Python dependencies
├── app.yaml           # Databricks Apps configuration
└── README.md
```
```python
import os
from databricks.sdk.core import Config

# Defaults to the mock backend unless USE_MOCK_BACKEND is set to something other than "true".
USE_MOCK = os.getenv("USE_MOCK_BACKEND", "true").lower() == "true"

if USE_MOCK:
    from backend_mock import MockBackend as Backend
else:
    from backend_real import RealBackend as Backend

backend = Backend()
```
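
The toggle above only works if `MockBackend` and `RealBackend` expose the same interface. The following is a minimal sketch of what a `backend_mock.py` might contain; the `BackendProtocol` name, the `list_entities` method, and the sample row are illustrative assumptions and do not come from the original skill.

```python
from typing import Protocol


class BackendProtocol(Protocol):
    """Hypothetical interface that both the mock and the real backend would satisfy."""

    def list_entities(self) -> list[dict]: ...


class MockBackend:
    """In-memory stand-in for local development (illustrative data only)."""

    def __init__(self) -> None:
        self._rows = [{"id": "1", "name": "Sample entity", "status": "active"}]

    def list_entities(self) -> list[dict]:
        # Return copies so callers cannot mutate the mock's internal state.
        return [dict(row) for row in self._rows]


def count_entities(backend: BackendProtocol) -> int:
    """Example consumer that depends only on the shared interface."""
    return len(backend.list_entities())
```

Because the consumer depends only on the protocol, swapping in the real backend should not require changes to UI code.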
```python
import os

from databricks import sql
from databricks.sdk.core import Config

cfg = Config()  # Auto-detects credentials from environment
conn = sql.connect(
    server_hostname=cfg.host,
    http_path=f"/sql/1.0/warehouses/{os.getenv('DATABRICKS_WAREHOUSE_ID')}",
    credentials_provider=lambda: cfg.authenticate,
)
```
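
One caveat with the connection snippet above: if `DATABRICKS_WAREHOUSE_ID` is unset, `os.getenv` returns `None` and the path silently becomes `/sql/1.0/warehouses/None`. A small guard can fail fast instead. This is a sketch; `warehouse_http_path` is a hypothetical helper name, and the optional `env` parameter exists only to make it easy to test.

```python
import os
from typing import Mapping, Optional


def warehouse_http_path(env: Optional[Mapping[str, str]] = None) -> str:
    """Build the SQL warehouse HTTP path, raising if the warehouse ID is missing."""
    env = os.environ if env is None else env
    warehouse_id = env.get("DATABRICKS_WAREHOUSE_ID", "").strip()
    if not warehouse_id:
        raise RuntimeError(
            "DATABRICKS_WAREHOUSE_ID is not set; add the SQL warehouse as an app resource"
        )
    return f"/sql/1.0/warehouses/{warehouse_id}"
```

The result can be passed as `http_path=warehouse_http_path()` in the `sql.connect(...)` call.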
```python
from datetime import datetime
from enum import Enum

from pydantic import BaseModel, Field


class Status(str, Enum):
    ACTIVE = "active"
    PENDING = "pending"


class EntityOut(BaseModel):
    id: str
    name: str
    status: Status
    created_at: datetime


class EntityIn(BaseModel):
    name: str = Field(..., min_length=1)
    status: Status = Status.PENDING
```
| Issue | Solution |
|---|---|
| Connection exhausted | Use @st.cache_resource (Streamlit) or connection pooling |
| Auth token not found | Check x-forwarded-access-token header — only available when deployed, not locally |
| App won't start | Check app.yaml command matches framework; check databricks apps logs <name> |
| Resource not accessible | Add resource via UI, verify SP has permissions, use valueFrom in app.yaml |
| Import error on deploy | Add missing packages to requirements.txt (pre-installed packages don't need listing) |
| Lakebase app crashes on start | psycopg2/asyncpg are NOT pre-installed — MUST add to requirements.txt |
| Port conflict | Apps must bind to DATABRICKS_APP_PORT env var (defaults to 8000). Never use 8080. Streamlit is auto-configured; for others, read the env var in code or use 8000 in app.yaml command |
| Streamlit: set_page_config error | st.set_page_config() must be the first Streamlit command |
| Dash: unstyled layout | Add dash-bootstrap-components; use dbc.themes.BOOTSTRAP |
| Slow queries | Use Lakebase for transactional/low-latency; SQL warehouse for analytical queries |
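
For the port row above, frameworks other than Streamlit can read the port in code rather than hardcoding it. A minimal sketch, assuming the `DATABRICKS_APP_PORT` variable and the 8000 default described in the table; `app_port` is a hypothetical helper name.

```python
import os
from typing import Mapping, Optional


def app_port(env: Optional[Mapping[str, str]] = None, default: int = 8000) -> int:
    """Return the port the app should bind to, falling back to the default."""
    env = os.environ if env is None else env
    return int(env.get("DATABRICKS_APP_PORT", default))
```

In a Flask app, for instance, this could be used as `app.run(host="0.0.0.0", port=app_port())`.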
| Constraint | Details |
|---|---|
| Runtime | Python 3.11, Ubuntu 22.04 LTS |
| Compute | 2 vCPUs, 6 GB memory (default) |
| Pre-installed frameworks | Dash, Streamlit, Gradio, Flask, FastAPI, Shiny |
| Custom packages | Add to requirements.txt in app root |
| Network | Apps can reach Databricks APIs; external access depends on workspace config |
| User auth | Public Preview — workspace admin must enable before adding scopes |