Guides production Spark Structured Streaming pipelines on Databricks: Kafka ingestion, triggers, watermarks, checkpoints, stream joins, multi-sink writes, merges, and performance tuning.
npx claudepluginhub databricks-solutions/ai-dev-kit --plugin databricks-ai-dev-kit

This skill uses the workspace's default tool permissions.
Build production-ready streaming pipelines with Spark Structured Streaming. This skill is an index that points to detailed patterns and best practices.
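
    from pyspark.sql.functions import col, from_json
    from pyspark.sql.types import StringType, StructField, StructType, TimestampType

    # Placeholder schema for the JSON payload (field names are illustrative);
    # replace it with the real message schema.
    schema = StructType([
        StructField("id", StringType()),
        StructField("event_time", TimestampType()),
    ])

    # Basic Kafka to Delta streaming: read the topic and parse the JSON value column
    df = (spark
        .readStream
        .format("kafka")
        .option("kafka.bootstrap.servers", "broker:9092")
        .option("subscribe", "topic")
        .load()
        .select(from_json(col("value").cast("string"), schema).alias("data"))
        .select("data.*")
    )

    # Append to Delta; the checkpoint location lets the query resume after a restart
    df.writeStream \
        .format("delta") \
        .outputMode("append") \
        .option("checkpointLocation", "/Volumes/catalog/checkpoints/stream") \
        .trigger(processingTime="30 seconds") \
        .start("/delta/target_table")
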
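
The `trigger` setting in the example above controls how often micro-batches run, which is the main lever in the trade-off between latency and cost. The following is a minimal sketch, assuming a hypothetical helper `choose_trigger` and an illustrative rule (no latency target means the stream runs as an incremental batch job). The keyword names `processingTime` and `availableNow` are those of PySpark's `DataStreamWriter.trigger` (`availableNow` requires Spark 3.3 or later); the decision rule itself is an assumption, not guidance taken from this skill.

```python
from typing import Dict, Optional, Union


def choose_trigger(latency_target_seconds: Optional[int]) -> Dict[str, Union[str, bool]]:
    """Return keyword arguments for DataStreamWriter.trigger().

    Hypothetical helper: the rule below is illustrative only.
    """
    if latency_target_seconds is None:
        # No latency requirement: process everything available, then stop,
        # so the cluster does not need to stay up between runs.
        return {"availableNow": True}
    if latency_target_seconds <= 0:
        raise ValueError("latency_target_seconds must be positive")
    # Fixed-interval micro-batches, as in the Kafka to Delta example.
    return {"processingTime": f"{latency_target_seconds} seconds"}
```

The result can be unpacked into the writer, for example `df.writeStream.trigger(**choose_trigger(30))`, which is equivalent to `.trigger(processingTime="30 seconds")`.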
| Pattern | Description | Reference |
|---|---|---|
| Kafka Streaming | Kafka to Delta, Kafka to Kafka, Real-Time Mode | See kafka-streaming.md |
| Stream Joins | Stream-stream joins, stream-static joins | See stream-stream-joins.md, stream-static-joins.md |
| Multi-Sink Writes | Write to multiple tables, parallel merges | See multi-sink-writes.md |
| Merge Operations | MERGE performance, parallel merges, optimizations | See merge-operations.md |
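
As a rough illustration of the multi-sink pattern listed above, one common approach is to pass a function to `foreachBatch` and write each micro-batch to several tables. The sketch below is not taken from multi-sink-writes.md: the table names, the `status` column, and the function name `write_to_sinks` are all hypothetical.

```python
from typing import Any

# Hypothetical table names, used only for illustration.
RAW_TABLE = "catalog.schema.events_raw"
ERROR_TABLE = "catalog.schema.events_errors"


def write_to_sinks(batch_df: Any, batch_id: int) -> None:
    """Write one micro-batch to two Delta tables.

    Intended to be passed to DataStreamWriter.foreachBatch, which calls it
    with the micro-batch DataFrame and the batch id on every trigger.
    """
    # Cache the batch so the source is not recomputed once per sink.
    batch_df.persist()
    try:
        # Sink 1: every row of the batch.
        batch_df.write.format("delta").mode("append").saveAsTable(RAW_TABLE)
        # Sink 2: only rows flagged as errors (assumes a 'status' column).
        (batch_df
            .filter("status = 'error'")
            .write.format("delta").mode("append").saveAsTable(ERROR_TABLE))
    finally:
        # Release the cached batch even if one of the writes fails.
        batch_df.unpersist()
```

It would be attached with something like `df.writeStream.foreachBatch(write_to_sinks).option("checkpointLocation", ...).start()`. Because a failed batch may be retried, plain appends like these can produce duplicates, so a production version would need idempotent writes.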

| Topic | Description | Reference |
|---|---|---|
| Checkpoints | Checkpoint management and best practices | See checkpoint-best-practices.md |
| Stateful Operations | Watermarks, state stores, RocksDB configuration | See stateful-operations.md |
| Trigger & Cost | Trigger selection, cost optimization, Real-Time Mode (RTM) | See trigger-and-cost-optimization.md |
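
The watermark concept behind the stateful-operations entry can be illustrated without a cluster. The sketch below is a simplified pure-Python model, not Spark's implementation: it assumes the watermark is the maximum event time seen in earlier micro-batches minus the allowed delay, and that events older than the watermark are discarded. The function name `apply_watermark` is hypothetical.

```python
from datetime import datetime, timedelta
from typing import List, Optional, Tuple


def apply_watermark(
    batches: List[List[datetime]],
    delay: timedelta,
) -> List[Tuple[List[datetime], Optional[datetime]]]:
    """Simulate watermark-based late-data filtering over micro-batches.

    For each batch, returns the event times that were kept and the watermark
    that was in force while the batch was processed.
    """
    max_event_time: Optional[datetime] = None
    results = []
    for batch in batches:
        # The watermark is derived from event times seen in earlier batches.
        watermark = None if max_event_time is None else max_event_time - delay
        kept = [t for t in batch if watermark is None or t >= watermark]
        results.append((kept, watermark))
        # Advance the maximum event time once the batch has been processed.
        for t in batch:
            if max_event_time is None or t > max_event_time:
                max_event_time = t
    return results
```

In this model, a larger `delay` keeps more late events at the cost of holding state longer, which is the trade-off a watermark threshold controls.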

| Topic | Description | Reference |
|---|---|---|
| Production Checklist | Comprehensive best practices | See streaming-best-practices.md |