stream-processing | systems-design | ClaudePluginHub

Skill

stream-processing

From systems-design

Guides real-time data processing pipelines, Kafka/Flink framework selection, event-driven architectures. Covers batch vs streaming, watermarks, windows, state management.

data-engineering

$ npx claudepluginhub melodic-software/claude-code-plugins --plugin systems-design

Popularity

Parent stars

67

Parent forks

10

Invocation

How this skill is triggered — by the user, by Claude, or both

Slash command

/systems-design:stream-processing

User invocable

Model invocable

Inline context

Default effort

Tool Access

This skill is limited to the following tools:

Read, Glob, Grep

Context Preview

The summary Claude sees in its skill listing — used to decide when to auto-load this skill

Patterns and technologies for real-time data processing, event streaming, and stream analytics.

SKILL.md

416 lines · ~2.2k tokens

Similar Skills

kafka-streams-programming

26

Architect, build, and debug Kafka Streams apps (JVM-embedded stream processing). Use when user mentions KStream, KTable, topology, TopologyTestDriver, StreamsBuilder, interactive queries, GlobalKTable, joins/windows/aggregations, or debugging issues (rebalancing, state stores, lag, deserialization errors). Do NOT trigger for Flink, connectors, CDC, or plain producer/consumer.

15 files

streaming-skills-plugin

role-database:streaming-databases

13

Guides operations for 14 streaming databases including Kafka (KRaft, Streams), Pulsar, Redpanda, Flink, Materialize for event streaming, CDC, real-time analytics, event sourcing.

6 files · 4 tools

databricks-spark-structured-streaming

1.5k

Guides production Spark Structured Streaming pipelines with Kafka ingestion, stream joins, watermarks, checkpoints, triggers, multi-sink writes, and cost tuning.

9 files

databricks-ai-dev-kit

Stats

Language: Python

Maintenance: Good

Last Commit: Feb 15, 2026

Tags

stream-processing

real-time-pipelines

streaming-concepts

Stream Processing

Patterns and technologies for real-time data processing, event streaming, and stream analytics.

When to Use This Skill

Designing real-time data pipelines
Choosing stream processing frameworks
Implementing event-driven architectures
Building real-time analytics
Understanding streaming vs batch trade-offs

Batch vs Streaming

Comparison

Aspect	Batch	Streaming
Latency	Minutes to hours	Milliseconds to seconds
Data	Bounded (finite)	Unbounded (infinite)
Processing	Process all at once	Process as it arrives
State	Recompute each run	Maintain continuously
Complexity	Lower	Higher
Cost	Often lower	Often higher

When to Use Streaming

Use streaming when:
- Real-time responses required (<1 minute)
- Events need immediate action (fraud, alerts)
- Data arrives continuously
- Users expect live updates
- Time-sensitive business decisions

Use batch when:
- Daily/hourly reports sufficient
- Complex transformations needed
- Cost optimization priority
- Historical analysis
- One-time processing

Stream Processing Concepts

Event Time vs Processing Time

Event Time: When event actually occurred
Processing Time: When event is processed

Example:
┌─────────────────────────────────────────────────────────┐
│ Event: Purchase at 10:00:00 (event time)                │
│ Network delay: 5 seconds                                │
│ Processing: 10:00:05 (processing time)                  │
└─────────────────────────────────────────────────────────┘

Why it matters:
- Late events need handling
- Ordering not guaranteed
- Watermarks track progress

Watermarks

Watermark(t) = "Events with a timestamp up to t are assumed to have arrived; anything older that shows up afterwards is late"

Event stream:
──[10:01]──[10:02]──[10:00]──[10:03]──[Watermark: 10:00]──

Allows system to:
- Know when window is complete
- Handle late events
- Balance latency vs completeness

Windows

Tumbling Window (fixed, non-overlapping):
|─────|─────|─────|
0     5    10    15 (seconds)

Sliding Window (fixed, overlapping):
|─────|
  |─────|
    |─────|
Size: 5s, Slide: 2s

Session Window (activity-based):
|──────|     |───────────|    |───|
User activity with gaps defines windows

Count Window:
Process every N events
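
The tumbling and sliding diagrams above can be sketched as window-assignment functions. This is a minimal illustration, not how any particular engine implements it; the function names are hypothetical, timestamps are plain integers (seconds), and windows are half-open [start, end) intervals aligned to zero.

```python
def tumbling_window(ts: int, size: int) -> tuple[int, int]:
    """Return the single [start, end) tumbling window containing ts."""
    start = ts - (ts % size)
    return (start, start + size)


def sliding_windows(ts: int, size: int, slide: int) -> list[tuple[int, int]]:
    """Return every [start, end) sliding window containing ts."""
    windows = []
    # Latest slide-aligned window start that is <= ts.
    start = ts - (ts % slide)
    # Walk backwards while the window still covers ts.
    while start > ts - size:
        windows.append((start, start + size))
        start -= slide
    return sorted(windows)


first = tumbling_window(7, size=5)
overlapping = sliding_windows(7, size=5, slide=2)
```

With size 5 and slide 2, an event at second 7 belongs to one tumbling window but to two sliding windows, which is why sliding windows multiply state and output volume.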

State Management

Stateful operations require maintained state:
- Aggregations (sum, count, avg)
- Joins between streams
- Pattern detection
- Deduplication

State backends:
- In-memory (fast, limited)
- RocksDB (larger, persistent)
- External (Redis, database)

Stream Processing Frameworks

Apache Kafka Streams

Characteristics:
- Library (not a cluster)
- Exactly-once semantics (for Kafka-to-Kafka processing)
- Kafka-native
- Java/Scala

Best for:
- Kafka-centric architectures
- Simpler transformations
- Microservices

Example topology:
source → filter → map → aggregate → sink

Apache Flink

Characteristics:
- Distributed cluster
- True streaming (not micro-batch)
- Advanced state management
- SQL support

Best for:
- Complex event processing
- Large-scale streaming
- Low-latency requirements

Example:
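// Fragment: assumes env, kafkaSource, CountAggregator, and sink are defined elsewhere.
DataStream<Event> events = env.addSource(kafkaSource)
    // Event-time windows only fire once timestamps and watermarks are assigned.
    // getTimestamp() is a hypothetical accessor on Event; the 5 s bound is an example value.
    .assignTimestampsAndWatermarks(
        WatermarkStrategy.<Event>forBoundedOutOfOrderness(Duration.ofSeconds(5))
            .withTimestampAssigner((e, ts) -> e.getTimestamp()));
events
    .keyBy(e -> e.getUserId())                              // partition state by user
    .window(TumblingEventTimeWindows.of(Time.minutes(5)))   // 5-minute tumbling windows
    .aggregate(new CountAggregator())                       // incremental count per window
    .addSink(sink);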

Apache Spark Streaming

Characteristics:
- Micro-batch processing
- Unified batch + streaming API
- Wide ecosystem
- Python, Scala, Java, R

Best for:
- Teams with Spark experience
- Batch + streaming unified
- Machine learning integration

Latency: Typically hundreds of milliseconds to seconds (micro-batch)

Kafka Streams vs Flink vs Spark

Factor	Kafka Streams	Flink	Spark Streaming
Deployment	Library	Cluster	Cluster
Latency	Low	Lowest	Medium
State	Good	Excellent	Good
Exactly-once	Yes	Yes	Yes
Complexity	Low	High	Medium
Scaling	With Kafka	Independent	Independent
SQL	Limited	Yes	Yes
ML integration	Limited	Limited	Excellent

Stream Processing Patterns

Filtering

Input: All events
Output: Events matching criteria

Example: Only process events where amount > 1000

Mapping/Transformation

Input: Event type A
Output: Event type B

Example: Enrich order events with customer data

Aggregation

Input: Multiple events
Output: Single aggregated result

Examples:
- Count events per window
- Sum amounts per user
- Average latency per endpoint
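
As a rough illustration of the first example (count events per window), the sketch below counts events per user per tumbling window. The function name and the (user, timestamp) event shape are assumptions made for the example; a real engine would keep these counters in keyed state and emit them when the window closes.

```python
from collections import defaultdict


def count_per_window(events, size):
    """Count (user, timestamp) events per user and tumbling window start."""
    counts = defaultdict(int)
    for user, ts in events:
        window_start = ts - (ts % size)  # align to the window grid
        counts[(user, window_start)] += 1
    return dict(counts)


sample = [("u1", 1), ("u1", 4), ("u1", 6), ("u2", 2)]
counts = count_per_window(sample, size=5)
```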

Join Patterns

Stream-Stream Join:
┌─────────────┐     ┌─────────────┐
│   Orders    │ ──► │    Join     │
└─────────────┘     │ (by order_id│
┌─────────────┐     │  in window) │
│  Shipments  │ ──► │             │
└─────────────┘     └─────────────┘

Stream-Table Join (Enrichment):
┌─────────────┐     ┌─────────────┐
│   Events    │ ──► │    Join     │
└─────────────┘     │ (lookup by  │
┌─────────────┐     │  customer)  │
│  Customer   │ ──► │             │
│   Table     │     └─────────────┘
└─────────────┘
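
Both join shapes can be sketched over in-memory data. This is a simplification under stated assumptions: the function names and record layouts are hypothetical, the table is a plain dict, and the stream-stream join runs over finite lists, whereas a streaming engine buffers both sides in state and expires entries as the watermark advances.

```python
def enrich(events, customer_table):
    """Stream-table join: attach the matching customer row to each event."""
    for event in events:
        customer = customer_table.get(event["customer_id"])
        # Left-join behaviour: unmatched events pass through with customer=None.
        yield {**event, "customer": customer}


def window_join(orders, shipments, window):
    """Stream-stream join on order_id where timestamps differ by <= window."""
    orders_by_id = {}
    for order_id, ts in orders:
        orders_by_id.setdefault(order_id, []).append(ts)
    matches = []
    for order_id, ship_ts in shipments:
        for order_ts in orders_by_id.get(order_id, []):
            if abs(ship_ts - order_ts) <= window:
                matches.append((order_id, order_ts, ship_ts))
    return matches


customers = {"c1": {"name": "Ada"}}
events = [{"order_id": 1, "customer_id": "c1"},
          {"order_id": 2, "customer_id": "c9"}]
enriched = list(enrich(events, customers))
joined = window_join([(1, 100), (2, 100)], [(1, 103), (2, 200)], window=10)
```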

Deduplication

Problem: Duplicate events from at-least-once delivery

Solution:
1. Track seen IDs in state (with TTL)
2. If seen, drop
3. If new, process and store ID

State: {event_id: timestamp}
TTL: Based on expected duplicate window
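
The three steps above could look roughly like this. The class name is hypothetical and the state is an in-process dict; expiry here scans all entries on every call, which is fine for a sketch but would be replaced by state TTL support or a timer in a real job.

```python
class Deduplicator:
    """Drops events whose ID was already seen within the TTL."""

    def __init__(self, ttl_seconds: float):
        self.ttl = ttl_seconds
        self.seen = {}  # event_id -> time the ID was first seen

    def is_new(self, event_id: str, now: float) -> bool:
        # Expire IDs older than the TTL so state stays bounded.
        self.seen = {k: t for k, t in self.seen.items() if now - t < self.ttl}
        if event_id in self.seen:
            return False  # duplicate: drop
        self.seen[event_id] = now  # new: remember and process
        return True


dedup = Deduplicator(ttl_seconds=10)
results = [dedup.is_new("a", now=0), dedup.is_new("a", now=5), dedup.is_new("a", now=20)]
```

The third call returns True again because the ID has aged out, which is the trade-off the TTL encodes: a duplicate arriving later than the TTL is processed twice.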

Event Delivery Guarantees

At-Most-Once

May lose events, never duplicates
Commit → Process → (if fail, event lost)

Use when: Loss acceptable, simplicity preferred

At-Least-Once

Never loses, may have duplicates
Process → Commit → (if fail, reprocess)

Use when: No loss acceptable, handle duplicates downstream
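
What separates the two guarantees is whether the offset is committed before or after processing. The toy simulation below (a hypothetical `run` function, one injected crash, in-memory offsets) shows the effect: committing first loses the event that was in flight, while processing first replays it.

```python
def run(events, commit_first, fail_on):
    """Simulate one crash and restart; return the events actually processed.

    commit_first=True  -> at-most-once  (commit offset, then process)
    commit_first=False -> at-least-once (process, then commit offset)
    """
    processed, committed, crashed = [], 0, False
    while committed < len(events):
        event = events[committed]
        try:
            if commit_first:
                committed += 1
                if event == fail_on and not crashed:
                    crashed = True
                    raise RuntimeError("crash after commit, before processing")
                processed.append(event)
            else:
                processed.append(event)
                if event == fail_on and not crashed:
                    crashed = True
                    raise RuntimeError("crash after processing, before commit")
                committed += 1
        except RuntimeError:
            continue  # restart from the last committed offset
    return processed


at_most_once = run(["a", "b", "c"], commit_first=True, fail_on="b")
at_least_once = run(["a", "b", "c"], commit_first=False, fail_on="b")
```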

Exactly-Once

Never loses, never duplicates
Requires:
- Idempotent operations, OR
- Transactional processing

How it works:
1. Read from source transactionally
2. Process and update state
3. Write output and commit together

Flink: Checkpointing + two-phase commit
Kafka Streams: Kafka transactions (processing.guarantee=exactly_once_v2)
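
The "commit together" step is the core of the transactional approach: the input offset and the derived state must become visible atomically. A minimal in-memory sketch, with a hypothetical class and a simulated crash, might look like this; real systems achieve the same effect with checkpoints or broker transactions rather than a dict swap.

```python
class TransactionalProcessor:
    """Keeps the input offset and the derived state in one atomic unit."""

    def __init__(self):
        self.committed = {"offset": 0, "count": 0}

    def process_batch(self, events, crash=False):
        # Work on a private copy; nothing is visible until commit.
        pending = dict(self.committed)
        for _ in events[pending["offset"]:]:
            pending["count"] += 1
            pending["offset"] += 1
        if crash:
            raise RuntimeError("crash before commit")  # pending is discarded
        self.committed = pending  # offset and state change together


proc = TransactionalProcessor()
log = ["e1", "e2", "e3"]
try:
    proc.process_batch(log, crash=True)
except RuntimeError:
    pass
after_crash = dict(proc.committed)
proc.process_batch(log)   # retry after restart
proc.process_batch(log)   # replaying the same log adds nothing
final = dict(proc.committed)
```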

Late Event Handling

Strategies

1. Drop late events
   Simple, may lose data

2. Allow late events (allowed lateness)
   Process if within lateness threshold

3. Side output late events
   Main stream processes on-time
   Side stream handles late separately

4. Reprocess historical
   Batch job fixes late data impact
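
Strategies 1 to 3 differ only in what happens once the watermark has passed a window's end. A simplified routing function, using hypothetical names and integer timestamps, could be sketched as follows; exact boundary rules (inclusive vs exclusive) vary by engine.

```python
def route(event_time, watermark, window_size, allowed_lateness):
    """Classify an event against its tumbling window and the current watermark."""
    window_end = event_time - (event_time % window_size) + window_size
    if watermark < window_end:
        return "on_time"        # window still open
    if watermark < window_end + allowed_lateness:
        return "late_update"    # window fired but state retained: re-emit result
    return "side_output"        # too late: divert to a side stream (or drop)


decisions = [
    route(7, watermark=8, window_size=5, allowed_lateness=3),
    route(7, watermark=11, window_size=5, allowed_lateness=3),
    route(7, watermark=13, window_size=5, allowed_lateness=3),
]
```

Setting `allowed_lateness` to zero collapses this into strategy 1 or 3, depending on whether the "side_output" branch is consumed or ignored.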

Watermark Strategies

Bounded Out-of-Orderness:
watermark = max_event_time - max_lateness

Example:
max_event_time = 10:00:00
max_lateness = 5 seconds
watermark = 09:59:55

Events before 09:59:55 considered complete
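
The formula above can be turned into a small watermark generator. The class name is hypothetical and timestamps are seconds since midnight, so 10:00:00 is 36000. Because the watermark is derived from the maximum event time seen so far, an out-of-order event never moves it backwards.

```python
class BoundedOutOfOrdernessWatermark:
    """watermark = max_event_time - max_lateness, never decreasing."""

    def __init__(self, max_lateness):
        self.max_lateness = max_lateness
        self.max_event_time = None

    def on_event(self, event_time):
        """Observe an event timestamp and return the current watermark."""
        if self.max_event_time is None or event_time > self.max_event_time:
            self.max_event_time = event_time
        return self.max_event_time - self.max_lateness


wm = BoundedOutOfOrdernessWatermark(max_lateness=5)
first = wm.on_event(36000)    # 10:00:00
second = wm.on_event(36010)   # 10:00:10
third = wm.on_event(36003)    # 10:00:03 arrives out of order
```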

Scalability Patterns

Partitioning

Partition by key for parallel processing:

┌─────────────────────────────────────────────────────┐
│ Kafka Topic (3 partitions)                          │
│ ┌─────────────┐ ┌─────────────┐ ┌─────────────────┐│
│ │ Partition 0 │ │ Partition 1 │ │ Partition 2     ││
│ │ user_a, b   │ │ user_c, d   │ │ user_e, f       ││
│ └─────────────┘ └─────────────┘ └─────────────────┘│
└─────────────────────────────────────────────────────┘
         │               │               │
         ▼               ▼               ▼
    ┌─────────┐    ┌─────────┐    ┌─────────┐
    │Worker 0 │    │Worker 1 │    │Worker 2 │
    └─────────┘    └─────────┘    └─────────┘
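
Key-based partitioning comes down to a stable hash of the key modulo the partition count. The sketch below uses CRC32 only because it ships with the standard library; Kafka's default partitioner uses a murmur2 hash, so the partition numbers here will not match a real cluster. The function name is hypothetical.

```python
import zlib


def partition_for(key: str, num_partitions: int) -> int:
    """Map a key to a partition; the same key always lands on the same partition."""
    return zlib.crc32(key.encode("utf-8")) % num_partitions


users = ["user_a", "user_b", "user_c", "user_d", "user_e", "user_f"]
assignment = {u: partition_for(u, 3) for u in users}
```

Because all events for one key go to one partition, per-key ordering and per-key state stay local to a single worker. The cost is that a hot key cannot be spread across workers without changing the key.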

Backpressure

When downstream can't keep up:

1. Buffer (risk: OOM)
2. Drop (risk: data loss)
3. Backpressure (slow down source)

Flink: Backpressure propagates automatically
Kafka: Consumers pull at their own pace, so a slow consumer shows up as growing consumer lag
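
Option 3 can be illustrated with a bounded buffer whose `put` blocks when full, which forces a fast producer down to the consumer's pace instead of growing memory without limit. This is a single-process sketch with a hypothetical `run_pipeline` function and an artificial 1 ms processing delay, not a model of any framework's network stack.

```python
import queue
import threading
import time


def run_pipeline(n_events, buffer_size):
    """Fast producer, slow consumer, bounded buffer between them."""
    buf = queue.Queue(maxsize=buffer_size)  # bounded: put() blocks when full
    consumed = []
    max_depth = 0

    def consumer():
        while True:
            item = buf.get()
            if item is None:      # sentinel: producer is done
                break
            time.sleep(0.001)     # simulate slow downstream work
            consumed.append(item)

    worker = threading.Thread(target=consumer)
    worker.start()
    for i in range(n_events):
        buf.put(i)                # blocks while the buffer is full
        max_depth = max(max_depth, buf.qsize())
    buf.put(None)
    worker.join()
    return consumed, max_depth


consumed, max_depth = run_pipeline(n_events=50, buffer_size=5)
```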

Monitoring Streaming Applications

Key Metrics

Throughput:
- Events per second
- Bytes per second

Latency:
- Processing latency
- End-to-end latency

Health:
- Consumer lag
- Checkpoint duration
- Backpressure rate
- Error rate

Consumer Lag

Lag (per partition) = Latest offset in the partition - Consumer group's committed offset

High lag indicates:
- Processing too slow
- Need more parallelism
- Downstream bottleneck

Monitor: Set alerting thresholds
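
As a sketch of the lag formula and a threshold check, assuming the offsets have already been fetched into dicts keyed by partition (the function name, sample offsets, and threshold are all made up for the example):

```python
def consumer_lag(end_offsets, committed_offsets):
    """Per-partition lag = latest offset - committed consumer offset."""
    return {p: end_offsets[p] - committed_offsets.get(p, 0) for p in end_offsets}


lag = consumer_lag({0: 1500, 1: 900, 2: 400}, {0: 1450, 1: 900})
total_lag = sum(lag.values())

ALERT_THRESHOLD = 100  # hypothetical threshold, tune per workload
lagging = sorted(p for p, value in lag.items() if value > ALERT_THRESHOLD)
```

Tracking lag per partition rather than only the total helps distinguish an overall throughput shortfall from a single hot or stuck partition.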

Best Practices

1. Design for exactly-once when needed
2. Handle late events explicitly
3. Prefer event time over processing time for time-based results
4. Monitor consumer lag closely
5. Plan for state recovery
6. Test with realistic data volumes
7. Implement backpressure handling
8. Keep processing idempotent when possible

Related Skills

message-queues - Messaging patterns
data-architecture - Data platform design
etl-elt-patterns - Data pipeline patterns