Event-driven architecture — event catalog, schema registry, eventual consistency, saga, CQRS, event sourcing. Use when the user asks to "design event-driven system", "build event catalog", "implement CQRS", "design saga patterns", "set up schema registry", "implement event sourcing", or mentions Kafka, RabbitMQ, Pulsar, event bus, dead-letter queue, consumer groups, or event replay.
Event-driven architecture decouples producers from consumers through asynchronous messaging — enabling scalability, resilience, and temporal flexibility. The skill covers event catalog design, broker selection, schema governance, consistency patterns (sagas, CQRS, event sourcing), and the operational practices that keep event systems reliable.
Events are immutable facts, not disposable messages. A published event is part of the system's history. The event catalog is the system of record, the schema registry prevents breaking changes, and eventual consistency is a feature, not a bug.
The user provides a system or platform name as $ARGUMENTS. Parse $1 as the system/platform name used throughout all output artifacts.
Parameters:
{MODO}: piloto-auto (auto-pilot, default) | desatendido (unattended) | supervisado (supervised) | paso-a-paso (step-by-step)
{FORMATO}: markdown (default) | html | dual
{VARIANTE}: ejecutiva (executive, ~40%: S1 catalog + S3 schema registry + S4 consistency) | técnica (technical; full 6 sections, default)

Before generating the event architecture, detect the codebase context:
!find . -name "*.yaml" -o -name "*.json" -o -name "*.avro" -o -name "*.proto" -o -name "*event*" -o -name "*kafka*" -o -name "*rabbit*" | head -30
Use detected event definitions, broker configurations, and schema files to tailor catalog structure, pattern recommendations, and operational guidance.
If reference materials exist, load them:
Read ${CLAUDE_SKILL_DIR}/references/event-patterns.md
Establish naming conventions, event types, and a discoverable catalog of all events.
Event naming: <Domain>.<Entity>.<Action> (e.g., Order.Payment.Completed)
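The naming convention can be enforced mechanically. A minimal sketch (the helper name and regex are illustrative, not part of any library):

```python
import re

# Hypothetical helper: validates the <Domain>.<Entity>.<Action> convention
# as three PascalCase segments separated by dots.
EVENT_NAME = re.compile(r"^([A-Z][A-Za-z0-9]*)\.([A-Z][A-Za-z0-9]*)\.([A-Z][A-Za-z0-9]*)$")

def parse_event_name(name: str) -> dict:
    m = EVENT_NAME.match(name)
    if not m:
        raise ValueError(f"event name {name!r} does not match Domain.Entity.Action")
    domain, entity, action = m.groups()
    return {"domain": domain, "entity": entity, "action": action}

print(parse_event_name("Order.Payment.Completed"))
# {'domain': 'Order', 'entity': 'Payment', 'action': 'Completed'}
```

Running such a check in CI on every new event definition keeps the catalog consistent as it grows.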
Event type classification:
CloudEvents Standard (CNCF) — vendor-neutral event envelope for interoperability:
Required attributes: id, source, specversion (1.0), type. Optional: time, datacontenttype, dataschema, subject.

Event Granularity Decision Matrix:
| Type | Payload | Coupling | Latency | When to use |
|---|---|---|---|---|
| Notification (thin) | Signal only: { orderId } | Low (consumer fetches via API) | Higher (API callback) | Default starting point; consumer has API access |
| State Transfer (fat) | Full state: { orderId, items[], total } | Higher (schema dependency) | Lower (self-contained) | Consumer needs embedded data; API callback adds unacceptable latency |
| Delta | Changed fields only: { orderId, status: "shipped" } | Medium | Lowest | Consumer maintains local state; bandwidth-constrained |
Rule of thumb: start thin, fatten only when consumers demonstrably need embedded data.
Key decisions:
Select and configure the message broker for reliability, throughput, and operational simplicity.
Broker Selection Matrix:
| Criterion | Apache Kafka | RabbitMQ | Apache Pulsar | Cloud-native (SNS+SQS, Event Grid, Pub/Sub) |
|---|---|---|---|---|
| Throughput | Millions msg/sec | Tens of thousands | Millions msg/sec | Varies by service |
| Replay | Native (log-based) | Not built-in | Native (tiered storage) | Limited |
| Latency | Low-medium (batching) | Sub-millisecond | Low | Medium |
| Ordering | Per-partition | Per-queue | Per-partition | Varies |
| Multi-tenancy | Topic-level | Vhost-level | Native | Native |
| Ops complexity | High (ZK/KRaft) | Low-medium | Medium-high | Managed |
| Best for | High-volume, event sourcing | Task queues, RPC, simple routing | Multi-tenant, geo-replicated | Serverless, low ops budget |
Critical Kafka Configurations for Reliability:
- acks=all — wait for all in-sync replicas to acknowledge (mandatory for durability)
- min.insync.replicas=2 — require at least 2 replicas in sync before accepting writes
- enable.idempotence=true — prevent duplicate messages from producer retries
- max.in.flight.requests.per.connection=5 — safe with idempotence enabled

Consumer Group Strategies:
- Use cooperative rebalancing (partition.assignment.strategy=cooperative-sticky) to minimize partition shuffling during scaling
- Use a dedicated consumer group with auto.offset.reset=earliest for isolated replay

Partitioning: By entity ID (ordering guarantee), by tenant (isolation), round-robin (max throughput)
Retention: Time-based (7-30 days typical) or log compaction for latest-state topics
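The client-side settings above can be expressed as config dicts in the shape most Kafka clients (e.g. confluent-kafka) accept; note that `min.insync.replicas` is a broker/topic setting, not a client one. Broker address and group id here are illustrative placeholders:

```python
# Producer settings from the reliability checklist above.
producer_config = {
    "bootstrap.servers": "kafka:9092",            # placeholder broker address
    "acks": "all",                                # wait for all in-sync replicas
    "enable.idempotence": True,                   # dedupe producer retries
    "max.in.flight.requests.per.connection": 5,   # safe with idempotence on
}

# Consumer settings for cooperative rebalancing and isolated replay.
consumer_config = {
    "bootstrap.servers": "kafka:9092",
    "group.id": "order-projection",               # hypothetical consumer group
    "partition.assignment.strategy": "cooperative-sticky",
    "auto.offset.reset": "earliest",              # replay from start when no committed offset
    "enable.auto.commit": False,                  # commit only after successful processing
}
```

Disabling auto-commit and committing after processing gives at-least-once delivery, which is why consumer idempotency (Section 4) is non-negotiable.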
Govern schema evolution to prevent producer-consumer contract breaks.
Platforms: Confluent Schema Registry, AWS Glue Schema Registry, Apicurio
Formats: Avro (compact, best Kafka integration), Protobuf (strong typing, gRPC bridge), JSON Schema (readable, flexible)
Compatibility Modes:
| Mode | Rule | Safe changes | Use when |
|---|---|---|---|
| Backward (recommended default) | New schema reads old data | Add optional fields, remove fields with defaults | Consumers upgrade before producers |
| Forward | Old schema reads new data | Remove optional fields, add fields with defaults | Producers upgrade before consumers |
| Full | Both directions | Only add/remove optional fields with defaults | Maximum safety, most restrictive |
| None | No checks | Anything | Never in production |
CI/CD integration: Block deployments that break schema compatibility. Run schema validation on every PR that modifies event definitions.
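The core of the Backward rule can be sketched in a few lines. This is a deliberately simplified check for Avro-style record schemas (real registries like Confluent or Apicurio implement the full resolution rules): the new schema can read old data only if every field it adds has a default.

```python
# Simplified sketch of BACKWARD compatibility for Avro-style record schemas.
def is_backward_compatible(old_schema: dict, new_schema: dict) -> bool:
    old_fields = {f["name"] for f in old_schema["fields"]}
    for field in new_schema["fields"]:
        if field["name"] not in old_fields and "default" not in field:
            return False  # new required field: old events cannot be deserialized
    return True

v1 = {"type": "record", "name": "OrderCreated",
      "fields": [{"name": "orderId", "type": "string"}]}
v2_ok = {"type": "record", "name": "OrderCreated",
         "fields": [{"name": "orderId", "type": "string"},
                    {"name": "channel", "type": "string", "default": "web"}]}
v2_bad = {"type": "record", "name": "OrderCreated",
          "fields": [{"name": "orderId", "type": "string"},
                     {"name": "channel", "type": "string"}]}

assert is_backward_compatible(v1, v2_ok)       # optional field with default: safe
assert not is_backward_compatible(v1, v2_bad)  # required field: would break consumers
```

A check like this in the PR pipeline catches the most common break (a new required field) before it reaches the registry.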
Manage distributed consistency without distributed transactions.
Saga Pattern Comparison:
| Aspect | Orchestration | Choreography |
|---|---|---|
| Coordination | Central orchestrator | Decentralized, event-driven |
| Visibility | Clear flow, centralized state | Emergent, hard to trace |
| Coupling | Orchestrator depends on all services | Services loosely coupled |
| Error handling | Centralized compensation logic | Distributed, each service handles own |
| Best for | Complex multi-step (4+ services), financial | Simple 2-3 step workflows |
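The orchestration column boils down to one loop: run steps in order, and on failure run the compensations of the completed steps in reverse. A minimal sketch (step names and actions are illustrative, not a real saga framework):

```python
def run_saga(steps):
    """steps: list of (name, action, compensate) callables."""
    completed = []
    for name, action, compensate in steps:
        try:
            action()
            completed.append((name, compensate))
        except Exception:
            # Compensate in reverse order of completion.
            for _done_name, undo in reversed(completed):
                undo()
            return f"rolled back after {name} failed"
    return "committed"

log = []
def reserve(): log.append("reserved")
def release(): log.append("released")
def charge(): raise RuntimeError("card declined")

result = run_saga([
    ("reserve-stock", reserve, release),
    ("charge-payment", charge, lambda: None),
])
print(result, log)  # rolled back after charge-payment failed ['reserved', 'released']
```

A real orchestrator additionally persists saga state so a crashed coordinator can resume or compensate after restart.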
Outbox Pattern for Reliable Publishing:
- Write the business state change and the event to an outbox table in one DB transaction (atomicity guaranteed)
- A separate relay reads unpublished outbox rows, publishes them to the broker, and marks them as published

Outbox table schema: id, aggregate_type, aggregate_id, event_type, payload, created_at, published_at
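The transactional write is the whole trick: the event row exists if and only if the state change does. A sketch using SQLite for brevity (table shapes follow the schema above; names like `place_order` are illustrative):

```python
import json
import sqlite3
import uuid

db = sqlite3.connect(":memory:")
db.executescript("""
CREATE TABLE orders (id TEXT PRIMARY KEY, status TEXT);
CREATE TABLE outbox (id TEXT PRIMARY KEY, aggregate_type TEXT, aggregate_id TEXT,
                     event_type TEXT, payload TEXT,
                     created_at TEXT DEFAULT CURRENT_TIMESTAMP, published_at TEXT);
""")

def place_order(order_id: str) -> None:
    with db:  # one transaction: both inserts commit, or neither does
        db.execute("INSERT INTO orders VALUES (?, 'PLACED')", (order_id,))
        db.execute(
            "INSERT INTO outbox (id, aggregate_type, aggregate_id, event_type, payload) "
            "VALUES (?, 'Order', ?, 'Order.Order.Placed', ?)",
            (str(uuid.uuid4()), order_id, json.dumps({"orderId": order_id})),
        )

place_order("o-1")
# The relay (polling or CDC) would pick up rows where published_at IS NULL.
pending = db.execute("SELECT event_type FROM outbox WHERE published_at IS NULL").fetchall()
print(pending)
```

If publishing to the broker fails, the row simply stays unpublished and the relay retries; no event is ever lost between commit and publish.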
Relay Options:
| Method | Latency | Complexity | When to use |
|---|---|---|---|
| Polling | Higher (poll interval) | Low (simple query) | Small-medium volume, ops simplicity |
| CDC (Debezium) | Near-real-time | Higher (Kafka Connect, connector config) | High volume, low-latency requirement |
Debezium reads the database WAL/binlog and streams outbox rows to Kafka. Use the outbox.event.router SMT to transform CDC records into clean business events.
Inbox pattern: Consumer writes received event to inbox table, deduplicates by event ID, processes idempotently.
Idempotency: Every consumer must safely process the same event twice. Use idempotency keys stored in a deduplication table with TTL.
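The inbox/idempotency combination can be sketched in a few lines; here an in-memory set stands in for the deduplication table, and the event shape is illustrative:

```python
# Deduplicate by event id before applying side effects.
processed_ids: set[str] = set()
balance = {"account": 0}

def handle_payment_event(event: dict) -> bool:
    """Returns True if the event was applied, False if it was a duplicate."""
    if event["id"] in processed_ids:
        return False                      # duplicate delivery: skip side effects
    balance["account"] += event["data"]["amount"]
    processed_ids.add(event["id"])        # record only after successful processing
    return True

evt = {"id": "e-42", "type": "Order.Payment.Completed", "data": {"amount": 100}}
assert handle_payment_event(evt) is True
assert handle_payment_event(evt) is False   # redelivery is a no-op
assert balance["account"] == 100            # applied exactly once
```

In production the dedup check and the state change must share one transaction (as in the outbox pattern, mirrored on the consumer side), otherwise a crash between them reintroduces duplicates.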
Separate read and write models; optionally store state as a sequence of events.
CQRS:
Event Sourcing:
Decision criteria:
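Event sourcing's core mechanic is that current state is a left fold over the event stream. A minimal sketch (event types follow the catalog convention; the account domain and amounts are illustrative):

```python
from functools import reduce

def apply(state: dict, event: dict) -> dict:
    etype, data = event["type"], event["data"]
    if etype == "Account.Account.Opened":
        return {"balance": 0}
    if etype == "Account.Deposit.Made":
        return {**state, "balance": state["balance"] + data["amount"]}
    if etype == "Account.Withdrawal.Made":
        return {**state, "balance": state["balance"] - data["amount"]}
    return state  # ignore unknown events (forward compatibility)

stream = [
    {"type": "Account.Account.Opened", "data": {}},
    {"type": "Account.Deposit.Made", "data": {"amount": 150}},
    {"type": "Account.Withdrawal.Made", "data": {"amount": 40}},
]
state = reduce(apply, stream, {})
print(state)  # {'balance': 110}
```

Replaying the same fold up to an earlier offset gives temporal queries for free; snapshots cap the replay cost as the stream grows.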
Ensure event systems are reliable, observable, and recoverable in production.
Dead-Letter Topic (DLT) Management:
Poison Pill Detection:
Consumer Lag Monitoring:
Event Replay:
Observability: Distributed tracing with correlationId through entire event chain. Metrics: producer rate, consumer rate, lag, DLT depth, processing duration histograms.
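The DLT and poison-pill practices above reduce to one consumer-side rule: bound the retries, then park the event with context instead of blocking the stream. A sketch (handler and event shapes are illustrative; a real implementation would publish to a dead-letter topic rather than a list):

```python
def consume(events, handler, max_retries: int = 3):
    """Process events, routing repeated failures to a dead-letter list."""
    dead_letter = []
    for event in events:
        for attempt in range(1, max_retries + 1):
            try:
                handler(event)
                break
            except Exception as exc:
                if attempt == max_retries:
                    # Poison pill: park it with error context for later triage/replay.
                    dead_letter.append({"event": event, "error": repr(exc)})
    return dead_letter

def handler(event):
    if event.get("bad"):
        raise ValueError("unparseable payload")

dlt = consume([{"id": 1}, {"id": 2, "bad": True}, {"id": 3}], handler)
print([e["event"]["id"] for e in dlt])  # [2]
```

Note that events 1 and 3 still process: the bad event costs retries but never halts the partition, and DLT depth becomes the alerting metric.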
| Decision | Enables | Constrains | When to Use |
|---|---|---|---|
| Kafka | High throughput, replay, persistence | Ops complexity, partition management | High-volume, event sourcing, log-based |
| RabbitMQ | Flexible routing, low latency, simpler ops | No replay, limited persistence | Task queues, RPC, moderate volume |
| Orchestrated Saga | Clear flow, centralized error handling | Coordinator coupling | Complex multi-step, financial transactions |
| Choreographed Saga | Loose coupling, independent deployment | Hard to trace, debug | Simple 2-3 service workflows |
| Event Sourcing | Full audit, temporal queries, replay | Complexity, storage growth, schema evolution | Financial, compliance, audit-critical domains |
| CQRS without ES | Read/write optimization, simpler | Projection sync, eventual consistency | Reporting-heavy, different read/write patterns |
| Outbox Pattern | Reliable publishing, transactional guarantee | Additional table, relay infrastructure | Any event system needing reliability |
| Case | Handling Strategy |
|---|---|
| Migrating synchronous communication to event-driven in a production system | Strangler fig pattern. Identify the boundaries with the highest async value first (long-running processes, fan-out). Dual-write sync+async during the transition. Validate with shadow traffic before cutover. |
| Schema evolution with consumers on different versions (N, N-1, N-2) | Schema registry with mandatory backward compatibility. Deploy consumers before producers when adding required fields. Upcasting to transform legacy events during replay. At most 2 versions in parallel. |
| High-volume event storm (>100K msgs/sec burst) that overwhelms consumers | Backpressure via consumer throttling. Auto-scale instances when consumer lag exceeds 1000 msgs. Circuit breaker for poison pills. DLT keeps one bad event from blocking the stream. Pre-provision for known peaks. |
| Multi-region with cross-region replication latency >200ms | Define global vs. regional events. Global events: async replication with CRDTs or last-writer-wins. Regional events: local processing with no cross-region dependency. Explicit conflict resolution. |
| Decision | Rejected Alternative | Rationale |
|---|---|---|
| Outbox pattern with CDC (Debezium) over direct publishing to the broker | Publishing to the broker inside the business transaction | Direct publishing is not atomic: if the broker fails post-commit, the event is lost. Outbox + CDC guarantees exactly-once semantics at the business level. The added complexity is justified by the reliability gain. |
| Backward compatibility as the schema registry default over full compatibility | Full compatibility (more restrictive) | Full compatibility blocks valid changes such as removing optional fields. Backward allows controlled evolution while consumers stay compatible. Reserve full for critical domains (payments, compliance). |
| Saga orchestration for workflows spanning >4 services over choreography | Decentralized choreography | Choreography in complex workflows produces emergent flows that are impossible to trace. Orchestration centralizes state visibility and simplifies error handling and compensation. Use choreography only for simple 2-3 service workflows. |
graph TD
subgraph Core
EVT[event-architecture]
end
subgraph Inputs
DOM[Domain Events from Business] --> EVT
INT[Integration Requirements] --> EVT
VOL[Volume & Latency SLAs] --> EVT
end
subgraph Outputs
EVT --> CAT[Event Catalog & Taxonomy]
EVT --> BRK[Broker Architecture & Config]
EVT --> SCH[Schema Registry Design]
EVT --> SAG[Saga & Consistency Patterns]
EVT --> OPS[Operational Runbook]
end
subgraph Related Skills
EVT -.-> SA[software-architecture]
EVT -.-> API[api-architecture]
EVT -.-> DE[data-engineering]
EVT -.-> OBS[observability]
end
MD format (default):
# Event Architecture: {system_name}
## S1: Event Catalog & Taxonomy
- Event listing (Domain.Entity.Action)
- CloudEvents envelope spec
- Granularity decisions per event
## S2: Broker Architecture
- Selection matrix, configs, partitioning
## S3-S6: [remaining sections]
## Annexes: AsyncAPI specs, DLT handling procedures, consumer lag dashboards
HTML format (secondary):
A-01_Event_Architecture_{cliente}_{WIP}.html

DOCX format (on demand):
A-01_Event_Architecture_{cliente}_{WIP}.docx

XLSX format (on demand):
{fase}_{entregable}_{cliente}_{WIP}.xlsx

PPTX format (on demand):
{fase}_{entregable}_{cliente}_{WIP}.pptx

| Dimension | Weight | Criterion | Minimum Threshold |
|---|---|---|---|
| Trigger Accuracy | 10% | The skill activates correctly on mentions of event-driven, Kafka, CQRS, saga, schema registry, event sourcing | 7/10 |
| Completeness | 25% | The 6 sections cover catalog, broker, schema, consistency, CQRS/ES, and operations | 7/10 |
| Clarity | 20% | Every cataloged event has a schema, naming, and granularity decision. Broker configs with concrete values. Sagas with explicit flows. | 7/10 |
| Robustness | 20% | Migration, schema-evolution, and event-storm edge cases covered. DLT with categorization and replay. Poison pill detection. | 7/10 |
| Efficiency | 10% | Output proportional to the context. No documenting of patterns irrelevant to the system. Catalog scoped to the domain. | 7/10 |
| Value Density | 15% | Production-ready broker configs. Generable AsyncAPI specs. Operational runbook with concrete thresholds. | 7/10 |

Global minimum threshold: 7/10. Deliverables below it require rework before delivery.
Migrating from Synchronous to Event-Driven: Strangler fig pattern. Identify highest-value async boundaries first (long-running processes, fan-out notifications). Run sync and async in parallel during transition.
Schema Evolution in Production: Consumers at different versions. Backward-compatible changes only. Deploy consumers before producers when adding required fields. Schema registry enforces compatibility.
Event Ordering Across Partitions: Global ordering is expensive. Most systems need per-entity ordering (same partition key). If cross-entity ordering matters, single partition (throughput trade-off) or timestamp-based reconciliation.
High-Volume Event Storms: Burst traffic overwhelms consumers. Use backpressure, consumer auto-scaling, circuit breakers. DLT prevents one bad event from blocking the stream.
Multi-Region: Cross-region replication adds latency. Define global vs. regional events. Use CRDTs or last-writer-wins for cross-region consistency.
Before finalizing delivery, verify:
- Event names follow the convention (Domain.Entity.Action)

| Format | Default | Description |
|---|---|---|
markdown | ✅ | Rich Markdown + Mermaid diagrams. Token-efficient. |
html | On demand | Branded HTML (Design System). Visual impact. |
dual | On demand | Both formats. |
Default output is Markdown with embedded Mermaid diagrams. HTML generation requires explicit {FORMATO}=html parameter.
Primary: A-01_Event_Architecture.html — Executive summary, event catalog, broker architecture, schema registry design, consistency patterns, CQRS/event sourcing design, operational runbook.
Secondary: AsyncAPI specifications, event schema definitions, saga flow diagrams, DLT handling procedures, consumer lag dashboard configuration.
Author: Javier Montaño | Last updated: March 12, 2026