From tdder
Guides tradeoffs between command/event messaging, request-reply patterns, and failure handling for cross-boundary communication.
npx claudepluginhub t1/tdder --plugin tdder
Slash command: /tdder:integration-architecture
How components communicate across boundaries: message style, data flow direction, and failure handling.
This skill complements Unfolding Architecture, which covers the decision of whether to use messaging at all (Communication dimension, Levels 0→1→2). Once you've decided to communicate across boundaries, this skill guides how.
Commands couple sender to receiver: the sender knows who to call and what to ask for. Events invert the dependency: receivers must know the sender's domain to interpret what happened. In more complex messaging patterns, where several interdependent messages are exchanged, the dependency is even stricter: a sender of commands must know the state of the receiver, while with events the receiver must know the state of the sender. Neither is strictly better.
Prefer Commands when:
Prefer Events when:
Key insight: Events don't remove coupling, they move it. The receiver now needs to understand the sender's context to react. This is worthwhile when senders change less often than the set of receivers.
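The difference in dependency direction can be made concrete with a minimal in-memory sketch. All names here (CapturePayment, OrderCompleted, EventBus) are hypothetical and chosen for illustration; a real system would put a broker where EventBus stands.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class CapturePayment:
    """Command: imperative name, addressed to one known receiver."""
    order_id: str
    amount_cents: int


@dataclass(frozen=True)
class OrderCompleted:
    """Event: past-tense name, states a fact in the sender's domain."""
    order_id: str


class PaymentService:
    def handle(self, command: CapturePayment) -> bool:
        # The sender depends on this receiver and gets a confirmation back.
        return command.amount_cents > 0


class EventBus:
    """Hypothetical in-memory stand-in for a broker."""

    def __init__(self) -> None:
        self._subscribers: list[Callable[[OrderCompleted], None]] = []

    def subscribe(self, handler: Callable[[OrderCompleted], None]) -> None:
        self._subscribers.append(handler)

    def publish(self, event: OrderCompleted) -> None:
        # The sender does not know who listens and gets nothing back.
        for handler in self._subscribers:
            handler(event)


# Command: the sender names the receiver and uses its answer.
confirmed = PaymentService().handle(CapturePayment("o-1", 1999))

# Event: the receiver must understand what "OrderCompleted" means.
shipped: list[str] = []
bus = EventBus()
bus.subscribe(lambda event: shipped.append(event.order_id))
bus.publish(OrderCompleted("o-1"))
```

Adding a second subscriber requires no change to the publisher, which is the case where moving the coupling to the receiver pays off.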
When introducing cross-boundary communication, use AskUserQuestion with a code-based recommendation. Use description to explain why based on the code, e.g., "The method name orderCompleted and the lack of a return value suggest a notification, not a directive" or "There is exactly one receiver (PaymentService) that needs to confirm the action."
A hybrid pattern: the sender sends a command and waits for a response, but communication is asynchronous (via a broker with a reply queue, not a direct method call). Useful when you need the decoupling benefits of messaging but the sender still requires a result.
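A minimal sketch of request-reply over queues, assuming Python's standard library queue and threading modules stand in for a broker. The message fields correlation_id and reply_to are illustrative names, not a specific broker's API.

```python
import queue
import threading
import uuid

request_queue = queue.Queue()  # stands in for the broker's request queue


def responder() -> None:
    """Receiver: consume one request and reply on the queue it names."""
    request = request_queue.get()
    request["reply_to"].put(
        {
            "correlation_id": request["correlation_id"],
            "result": request["payload"].upper(),
        }
    )


def request_reply(payload: str, timeout_s: float = 1.0) -> str:
    """Sender: publish a request, then wait for the matching reply."""
    reply_queue = queue.Queue()
    correlation_id = str(uuid.uuid4())
    request_queue.put(
        {"correlation_id": correlation_id, "reply_to": reply_queue, "payload": payload}
    )
    # queue.Empty is raised if no reply arrives in time; the sender has to
    # decide what a timeout means (retry, fail, or compensate).
    reply = reply_queue.get(timeout=timeout_s)
    # With a shared reply queue this check is what pairs replies to requests.
    if reply["correlation_id"] != correlation_id:
        raise RuntimeError("reply does not match the request")
    return reply["result"]
```

Usage: start the receiver with `threading.Thread(target=responder, daemon=True).start()` and call `request_reply("ping")`. The sender blocks only on its reply queue, so the receiver can be restarted or replaced without the sender holding a direct reference to it.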
Prefer Request-Reply when:
Prefer direct commands when:
Who initiates data flow.
Prefer Pull when:
Prefer Push when:
Key insight: Push shifts backpressure responsibility to the consumer. Pull is simpler but can miss real-time needs or waste resources on empty polls.
When a producer pushes faster than a consumer can process, work piles up. Backpressure is how the consumer signals the producer to slow down (or how the system absorbs the mismatch).
Common strategies:
Choose a strategy before going to production with push-based messaging. Unbounded queues are not a strategy — they defer the problem until memory runs out.
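One way to avoid an unbounded queue is a bounded buffer that rejects work when it is full. The sketch below uses Python's standard queue.Queue with maxsize; the capacity of 2 and the function name try_publish are assumptions for illustration.

```python
import queue

buffer = queue.Queue(maxsize=2)  # capacity chosen only for the example


def try_publish(message: str) -> bool:
    """Return False instead of growing the queue when the consumer lags."""
    try:
        buffer.put_nowait(message)
        return True
    except queue.Full:
        # The producer sees the rejection and can slow down, drop, or retry later.
        return False


# Three sends against a buffer of two: the third one is rejected.
results = [try_publish(f"m{i}") for i in range(3)]
```

Replacing `put_nowait` with a blocking `put(message, timeout=...)` turns rejection into blocking, which slows the producer down instead of making it handle the refusal.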
How communication handles failures. This dimension progresses under pressure, following the unfolding principle: start at Level 0, unfold only when concrete problems force the change.
| Level | Style | Description |
|---|---|---|
| 0 | Fire and forget | Send and assume success. No retries. Simple, appropriate when loss is acceptable or communication is in-process. |
| 1 | Retry + idempotency | Retry on failure; receivers are idempotent so duplicates are harmless. At-least-once delivery. Techniques: deduplication (message IDs), upserts, deterministic operations. |
| 2 | Transactional guarantees | Strongest guarantees, highest complexity. See techniques below. |
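Level 1 from the table can be sketched as a retrying sender paired with a receiver that deduplicates by message ID. This is a simplified in-memory illustration: send_with_retry and IdempotentReceiver are hypothetical names, and a real receiver would persist the set of seen IDs.

```python
import time
from typing import Callable


def send_with_retry(
    send: Callable[[dict], None],
    message: dict,
    attempts: int = 3,
    base_delay_s: float = 0.0,
) -> None:
    """At-least-once: retry until the send succeeds or attempts run out."""
    for attempt in range(attempts):
        try:
            send(message)
            return
        except ConnectionError:
            if attempt == attempts - 1:
                raise
            time.sleep(base_delay_s * 2**attempt)  # exponential backoff


class IdempotentReceiver:
    """Deduplicates by message ID so that redelivery is harmless."""

    def __init__(self) -> None:
        self._seen: set[str] = set()  # in memory here; persist it in practice
        self.balance = 0

    def handle(self, message: dict) -> None:
        if message["id"] in self._seen:
            return  # duplicate: the effect was already applied
        self._seen.add(message["id"])
        self.balance += message["amount"]
```

The typical duplicate arises when the receiver processes a message but the acknowledgement is lost: the sender retries, and only the ID check keeps the amount from being applied twice.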
When choosing a reliability level, use AskUserQuestion with a code-based recommendation. Use description to explain why based on the code, e.g., "This is an in-process call; failure means process crash anyway, so fire-and-forget is appropriate" or "This payment flow has business consequences on message loss; retry + idempotency is the minimum."
Unfold to Level 1 when:
Do NOT unfold to Level 1 when:
Unfold to Level 2 when:
Do NOT unfold to Level 2 when:
Transactional outbox: Instead of sending a message directly (which can fail independently of the local transaction), write the message to an outbox table in the same database transaction as the business data change. A separate process reads the outbox and publishes the messages. This guarantees that the local state change and the message are atomic — either both happen or neither does.
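A minimal sketch of the outbox idea, assuming SQLite via Python's standard sqlite3 module. The table layout, the published flag, and the function names are illustrative assumptions, and the relay runs in-process here rather than as a separate process.

```python
import json
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE orders (id TEXT PRIMARY KEY, status TEXT)")
db.execute(
    "CREATE TABLE outbox (seq INTEGER PRIMARY KEY AUTOINCREMENT,"
    " payload TEXT NOT NULL, published INTEGER NOT NULL DEFAULT 0)"
)


def complete_order(order_id: str) -> None:
    """Write the business change and the message in one transaction."""
    with db:  # commits on success, rolls back if any statement raises
        db.execute("INSERT INTO orders VALUES (?, 'completed')", (order_id,))
        db.execute(
            "INSERT INTO outbox (payload) VALUES (?)",
            (json.dumps({"type": "OrderCompleted", "order_id": order_id}),),
        )


def relay(publish) -> int:
    """Publish pending outbox rows in order, then mark them as published."""
    rows = db.execute(
        "SELECT seq, payload FROM outbox WHERE published = 0 ORDER BY seq"
    ).fetchall()
    for seq, payload in rows:
        publish(json.loads(payload))  # re-sent if we crash before the update
        with db:
            db.execute("UPDATE outbox SET published = 1 WHERE seq = ?", (seq,))
    return len(rows)
```

Note that the relay publishes before marking a row, so a crash between the two steps leads to a repeated publish. The outbox makes the write atomic, but consumers still need to tolerate duplicates.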
Saga: A sequence of local transactions across multiple services, each publishing an event or command that triggers the next step. If a step fails, compensating transactions undo the preceding steps. Sagas trade atomicity for availability — the system is eventually consistent, and you must design compensating actions for every step that can fail.
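The compensation logic can be sketched as a list of (action, compensation) pairs. This is an in-process simplification with hypothetical step names; in a real saga each step is a local transaction in a separate service, triggered by a message.

```python
from typing import Callable

Step = tuple[Callable[[], None], Callable[[], None]]  # (action, compensation)


def run_saga(steps: list[Step]) -> bool:
    """Run steps in order; on failure, compensate completed steps in reverse."""
    completed: list[Callable[[], None]] = []
    for action, compensate in steps:
        try:
            action()
        except Exception:
            for undo in reversed(completed):
                undo()
            return False
        completed.append(compensate)
    return True


log: list[str] = []


def fail_shipping() -> None:
    raise RuntimeError("no carrier available")


ok = run_saga(
    [
        (lambda: log.append("reserve stock"), lambda: log.append("release stock")),
        (lambda: log.append("charge card"), lambda: log.append("refund card")),
        (fail_shipping, lambda: log.append("cancel shipment")),
    ]
)
# log is now: reserve stock, charge card, refund card, release stock
```

The failed step itself is not compensated, only the steps that completed before it. The sketch also assumes compensations never fail, which a production design cannot take for granted.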
Integration code fails in ways that unit tests for business logic never exercise: brokers restart, messages arrive out of order, schemas drift between producer and consumer, compensation logic triggers under unexpected conditions. The scopes below are independent — pick the ones that match the risks of a given integration rather than treating them as a progression.
Note that it may be possible to run the same tests at various scopes, with only the test drivers and fixtures running in different configurations. This can reduce the cost of maintaining tests, though it may also add complexity.
Test doubles (fakes, mocks, stubs) are not tied to a single scope. A unit test may fake a broker client; a system test may fake an external payment provider while using a real broker. Choose the double that isolates the failure you want to test.
Test message handling logic in isolation: serialization/deserialization, idempotency checks, compensation logic, routing decisions. All infrastructure is faked or mocked.
Good at catching: Logic errors in message handlers, incorrect deduplication, broken serialization for edge-case payloads, flawed compensation logic.
Example: A handler receives a duplicate message ID — does it skip processing? A compensation function receives a partial state — does it undo the right steps?
A single service running against technically real but local infrastructure (e.g., Testcontainers for broker and database), or against fakes or mocks. The service processes messages end-to-end through its own stack, but no real remote service is involved.
Good at catching:
Producer and consumer agree on a message schema, verified independently of deployment. Each side runs its own tests against the shared contract. Neither side needs the other to be running.
Good at catching: Schema drift, breaking changes in field names or types, missing required fields, incompatible serialization formats. Especially valuable when different teams own producer and consumer.
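A contract check can be as small as a shared field-to-type mapping that both sides test against. The contract, the field names, and the helper functions below are hypothetical; dedicated tools (schema registries, consumer-driven contract frameworks) cover far more than this sketch.

```python
ORDER_COMPLETED_CONTRACT = {"order_id": str, "total_cents": int}  # hypothetical


def violations(message: dict, contract: dict) -> list[str]:
    """List missing or wrongly typed fields; extra fields are tolerated."""
    problems = []
    for field, expected_type in contract.items():
        if field not in message:
            problems.append(f"missing field: {field}")
        elif not isinstance(message[field], expected_type):
            problems.append(f"wrong type for {field}")
    return problems


def build_order_completed(order_id: str, total_cents: int) -> dict:
    """Producer side: its test asserts this output has no violations."""
    return {"order_id": order_id, "total_cents": total_cents, "currency": "EUR"}


def consume(message: dict) -> str:
    """Consumer side: its test feeds a minimal contract-conforming sample."""
    return f"{message['order_id']}:{message['total_cents']}"
```

The producer test runs `violations(build_order_completed(...), ORDER_COMPLETED_CONTRACT)` and expects an empty list; the consumer test calls `consume` with only the contract fields. Neither test needs the other service to be running.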
Multiple services running together — some real, some faked. Exercises end-to-end flows across service boundaries.
Good at catching:
Start with unit tests for any non-trivial message handling logic — they're fast and cheap. Add integration scope when the service talks to a real broker or database, because those interactions are where most production failures hide. Add contract scope when producer and consumer are owned by different teams or deployed independently. Add system scope for critical flows where cross-service failure behavior matters (sagas, compensation chains, consistency guarantees).
You don't need all four for every integration. A fire-and-forget event with a single consumer may only need unit + integration. A saga across three services owned by two teams likely needs all four.
Making everything an event. When every action publishes an event and every component reacts to events, the system becomes impossible to trace. Commands exist for a reason — use them when the sender knows the receiver and needs confirmation.
Sending commands to every service for every operation, replicating the coupling of a monolith across the network. You get the worst of both worlds: distributed complexity with monolithic coupling. If every service change requires coordinated deployments, messaging has not helped.
Using Level 2 reliability (outbox, sagas) everywhere "just in case." Transactional guarantees are expensive in complexity and operational burden. Most communication tolerates at-least-once delivery (Level 1) or even fire-and-forget (Level 0). Match the reliability level to the actual business consequences of failure.
Designing components in isolation and deferring integration to the end. Each team builds their service with assumed message formats, assumed delivery semantics, and assumed ordering — then discovers at integration time that the assumptions don't match. The fix is expensive because contracts are baked into the internals.
Instead, integrate continuously from the start. Define message contracts early, even if the implementation behind them is trivial. Run integration tests against real (or realistic) brokers as soon as two components exist. Let integration pain surface while the design is still cheap to change.
The same applies to integrating with a real database and other infrastructure.
Using unbounded queues and assuming the consumer will keep up. This works until it doesn't, and then you lose messages or crash. Design for the producer being faster than the consumer from day one.