distributed-systems
Distributed systems patterns for locking, resilience, idempotency, and rate limiting. Use when implementing distributed locks, circuit breakers, retry policies, idempotency keys, token bucket rate limiters, or fault tolerance patterns.
Distributed Systems Patterns
Comprehensive patterns for building reliable distributed systems. Each category has individual rule files in rules/ loaded on-demand.
Quick Reference
| Category | Rules | Impact | When to Use |
|---|---|---|---|
| Distributed Locks | 3 | CRITICAL | Redis/Redlock locks, PostgreSQL advisory locks, fencing tokens |
| Resilience | 3 | CRITICAL | Circuit breakers, retry with backoff, bulkhead isolation |
| Idempotency | 3 | HIGH | Idempotency keys, request dedup, database-backed idempotency |
| Rate Limiting | 3 | HIGH | Token bucket, sliding window, distributed rate limits |
| Edge Computing | 2 | HIGH | Edge workers, V8 isolates, CDN caching, geo-routing |
| Event-Driven | 2 | HIGH | Event sourcing, CQRS, transactional outbox, sagas |
Total: 16 rules across 6 categories
Quick Start
# Redis distributed lock with Lua scripts
async with RedisLock(redis_client, "payment:order-123"):
    await process_payment(order_id)

# Circuit breaker for external APIs
@circuit_breaker(failure_threshold=5, recovery_timeout=30)
@retry(max_attempts=3, base_delay=1.0)
async def call_external_api():
    ...

# Idempotent API endpoint
@router.post("/payments")
async def create_payment(
    data: PaymentCreate,
    idempotency_key: str = Header(..., alias="Idempotency-Key"),
):
    return await idempotent_execute(db, idempotency_key, "/payments", process)

# Token bucket rate limiting
limiter = TokenBucketLimiter(redis_client, capacity=100, refill_rate=10)
if await limiter.is_allowed(f"user:{user_id}"):
    await handle_request()
Distributed Locks
Coordinate exclusive access to resources across multiple service instances.
| Rule | File | Key Pattern |
|---|---|---|
| Redis & Redlock | ${CLAUDE_SKILL_DIR}/rules/locks-redis-redlock.md | Lua scripts, SET NX, multi-node quorum |
| PostgreSQL Advisory | ${CLAUDE_SKILL_DIR}/rules/locks-postgres-advisory.md | Session/transaction locks, lock ID strategies |
| Fencing Tokens | ${CLAUDE_SKILL_DIR}/rules/locks-fencing-tokens.md | Owner validation, TTL, heartbeat extension |
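
The fencing-token idea from the table can be illustrated without Redis. The following is a minimal sketch, assuming a hypothetical in-memory `LockService` that hands out strictly increasing tokens and a hypothetical `FencedStore` that rejects writes carrying an older token than one it has already seen. Neither class comes from this skill's rule files, and owner tracking, TTL, and heartbeats are left out.

```python
import itertools


class LockService:
    """Hypothetical lock service: each grant carries a strictly increasing token."""

    def __init__(self) -> None:
        self._counter = itertools.count(1)

    def acquire(self) -> int:
        # A real service would also record the owner and a TTL; omitted here.
        return next(self._counter)


class StaleTokenError(Exception):
    """Raised when a write carries a token older than the newest one seen."""


class FencedStore:
    """Hypothetical protected resource that checks the token on every write."""

    def __init__(self) -> None:
        self.highest_token = 0
        self.value = None

    def write(self, token: int, value: str) -> None:
        if token < self.highest_token:
            raise StaleTokenError(f"token {token} < {self.highest_token}")
        self.highest_token = token
        self.value = value


locks = LockService()
store = FencedStore()

old_token = locks.acquire()  # client A acquires, then stalls past its TTL
new_token = locks.acquire()  # client B acquires after A's lock expired

store.write(new_token, "from B")
try:
    store.write(old_token, "from A")  # A wakes up and attempts a late write
    rejected = False
except StaleTokenError:
    rejected = True
```

The point of the sketch is that the resource, not the lock holder, makes the final decision: a client that believes it still holds an expired lock cannot overwrite newer data.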
Resilience
Production-grade fault tolerance for distributed systems.
| Rule | File | Key Pattern |
|---|---|---|
| Circuit Breaker | ${CLAUDE_SKILL_DIR}/rules/resilience-circuit-breaker.md | CLOSED/OPEN/HALF_OPEN states, sliding window |
| Retry & Backoff | ${CLAUDE_SKILL_DIR}/rules/resilience-retry-backoff.md | Exponential backoff, jitter, error classification |
| Bulkhead Isolation | ${CLAUDE_SKILL_DIR}/rules/resilience-bulkhead.md | Semaphore tiers, rejection policies, queue depth |
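
The CLOSED/OPEN/HALF_OPEN cycle can be sketched in a few lines. This is an illustrative, single-process version that counts consecutive failures rather than using a sliding window. The class name `CircuitBreaker`, the `clock` parameter, and `CircuitOpenError` are assumptions made for the example; only `failure_threshold` and `recovery_timeout` echo the Quick Start decorator.

```python
import time
from typing import Callable, Optional


class CircuitOpenError(Exception):
    """Raised when a call is rejected because the circuit is open."""


class CircuitBreaker:
    """Minimal breaker based on consecutive failures (no sliding window)."""

    def __init__(
        self,
        failure_threshold: int = 5,
        recovery_timeout: float = 30.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.failure_threshold = failure_threshold
        self.recovery_timeout = recovery_timeout
        self._clock = clock
        self._failures = 0
        self._opened_at: Optional[float] = None
        self.state = "CLOSED"

    def call(self, fn: Callable[[], object]) -> object:
        if self.state == "OPEN":
            if self._clock() - self._opened_at < self.recovery_timeout:
                raise CircuitOpenError("circuit is open")
            self.state = "HALF_OPEN"  # timeout elapsed: let one probe through
        try:
            result = fn()
        except Exception:
            self._failures += 1
            # A failed probe, or too many failures while closed, opens the circuit.
            if self.state == "HALF_OPEN" or self._failures >= self.failure_threshold:
                self.state = "OPEN"
                self._opened_at = self._clock()
            raise
        # Any success resets the count and closes the circuit.
        self._failures = 0
        self.state = "CLOSED"
        return result
```

Injecting the clock keeps the state transitions testable without real sleeps; a production breaker would also need to be safe under concurrent calls.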
Idempotency
Ensure operations can be safely retried without unintended side effects.
| Rule | File | Key Pattern |
|---|---|---|
| Idempotency Keys | ${CLAUDE_SKILL_DIR}/rules/idempotency-keys.md | Deterministic hashing, Stripe-style headers |
| Request Dedup | ${CLAUDE_SKILL_DIR}/rules/idempotency-dedup.md | Event consumer dedup, Redis + DB dual layer |
| Database-Backed | ${CLAUDE_SKILL_DIR}/rules/idempotency-database.md | Unique constraints, upsert, TTL cleanup |
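
Deterministic hashing means the same logical request always maps to the same key, regardless of field order. A minimal sketch using only the standard library; the function name `idempotency_key` and the choice of fields (user, operation, payload) are assumptions for illustration, not the skill's actual helper.

```python
import hashlib
import json


def idempotency_key(user_id: str, operation: str, payload: dict) -> str:
    """Derive a stable key from who is asking, what they ask, and with which data."""
    # sort_keys and fixed separators give a canonical JSON string.
    canonical = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    raw = f"{user_id}:{operation}:{canonical}".encode("utf-8")
    return hashlib.sha256(raw).hexdigest()


key_a = idempotency_key("user-1", "create_payment", {"amount": 100, "currency": "USD"})
key_b = idempotency_key("user-1", "create_payment", {"currency": "USD", "amount": 100})
key_c = idempotency_key("user-1", "create_payment", {"amount": 101, "currency": "USD"})
```

Here `key_a` and `key_b` are equal because only the field order differs, while `key_c` differs because the amount changed.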
Rate Limiting
Protect APIs with distributed rate limiting using Redis.
| Rule | File | Key Pattern |
|---|---|---|
| Token Bucket | ${CLAUDE_SKILL_DIR}/rules/ratelimit-token-bucket.md | Redis Lua scripts, burst capacity, refill rate |
| Sliding Window | ${CLAUDE_SKILL_DIR}/rules/ratelimit-sliding-window.md | Sorted sets, precise counting, no boundary spikes |
| Distributed Limits | ${CLAUDE_SKILL_DIR}/rules/ratelimit-distributed.md | SlowAPI + Redis, tiered limits, response headers |
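
The token bucket arithmetic (burst capacity plus a steady refill rate) is easier to see in an in-process form. The sketch below is for illustration only: it keeps state in memory, so it is not shared across instances and is not a substitute for the Redis Lua approach listed above. The class name `InProcessTokenBucket` and the `clock` parameter are assumptions; `capacity`, `refill_rate`, and `is_allowed` mirror the names used in Quick Start.

```python
import time
from typing import Callable


class InProcessTokenBucket:
    """Single-process token bucket; state lives in memory only."""

    def __init__(
        self,
        capacity: float,
        refill_rate: float,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.capacity = capacity        # maximum burst size
        self.refill_rate = refill_rate  # tokens added per second
        self._clock = clock
        self._tokens = capacity         # start with a full bucket
        self._last = clock()

    def is_allowed(self, cost: float = 1.0) -> bool:
        now = self._clock()
        elapsed = now - self._last
        self._last = now
        # Refill for the elapsed time, never exceeding capacity.
        self._tokens = min(self.capacity, self._tokens + elapsed * self.refill_rate)
        if self._tokens >= cost:
            self._tokens -= cost
            return True
        return False
```

In a distributed setup the same refill-then-spend step has to run atomically on shared storage, which is why the rule file pairs the algorithm with Redis Lua scripts.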
Edge Computing
Edge runtime patterns for Cloudflare Workers, Vercel Edge, and Deno Deploy.
| Rule | File | Key Pattern |
|---|---|---|
| Edge Workers | ${CLAUDE_SKILL_DIR}/rules/edge-workers.md | V8 isolate constraints, Web APIs, geo-routing, auth at edge |
| Edge Caching | ${CLAUDE_SKILL_DIR}/rules/edge-caching.md | Cache-aside at edge, CDN headers, KV storage, stale-while-revalidate |
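
The stale-while-revalidate behaviour can be reduced to a small decision function. The sketch below is written in Python for consistency with the rest of this page, although edge runtimes usually run JavaScript or TypeScript. The helper names `cache_control` and `classify` are hypothetical; the `max-age` and `stale-while-revalidate` directives are standard `Cache-Control` extensions.

```python
def cache_control(max_age: int, stale_while_revalidate: int) -> str:
    """Build a Cache-Control value for a publicly cacheable response."""
    return f"public, max-age={max_age}, stale-while-revalidate={stale_while_revalidate}"


def classify(age: int, max_age: int, stale_while_revalidate: int) -> str:
    """Decide how a cache may treat an entry that is `age` seconds old."""
    if age < max_age:
        return "fresh"    # serve from cache
    if age < max_age + stale_while_revalidate:
        return "stale"    # serve from cache, refresh in the background
    return "expired"      # fetch from origin before serving


header = cache_control(max_age=60, stale_while_revalidate=300)
```

With these values an entry is served directly for the first minute, served while being refreshed for the next five minutes, and refetched after that.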
Event-Driven
Event sourcing, CQRS, saga orchestration, and reliable messaging patterns.
| Rule | File | Key Pattern |
|---|---|---|
| Event Sourcing | ${CLAUDE_SKILL_DIR}/rules/event-sourcing.md | Event-sourced aggregates, CQRS read models, optimistic concurrency |
| Event Messaging | ${CLAUDE_SKILL_DIR}/rules/event-messaging.md | Transactional outbox, saga compensation, idempotent consumers |
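
The transactional outbox writes the business row and the event row in one database transaction, and a separate relay publishes pending events afterwards. A minimal sketch using the standard library's `sqlite3`; the table layout and the `create_order` and `relay_once` helpers are assumptions for illustration, and a list stands in for the message broker.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE orders (id TEXT PRIMARY KEY, total_cents INTEGER NOT NULL);
    CREATE TABLE outbox (
        id INTEGER PRIMARY KEY AUTOINCREMENT,
        event_type TEXT NOT NULL,
        payload TEXT NOT NULL,
        published INTEGER NOT NULL DEFAULT 0
    );
    """
)


def create_order(order_id: str, total_cents: int) -> None:
    # `with conn` commits both inserts together, or rolls both back on error.
    with conn:
        conn.execute("INSERT INTO orders VALUES (?, ?)", (order_id, total_cents))
        conn.execute(
            "INSERT INTO outbox (event_type, payload) VALUES (?, ?)",
            ("order.created", json.dumps({"order_id": order_id})),
        )


def relay_once(publish) -> int:
    """Publish pending events; mark each row only after publish returns."""
    rows = conn.execute(
        "SELECT id, event_type, payload FROM outbox WHERE published = 0 ORDER BY id"
    ).fetchall()
    for row_id, event_type, payload in rows:
        publish(event_type, json.loads(payload))
        with conn:
            conn.execute("UPDATE outbox SET published = 1 WHERE id = ?", (row_id,))
    return len(rows)


create_order("order-123", 4999)

sent = []  # stand-in for a message broker
relayed = relay_once(lambda event_type, payload: sent.append((event_type, payload)))
```

If the relay crashes between publishing and marking the row, the event is published again on the next run. Delivery is therefore at-least-once, which is why the table pairs the outbox with idempotent consumers.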
Key Decisions
| Decision | Recommendation |
|---|---|
| Lock backend | Redis for speed, PostgreSQL if already using it, Redlock for HA |
| Lock TTL | 2-3x expected operation time |
| Circuit breaker recovery | Half-open probe with sliding window |
| Retry algorithm | Exponential backoff + full jitter |
| Bulkhead isolation | Semaphore-based tiers (Critical/Standard/Optional) |
| Idempotency storage | Redis (speed) + DB (durability), 24-72h TTL |
| Rate limit algorithm | Token bucket for most APIs, sliding window for strict quotas |
| Rate limit storage | Redis (distributed, atomic Lua scripts) |
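
For the retry recommendation, "exponential backoff + full jitter" means drawing each delay uniformly between zero and an exponentially growing, capped ceiling. A minimal sketch; the function name `full_jitter_delay` and the `max_delay` cap of 30 seconds are assumptions, while `base_delay` mirrors the Quick Start decorator.

```python
import random
from typing import Optional


def full_jitter_delay(
    attempt: int,
    base_delay: float = 1.0,
    max_delay: float = 30.0,
    rng: Optional[random.Random] = None,
) -> float:
    """Return a delay drawn uniformly from [0, min(max_delay, base_delay * 2**attempt)]."""
    rng = rng if rng is not None else random.Random()
    ceiling = min(max_delay, base_delay * (2 ** attempt))
    return rng.uniform(0.0, ceiling)


# Ceilings for attempts 0..5 with the defaults: 1, 2, 4, 8, 16, 30 seconds.
ceilings = [min(30.0, 1.0 * 2 ** attempt) for attempt in range(6)]
```

Randomizing over the whole interval spreads out clients that failed at the same moment, so they do not retry in lockstep.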
When NOT to Use
There are no separate event-sourcing, saga, or CQRS skills; they are rules within distributed-systems. Most projects never need them.
| Pattern | Interview | Hackathon | MVP | Growth | Enterprise | Simpler Alternative |
|---|---|---|---|---|---|---|
| Event sourcing | OVERKILL | OVERKILL | OVERKILL | OVERKILL | WHEN JUSTIFIED | Append-only table with status column |
| Saga orchestration | OVERKILL | OVERKILL | OVERKILL | SELECTIVE | APPROPRIATE | Sequential service calls with manual rollback |
| Circuit breaker | OVERKILL | OVERKILL | BORDERLINE | APPROPRIATE | REQUIRED | Try/except with timeout |
| Distributed locks | OVERKILL | OVERKILL | BORDERLINE | APPROPRIATE | REQUIRED | Database row-level lock (SELECT FOR UPDATE) |
| CQRS | OVERKILL | OVERKILL | OVERKILL | OVERKILL | WHEN JUSTIFIED | Single model for read/write |
| Transactional outbox | OVERKILL | OVERKILL | OVERKILL | SELECTIVE | APPROPRIATE | Direct publish after commit |
| Rate limiting | OVERKILL | OVERKILL | SIMPLE ONLY | APPROPRIATE | REQUIRED | Nginx rate limit or cloud WAF |
Rule of thumb: If you have a single server process, you do not need distributed systems patterns. Use in-process alternatives. Add distribution only when you actually have multiple instances.
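
As an example of an in-process alternative, a single asyncio service can serialize a critical section with `asyncio.Lock` instead of a distributed lock. A minimal sketch; the `process_payment` body is a hypothetical stand-in that only records start and end markers.

```python
import asyncio


async def main() -> list:
    lock = asyncio.Lock()  # created inside the running event loop
    log = []

    async def process_payment(order_id: str) -> None:
        async with lock:
            log.append(f"start {order_id}")
            await asyncio.sleep(0)  # yield to other tasks while holding the lock
            log.append(f"end {order_id}")

    await asyncio.gather(process_payment("order-1"), process_payment("order-2"))
    return log


log = asyncio.run(main())
```

The two payments never interleave even though the first yields while holding the lock. This guarantee stops at the process boundary, which is the point at which the distributed patterns on this page become relevant.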
Anti-Patterns (FORBIDDEN)
# LOCKS: Never forget TTL (causes deadlocks)
await redis.set(f"lock:{name}", "1")  # WRONG - no expiry!

# LOCKS: Never release without owner check
await redis.delete(f"lock:{name}")  # WRONG - might release others' lock

# RESILIENCE: Never retry non-retryable errors
@retry(max_attempts=5, retryable_exceptions={Exception})  # Retries 401!

# RESILIENCE: Never put retry outside circuit breaker
@retry  # Would retry when circuit is open!
@circuit_breaker
async def call(): ...

# IDEMPOTENCY: Never use non-deterministic keys
key = str(uuid.uuid4())  # Different every time!

# IDEMPOTENCY: Never cache error responses
if response.status_code >= 400:
    await cache_response(key, response)  # Errors should retry!

# RATE LIMITING: Never use in-memory counters in distributed systems
request_counts = {}  # Lost on restart, not shared across instances
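
For contrast with the retry anti-pattern, here is a minimal sketch of a retry loop that classifies errors before retrying. The `HttpError` class, the `call_with_retry` helper, and the exact set of retryable status codes are assumptions for illustration; the skill's own `@retry` decorator may classify errors differently.

```python
import time


class HttpError(Exception):
    """Hypothetical error type carrying an HTTP status code."""

    def __init__(self, status_code: int) -> None:
        super().__init__(f"HTTP {status_code}")
        self.status_code = status_code


# Assumed classification: timeouts, throttling, and server-side failures.
RETRYABLE_STATUS = frozenset({408, 429, 500, 502, 503, 504})


def call_with_retry(fn, max_attempts: int = 3, base_delay: float = 1.0, sleep=time.sleep):
    """Call fn, retrying only errors classified as retryable."""
    for attempt in range(1, max_attempts + 1):
        try:
            return fn()
        except HttpError as exc:
            # Client errors such as 401 are raised immediately, as is the last failure.
            if exc.status_code not in RETRYABLE_STATUS or attempt == max_attempts:
                raise
            sleep(base_delay * 2 ** (attempt - 1))  # plain exponential backoff, no jitter
```

A 401 surfaces after a single call, while a 503 is retried up to `max_attempts` times.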
Detailed Documentation
| Resource | Description |
|---|---|
| ${CLAUDE_SKILL_DIR}/scripts/ | Templates: lock implementations, circuit breaker, rate limiter |
| ${CLAUDE_SKILL_DIR}/checklists/ | Pre-flight checklists for each pattern category |
| ${CLAUDE_SKILL_DIR}/references/ | Deep dives: Redlock algorithm, bulkhead tiers, token bucket |
| ${CLAUDE_SKILL_DIR}/examples/ | Complete integration examples |
Related Skills
- caching - Redis caching patterns, cache as fallback
- background-jobs - Job deduplication, async processing with retry
- observability-monitoring - Metrics and alerting for circuit breaker state changes
- error-handling-rfc9457 - Structured error responses for resilience failures
- auth-patterns - API key management, authentication integration