Systems design patterns for scalability, distributed systems, and quality attributes. Covers interview preparation, production architecture, and conceptual fundamentals. Language-agnostic with .NET examples where appropriate.
npx claudepluginhub melodic-software/claude-code-plugins --plugin systems-design
Review API designs for best practices, consistency, and common issues. PROACTIVELY use when reviewing OpenAPI specs, API endpoints, or REST/GraphQL/gRPC designs.
PROACTIVELY use when estimating system capacity, calculating QPS/storage/bandwidth, or sizing infrastructure. Provides guided back-of-envelope calculations with step-by-step breakdowns.
PROACTIVELY use when designing chaos experiments, planning GameDays, or improving system resilience. Helps identify failure modes, design fault injection experiments, and validate resilience patterns like circuit breakers, retries, and bulkheads.
Design data pipelines, recommend data architecture patterns, and analyze data flow requirements. PROACTIVELY use when designing ETL/ELT pipelines, data lakes, or streaming architectures.
PROACTIVELY use when designing CDN strategies, edge deployment architectures, or optimizing global latency. Helps design content delivery, edge compute placement, multi-region deployment, and geographic routing strategies.
PROACTIVELY use when optimizing LLM serving latency, reducing inference costs, or improving throughput. Provides quick recommendations for LLM performance optimization.
PROACTIVELY use for ML system design interview practice. Simulates a senior interviewer conducting ML-focused system design interviews with realistic follow-ups and feedback.
PROACTIVELY use when designing end-to-end ML systems, feature stores, training pipelines, or model serving infrastructure. Provides architectural guidance for production ML systems.
PROACTIVELY use when defining SLOs, designing monitoring strategies, or implementing observability. Helps design comprehensive observability approaches including SLI selection, SLO targets, error budgets, alerting strategies, and the three pillars (logs, metrics, traces).
PROACTIVELY use when designing Internal Developer Platforms, building platform teams, or improving developer experience. Helps design platform architecture, self-service capabilities, golden paths, and infrastructure provisioning systems.
PROACTIVELY use when designing RAG systems, choosing embedding strategies, optimizing retrieval quality, or building knowledge-grounded LLM applications. Provides architectural guidance for RAG pipelines.
PROACTIVELY use when reviewing architecture for security gaps, performing zero trust assessments, or evaluating security posture. Analyzes designs for security vulnerabilities, authentication/authorization gaps, data protection issues, and provides remediation guidance.
PROACTIVELY use for senior/staff+ level system design interview practice. Simulates a principal engineer or director conducting rigorous interviews with deep probing, pushback on assumptions, and high-bar evaluation.
Use when designing APIs, choosing between REST/GraphQL/gRPC, or understanding API design best practices. Covers protocol selection, resource modeling, and API patterns.
Review API design for best practices, consistency, and common issues
Use when implementing API authentication, authorization, or security patterns. Covers OAuth 2.0, OIDC, JWT, API keys, rate limiting, and common API security vulnerabilities.
Use when planning API versioning strategy, handling breaking changes, or managing API deprecation. Covers URL, header, and query parameter versioning approaches.
Use when designing content delivery networks, caching strategies, or global content distribution. Covers CDN architecture, cache hierarchies, origin shielding, cache invalidation, and edge optimization.
Use when implementing chaos engineering, designing fault injection experiments, or building resilience testing practices. Covers chaos principles and experiment design.
Design chaos engineering experiments for a system - identifies failure modes, creates experiment hypotheses, and generates GameDay plans
Use when designing data platforms, choosing between data lakes/lakehouses/warehouses, or implementing data mesh patterns. Covers modern data architecture approaches.
Design data pipeline architecture for a given data flow scenario
Use when designing data models, database schemas, or choosing between modeling approaches. Covers dimensional modeling, star schema, data vault, entity-relationship design, and schema evolution.
4-step framework for system design interviews. Use when preparing for technical interviews, practicing whiteboard design, or structuring architectural discussions. Covers requirements gathering, high-level design, deep dives, and wrap-up.
Use when implementing distributed tracing, understanding trace propagation, or debugging cross-service issues. Covers OpenTelemetry, span context, and trace correlation.
Use when designing edge computing architectures, serverless at edge, or distributed compute strategies. Covers edge functions, compute placement decisions, Cloudflare Workers, Lambda@Edge, and edge-native patterns.
Design CDN and edge deployment strategy for global distribution - optimizes latency, plans caching architecture, and recommends edge compute placement
Back-of-envelope calculations for system design. Use when estimating QPS, storage, bandwidth, or latency for capacity planning. Includes latency numbers every programmer should know and common estimation patterns.
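As a rough illustration of the kind of estimate this skill describes, the sketch below computes average QPS, peak QPS, and raw storage from daily volumes. The function names, the peak factor of 2, and the replication factor of 3 are assumptions chosen for the example, not values taken from the plugin.

```python
SECONDS_PER_DAY = 86_400


def average_qps(requests_per_day: float) -> float:
    """Average queries per second over a full day."""
    return requests_per_day / SECONDS_PER_DAY


def peak_qps(requests_per_day: float, peak_factor: float = 2.0) -> float:
    """Peak QPS, assuming peak traffic is peak_factor times the average (assumed ratio)."""
    return average_qps(requests_per_day) * peak_factor


def storage_bytes(writes_per_day: int, bytes_per_write: int,
                  retention_days: int, replication: int = 3) -> int:
    """Raw storage for the retention window, including replicas (assumed factor of 3)."""
    return writes_per_day * bytes_per_write * retention_days * replication


if __name__ == "__main__":
    # Hypothetical workload: 86.4M requests/day, 1M writes/day of 1 KB kept for a year.
    print(f"average QPS: {average_qps(86_400_000):,.0f}")
    print(f"peak QPS:    {peak_qps(86_400_000):,.0f}")
    print(f"storage:     {storage_bytes(1_000_000, 1_000, 365) / 1e12:.3f} TB")
```

Rounding 86,400 seconds per day to 100,000 is a common shortcut when doing this arithmetic by hand; the code keeps the exact value.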
Use when designing data pipelines, choosing between ETL and ELT approaches, or implementing data transformation patterns. Covers modern data pipeline architecture.
Explain a systems design concept
Use when planning GameDay exercises, designing failure scenarios, or conducting chaos drills. Covers GameDay preparation, execution, and follow-up.
Use when designing standardized development workflows, paved roads, or opinionated defaults. Covers golden path patterns, template design, developer workflow optimization, and guardrails.
Use when designing idempotent APIs, handling retries safely, or preventing duplicate operations. Covers idempotency keys, at-most-once semantics, and duplicate prevention.
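A minimal sketch of the idempotency-key idea, assuming an in-memory store and a single process: the first request with a given key runs the handler, and any retry with the same key gets the stored result back instead of running the handler again. The class name `IdempotentProcessor` and the `charge` handler are hypothetical; a production design would also need a shared durable store, key expiry, and handling for requests that are still in flight.

```python
from typing import Any, Callable, Dict


class IdempotentProcessor:
    """Runs a handler at most once per idempotency key (in-memory, single process)."""

    def __init__(self, handler: Callable[[Any], Any]) -> None:
        self._handler = handler
        self._results: Dict[str, Any] = {}  # key -> stored result

    def process(self, key: str, payload: Any) -> Any:
        # A retry with a known key returns the stored result without side effects.
        if key in self._results:
            return self._results[key]
        result = self._handler(payload)
        self._results[key] = result
        return result


if __name__ == "__main__":
    charges = []

    def charge(payload):
        # Hypothetical side effect standing in for a payment call.
        charges.append(payload["amount"])
        return {"charged": payload["amount"]}

    processor = IdempotentProcessor(charge)
    processor.process("order-123", {"amount": 50})
    processor.process("order-123", {"amount": 50})  # client retry
    print(len(charges))
```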
Use when designing incident management processes, creating runbooks, or establishing on-call practices. Covers incident lifecycle, communication, and postmortems.
Plan instrumentation strategy before implementation, covering what to instrument, naming conventions, cardinality management, and instrumentation budget
Use when designing Internal Developer Platforms (IDPs), building platform teams, or improving developer experience. Covers platform engineering principles, Backstage, portal design, and platform team structures.
Calculate and allocate latency budgets for a system - breaks down end-to-end latency into component budgets with optimization recommendations
Use when optimizing end-to-end latency, reducing response times, or improving performance for latency-sensitive applications. Covers latency budgets, geographic routing, protocol optimization, and latency measurement techniques.
LLM inference infrastructure, serving frameworks (vLLM, TGI, TensorRT-LLM), quantization techniques, batching strategies, and streaming response patterns. Use when designing LLM serving infrastructure, optimizing inference latency, or scaling LLM deployments.
ML inference latency optimization, model compression, distillation, caching strategies, and edge deployment patterns. Use when optimizing inference performance, reducing model size, or deploying ML at the edge.
Design an ML system for a problem
End-to-end ML system design for production. Use when designing ML pipelines, feature stores, model training infrastructure, or serving systems. Covers the complete lifecycle from data ingestion to model deployment and monitoring.
Run an interactive system design mock interview - simulates a real interview with problem statement, follow-ups, and structured feedback
Use when implementing service-to-service security, mTLS, or service mesh patterns. Covers mutual TLS, Istio, Linkerd, certificate management, and service mesh security configurations.
Use when designing globally distributed systems, multi-region architectures, or disaster recovery strategies. Covers region selection, active-active vs active-passive, data replication, and failover patterns.
Use when implementing observability strategy, correlating signals, or designing monitoring systems. Covers the three pillars (logs, metrics, traces) and their integration.
Get LLM optimization recommendations for serving latency, inference costs, and throughput improvements
Design Internal Developer Platforms, self-service capabilities, and golden paths
The "-ilities" framework for non-functional requirements. Use when defining NFRs, evaluating architecture trade-offs, or ensuring quality attributes are addressed in system design. Covers scalability, reliability, availability, performance, security, maintainability, and more.
Retrieval-Augmented Generation (RAG) system design patterns, chunking strategies, embedding models, retrieval techniques, and context assembly. Use when designing RAG pipelines, improving retrieval quality, or building knowledge-grounded LLM applications.
Design a RAG architecture for a use case
Use when implementing rate limiting, throttling, or API quotas. Covers algorithms like token bucket and sliding window, plus distributed rate limiting patterns.
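The token bucket algorithm named above can be sketched in a few lines. This is a single-process illustration under assumed parameters, not the plugin's implementation: tokens refill continuously at `rate` per second up to `capacity`, and a request is allowed only if enough tokens remain. The injectable `clock` argument is an assumption added to make the behavior easy to test; a distributed limiter would keep this state in a shared store.

```python
import time
from typing import Callable


class TokenBucket:
    """Single-process token bucket: refills at `rate` tokens/second up to `capacity`."""

    def __init__(self, rate: float, capacity: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.rate = rate
        self.capacity = capacity
        self.clock = clock
        self.tokens = float(capacity)  # start with a full bucket
        self.updated = clock()

    def allow(self, cost: float = 1.0) -> bool:
        # Refill based on the time elapsed since the last call, capped at capacity.
        now = self.clock()
        elapsed = now - self.updated
        self.updated = now
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False


if __name__ == "__main__":
    bucket = TokenBucket(rate=5, capacity=10)
    allowed = sum(bucket.allow() for _ in range(20))
    print(f"allowed {allowed} of 20 immediate requests")
```

The capacity sets how large a burst is tolerated, while the rate sets the sustained throughput.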
Use when implementing circuit breakers, retries, bulkheads, or other resilience patterns. Covers failure handling strategies for distributed systems.
Use when designing secret storage, rotation, or credential management systems. Covers HashiCorp Vault patterns, AWS Secrets Manager, Azure Key Vault, secret rotation, and zero-knowledge architectures.
Perform a security architecture review with Zero Trust assessment - identifies authentication/authorization gaps, data protection issues, and provides remediation guidance
Use when designing infrastructure self-service portals, IaC templates, or automated provisioning systems. Covers Terraform modules, Pulumi, environment provisioning, and infrastructure guardrails.
Use when defining SLOs, selecting SLIs, or implementing error budget policies. Covers reliability targets, SLI selection, and error budget management.
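To make the error budget idea concrete, the sketch below converts an availability SLO into allowed downtime and computes how much of a request-based budget is left. The function names and the 30-day window are assumptions for illustration only.

```python
def error_budget_minutes(slo_target: float, window_days: int = 30) -> float:
    """Allowed downtime in minutes for an availability SLO over the window."""
    return (1.0 - slo_target) * window_days * 24 * 60


def budget_remaining(slo_target: float, total_requests: int,
                     failed_requests: int) -> float:
    """Fraction of a request-based error budget still unspent (negative if overspent)."""
    allowed_failures = (1.0 - slo_target) * total_requests
    if allowed_failures == 0:
        return 0.0  # a 100% target or zero traffic leaves no budget to spend
    return 1.0 - failed_requests / allowed_failures


if __name__ == "__main__":
    # Hypothetical service: 99.9% SLO, 1M requests so far, 250 failures.
    print(f"{error_budget_minutes(0.999):.1f} minutes of downtime allowed per 30 days")
    print(f"{budget_remaining(0.999, 1_000_000, 250):.0%} of the error budget remains")
```

An error budget policy would typically attach actions to thresholds on the remaining fraction, such as pausing risky releases once the budget is spent.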
Interactive SLO definition workshop - guides through defining SLIs, setting SLO targets, and establishing error budget policies for a service
Use when designing real-time data processing systems, choosing stream processing frameworks, or implementing event-driven architectures. Covers Kafka, Flink, and streaming patterns.
Vector database selection, embedding storage, approximate nearest neighbor (ANN) algorithms, and vector search optimization. Use when choosing vector stores, designing semantic search, or optimizing similarity search performance.
Use when designing security architectures, implementing zero trust principles, or evaluating security posture. Covers the "never trust, always verify" principle, microsegmentation, identity-based access, and ZTNA patterns.