Design scalable distributed systems architectures for APIs, data pipelines, ML/RAG, edge/CDN, chaos engineering, and observability; review for security, resilience, performance, and quality attributes; simulate mock system design interviews with feedback.
npx claudepluginhub melodic-software/claude-code-plugins --plugin systems-design
Review API designs for best practices, consistency, and common issues. PROACTIVELY use when reviewing OpenAPI specs, API endpoints, or REST/GraphQL/gRPC designs.
PROACTIVELY use when estimating system capacity, calculating QPS/storage/bandwidth, or sizing infrastructure. Provides guided back-of-envelope calculations with step-by-step breakdowns.
PROACTIVELY use when designing chaos experiments, planning GameDays, or improving system resilience. Helps identify failure modes, design fault injection experiments, and validate resilience patterns like circuit breakers, retries, and bulkheads.
Design data pipelines, recommend data architecture patterns, and analyze data flow requirements. PROACTIVELY use when designing ETL/ELT pipelines, data lakes, or streaming architectures.
PROACTIVELY use when designing CDN strategies, edge deployment architectures, or optimizing global latency. Helps design content delivery, edge compute placement, multi-region deployment, and geographic routing strategies.
PROACTIVELY use when optimizing LLM serving latency, reducing inference costs, or improving throughput. Provides quick recommendations for LLM performance optimization.
PROACTIVELY use for ML system design interview practice. Simulates a senior interviewer conducting ML-focused system design interviews with realistic follow-ups and feedback.
PROACTIVELY use when designing end-to-end ML systems, feature stores, training pipelines, or model serving infrastructure. Provides architectural guidance for production ML systems.
PROACTIVELY use when defining SLOs, designing monitoring strategies, or implementing observability. Helps design comprehensive observability approaches including SLI selection, SLO targets, error budgets, alerting strategies, and the three pillars (logs, metrics, traces).
PROACTIVELY use when designing Internal Developer Platforms, building platform teams, or improving developer experience. Helps design platform architecture, self-service capabilities, golden paths, and infrastructure provisioning systems.
PROACTIVELY use when designing RAG systems, choosing embedding strategies, optimizing retrieval quality, or building knowledge-grounded LLM applications. Provides architectural guidance for RAG pipelines.
PROACTIVELY use when reviewing architecture for security gaps, performing zero trust assessments, or evaluating security posture. Analyzes designs for security vulnerabilities, authentication/authorization gaps, data protection issues, and provides remediation guidance.
PROACTIVELY use for senior/staff+ level system design interview practice. Simulates a principal engineer or director conducting rigorous interviews with deep probing, pushback on assumptions, and high-bar evaluation.
Use when designing APIs, choosing between REST/GraphQL/gRPC, or understanding API design best practices. Covers protocol selection, resource modeling, and API patterns.
Review API design for best practices, consistency, and common issues
Use when implementing API authentication, authorization, or security patterns. Covers OAuth 2.0, OIDC, JWT, API keys, rate limiting, and common API security vulnerabilities.
Use when planning API versioning strategy, handling breaking changes, or managing API deprecation. Covers URL, header, and query parameter versioning approaches.
Use when designing content delivery networks, caching strategies, or global content distribution. Covers CDN architecture, cache hierarchies, origin shielding, cache invalidation, and edge optimization.
Use when implementing chaos engineering, designing fault injection experiments, or building resilience testing practices. Covers chaos principles and experiment design.
Design chaos engineering experiments for a system - identifies failure modes, creates experiment hypotheses, and generates GameDay plans
Use when designing data platforms, choosing between data lakes/lakehouses/warehouses, or implementing data mesh patterns. Covers modern data architecture approaches.
Design data pipeline architecture for a given data flow scenario
Use when designing data models, database schemas, or choosing between modeling approaches. Covers dimensional modeling, star schema, data vault, entity-relationship design, and schema evolution.
4-step framework for system design interviews. Use when preparing for technical interviews, practicing whiteboard design, or structuring architectural discussions. Covers requirements gathering, high-level design, deep dives, and wrap-up.
Use when implementing distributed tracing, understanding trace propagation, or debugging cross-service issues. Covers OpenTelemetry, span context, and trace correlation.
Use when designing edge computing architectures, serverless at edge, or distributed compute strategies. Covers edge functions, compute placement decisions, Cloudflare Workers, Lambda@Edge, and edge-native patterns.
Design CDN and edge deployment strategy for global distribution - optimizes latency, plans caching architecture, and recommends edge compute placement
Back-of-envelope calculations for system design. Use when estimating QPS, storage, bandwidth, or latency for capacity planning. Includes latency numbers every programmer should know and common estimation patterns.
Use when designing data pipelines, choosing between ETL and ELT approaches, or implementing data transformation patterns. Covers modern data pipeline architecture.
Explain a systems design concept
Use when planning GameDay exercises, designing failure scenarios, or conducting chaos drills. Covers GameDay preparation, execution, and follow-up.
Use when designing standardized development workflows, paved roads, or opinionated defaults. Covers golden path patterns, template design, developer workflow optimization, and guardrails.
Use when designing idempotent APIs, handling retries safely, or preventing duplicate operations. Covers idempotency keys, at-most-once semantics, and duplicate prevention.
Use when designing incident management processes, creating runbooks, or establishing on-call practices. Covers incident lifecycle, communication, and postmortems.
Plan instrumentation strategy before implementation, covering what to instrument, naming conventions, cardinality management, and instrumentation budget
Use when designing Internal Developer Platforms (IDPs), building platform teams, or improving developer experience. Covers platform engineering principles, Backstage, portal design, and platform team structures.
Calculate and allocate latency budgets for a system - breaks down end-to-end latency into component budgets with optimization recommendations
Use when optimizing end-to-end latency, reducing response times, or improving performance for latency-sensitive applications. Covers latency budgets, geographic routing, protocol optimization, and latency measurement techniques.
LLM inference infrastructure, serving frameworks (vLLM, TGI, TensorRT-LLM), quantization techniques, batching strategies, and streaming response patterns. Use when designing LLM serving infrastructure, optimizing inference latency, or scaling LLM deployments.
ML inference latency optimization, model compression, distillation, caching strategies, and edge deployment patterns. Use when optimizing inference performance, reducing model size, or deploying ML at the edge.
Design an ML system for a problem
End-to-end ML system design for production. Use when designing ML pipelines, feature stores, model training infrastructure, or serving systems. Covers the complete lifecycle from data ingestion to model deployment and monitoring.
Run an interactive system design mock interview - simulates a real interview with problem statement, follow-ups, and structured feedback
Use when implementing service-to-service security, mTLS, or service mesh patterns. Covers mutual TLS, Istio, Linkerd, certificate management, and service mesh security configurations.
Use when designing globally distributed systems, multi-region architectures, or disaster recovery strategies. Covers region selection, active-active vs active-passive, data replication, and failover patterns.
Use when implementing observability strategy, correlating signals, or designing monitoring systems. Covers the three pillars (logs, metrics, traces) and their integration.
Get LLM optimization recommendations for serving latency, inference costs, and throughput improvements
Design Internal Developer Platforms, self-service capabilities, and golden paths
The "-ilities" framework for non-functional requirements. Use when defining NFRs, evaluating architecture trade-offs, or ensuring quality attributes are addressed in system design. Covers scalability, reliability, availability, performance, security, maintainability, and more.
Retrieval-Augmented Generation (RAG) system design patterns, chunking strategies, embedding models, retrieval techniques, and context assembly. Use when designing RAG pipelines, improving retrieval quality, or building knowledge-grounded LLM applications.
Design a RAG architecture for a use case
Use when implementing rate limiting, throttling, or API quotas. Covers algorithms like token bucket and sliding window, plus distributed rate limiting patterns.
Use when implementing circuit breakers, retries, bulkheads, or other resilience patterns. Covers failure handling strategies for distributed systems.
Use when designing secret storage, rotation, or credential management systems. Covers HashiCorp Vault patterns, AWS Secrets Manager, Azure Key Vault, secret rotation, and zero-knowledge architectures.
Perform a security architecture review with Zero Trust assessment - identifies authentication/authorization gaps, data protection issues, and provides remediation guidance
Use when designing infrastructure self-service portals, IaC templates, or automated provisioning systems. Covers Terraform modules, Pulumi, environment provisioning, and infrastructure guardrails.
Use when defining SLOs, selecting SLIs, or implementing error budget policies. Covers reliability targets, SLI selection, and error budget management.
Interactive SLO definition workshop - guides through defining SLIs, setting SLO targets, and establishing error budget policies for a service
Use when designing real-time data processing systems, choosing stream processing frameworks, or implementing event-driven architectures. Covers Kafka, Flink, and streaming patterns.
Vector database selection, embedding storage, approximate nearest neighbor (ANN) algorithms, and vector search optimization. Use when choosing vector stores, designing semantic search, or optimizing similarity search performance.
Use when designing security architectures, implementing zero trust principles, or evaluating security posture. Covers never trust always verify, microsegmentation, identity-based access, and ZTNA patterns.
System design interview preparation and architecture review: structured frameworks for distributed systems
Enterprise microservices architecture design and implementation expert for scalable distributed systems
Editorial "Architecture & Design" bundle for Claude Code from Antigravity Awesome Skills.
Agents specialized in system architecture and solution design. Focuses on scalability, design patterns, and architectural decisions.
Comprehensive skill pack with 66 specialized skills for full-stack developers: 12 language experts (Python, TypeScript, Go, Rust, C++, Swift, Kotlin, C#, PHP, Java, SQL, JavaScript), 10 backend frameworks, 6 frontend/mobile, plus infrastructure, DevOps, security, and testing. Features progressive disclosure architecture for 50% faster loading.
Comprehensive PR review agents specializing in comments, tests, error handling, type design, code quality, and code simplification
Plugins for Claude Code: documentation management, code quality, and ecosystem support.
fnm (Fast Node Manager) is the recommended Node.js version manager for this project.
Install fnm:
# Windows (PowerShell as Admin)
winget install Schniz.fnm
# macOS/Linux
curl -fsSL https://fnm.vercel.app/install | bash
Configure for Git Bash (add to ~/.bashrc):
eval "$(fnm env --use-on-cd --shell bash)"
Or source the setup script which includes fnm initialization:
source "/path/to/claude-code-plugins/setup/bashrc-claude.sh"
Install Node:
fnm install 24
fnm default 24
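After the install steps above, it can help to confirm that the tools resolve on PATH in the current shell. The following is a minimal sketch, not part of the project's setup scripts: `check_tools` is a hypothetical helper name, and it relies only on the standard `--version` flag of each tool. If fnm was just installed, a new shell (or re-sourcing ~/.bashrc) may be needed before `node` and `npm` are found.

```shell
# Hypothetical helper: report the version of each tool,
# or note that it is missing from PATH.
check_tools() {
  for tool in fnm node npm; do
    if command -v "$tool" >/dev/null 2>&1; then
      printf '%s: %s\n' "$tool" "$("$tool" --version)"
    else
      printf '%s: not found on PATH\n' "$tool"
    fi
  done
}

check_tools
```

A "not found on PATH" line for `node` usually suggests the `fnm env` line has not been evaluated in this shell yet.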
npm install
npm run lint:md # Check for errors
npm run lint:md:fix # Auto-fix errors
Markdown linting runs automatically on PRs via GitHub Actions. The same rules apply locally and in CI.
/plugin install claude-ecosystem@claude-code-plugins
/plugin install code-quality@claude-code-plugins
/plugin install google-ecosystem@claude-code-plugins
This repo expects Codex CLI configuration to live in user scope under ~/.codex.
See .codex/README.md for the canonical locations.
| Plugin | Purpose |
|---|---|
| atlassian | Atlassian MCP server: Jira, Confluence, Compass integration |
| browser-automation | Browser automation MCP servers: Chrome DevTools, Playwright |
| business-analysis | BABOK techniques: capability mapping, stakeholder analysis, value streams, journey mapping |
| ci-cd | CI/CD pipelines: GitHub Actions, deployment automation, release management |
| claude-code-observability | Event logging, metrics, session diagnostics |
| claude-ecosystem | Claude Code docs, meta-skills, hooks, observability, auditors |
| code-quality | Code review, markdown linting, debugging, CI/CD templates |
| compliance-planning | Regulatory compliance: GDPR, HIPAA, PCI-DSS, AI governance, ISO 27001 |
| content-management-system | Headless CMS architecture: content modeling, taxonomies, media, theming |
| cursor-ecosystem | Cursor IDE docs, CLI, agent, keyword-based search |
| documentation-standards | Technical docs: arc42, C4 model, ADRs, RFC process, docs-as-code |
| dotnet | .NET 10+ automation: build, clean, SDK/tool install, version upgrades, Aspire MCP |
| duende-ecosystem | Duende IdentityServer, BFF, IdentityModel docs |
| enterprise-architecture | TOGAF, Zachman, ADRs, cloud alignment |
| event-modeling | Event-driven design: Event Modeling, Event Storming, CQRS, sagas |
| figma | Figma MCP server: design context, code generation, design tokens |
| formal-specification | Formal methods: UML/SysML, TLA+, OpenAPI/AsyncAPI, state machines |
| git | Git config, GPG signing, hooks, GitHub issues, history exploration |
| google-ecosystem | Gemini CLI docs, Claude-to-Gemini integration, configuration management |
| melodic-software | Developer onboarding, environment setup, commit workflows |
| microsoft | Microsoft MCP servers: Microsoft Learn, Azure, NuGet, Azure DevOps |
| milan-jovanovic | Milan Jovanovic .NET patterns: Clean Architecture, DDD, CQRS, EF Core |
| openai-ecosystem | OpenAI Codex CLI docs |
| requirements-elicitation | Requirements gathering: LLMREI interviews, gap analysis, prioritization |
| research | Research workflows: MCP integration, multi-source synthesis, structured output |
| response-quality | Response quality standards, source citations |
| security | Security: OWASP, authentication, cryptography, DevSecOps, threat modeling, 12 skills |
| soft-skills | Career progression, interviews, communication, professional visibility |