From production-grade
Designs system architecture for tech stack, API contracts, data models, and infrastructure shape. Supports brownfield extensions and engagement modes from Express to Meticulous.
How this skill is triggered — by the user, by Claude, or both
Slash command
/production-grade:solution-architectThe summary Claude sees in its skill listing — used to decide when to auto-load this skill
!`cat Claude-Production-Grade-Suite/.protocols/ux-protocol.md 2>/dev/null || true`
!cat Claude-Production-Grade-Suite/.protocols/ux-protocol.md 2>/dev/null || true
!cat Claude-Production-Grade-Suite/.protocols/input-validation.md 2>/dev/null || true
!cat Claude-Production-Grade-Suite/.protocols/tool-efficiency.md 2>/dev/null || true
!cat Claude-Production-Grade-Suite/.protocols/visual-identity.md 2>/dev/null || true
!cat Claude-Production-Grade-Suite/.protocols/freshness-protocol.md 2>/dev/null || true
!cat Claude-Production-Grade-Suite/.protocols/receipt-protocol.md 2>/dev/null || true
!cat Claude-Production-Grade-Suite/.protocols/boundary-safety.md 2>/dev/null || true
!cat Claude-Production-Grade-Suite/.protocols/conflict-resolution.md 2>/dev/null || true
!cat .production-grade.yaml 2>/dev/null || echo "No config — using defaults"
!cat Claude-Production-Grade-Suite/.orchestrator/codebase-context.md 2>/dev/null || true
Fallback (if protocols not loaded): Use AskUserQuestion with options (never open-ended), "Chat about this" last, recommended first. Work continuously. Print progress constantly. Validate inputs before starting — classify missing as Critical (stop), Degraded (warn, continue partial), or Optional (skip silently). Use parallel tool calls for independent reads. Use smart_outline before full Read.
If Claude-Production-Grade-Suite/.orchestrator/codebase-context.md exists and mode is brownfield:
!cat Claude-Production-Grade-Suite/.orchestrator/settings.md 2>/dev/null || echo "No settings — using Standard"
Read Claude-Production-Grade-Suite/.orchestrator/settings.md at startup. Adapt discovery depth:
| Mode | Discovery Approach |
|---|---|
| Express | Auto-derive from BRD. Ask only if critical info missing. Conservative defaults. |
| Standard | 5-7 questions across 2 rounds. Scale sizing + constraints. Fitness-derived architecture. |
| Thorough | 12-15 questions across 4 structured rounds. Full capacity planning. Trade-off analysis. Architecture alternatives. |
| Meticulous | Everything in Thorough + individual ADR approval, tech stack walkthrough, capacity modeling with cost estimates. |
Follow Claude-Production-Grade-Suite/.protocols/visual-identity.md. Print structured progress throughout execution.
Skill header (print on start):
━━━ Solution Architect ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
Phase progress (print during execution):
[1/5] Constraint Discovery
✓ Scale: {users}, {CCU}, {constraints}
⧖ analyzing compliance requirements...
○ fitness function
[2/5] Architecture Design
✓ Pattern: {pattern}, {N} ADRs
⧖ generating system diagrams...
○ user review
[3/5] API Contracts
✓ {N} OpenAPI specs, {M} endpoints
⧖ defining error schemas...
○ versioning strategy
[4/5] Data Model
✓ ERD: {N} entities, {M} migrations
⧖ writing migration files...
○ audit trail schema
[5/5] Scaffold
✓ Project structure generated
⧖ writing Dockerfiles...
○ docker-compose
Completion summary (print on finish — MUST include concrete numbers):
✓ Solution Architect {pattern}, {N} ADRs, {M} endpoints, scaffold generated ⏱ Xm Ys
Full architecture pipeline: from business requirements to a scaffolded, production-ready codebase. The architecture is DERIVED from project constraints (scale, team, budget, compliance) — not picked from a template. There is no one-size-fits-all architecture.
Generates architecture deliverables at the project root (api/, schemas/, docs/architecture/, project scaffold) with workspace artifacts in Claude-Production-Grade-Suite/solution-architect/.
Read .production-grade.yaml at startup. Use these overrides if defined:
paths.api_contracts — default: api/paths.adrs — default: docs/architecture/architecture-decision-records/paths.architecture_docs — default: docs/architecture/paths.erd — default: schemas/erd.mdpaths.migrations — default: schemas/migrations/paths.tech_stack — default: docs/architecture/tech-stack.mdDeliverables go to the project root (api/, schemas/, docs/architecture/). Workspace artifacts go to Claude-Production-Grade-Suite/solution-architect/.
digraph sa {
rankdir=TB;
"Triggered" [shape=doublecircle];
"Phase 1: Discovery" [shape=box];
"Phase 2: Architecture Design" [shape=box];
"Phase 3: Tech Stack" [shape=box];
"Phase 4: API Contracts" [shape=box];
"Phase 5: Data Models" [shape=box];
"Phase 6: Scaffold" [shape=box];
"User Review" [shape=diamond];
"Suite Complete" [shape=doublecircle];
"Triggered" -> "Phase 1: Discovery";
"Phase 1: Discovery" -> "Phase 2: Architecture Design";
"Phase 2: Architecture Design" -> "User Review";
"User Review" -> "Phase 2: Architecture Design" [label="revise"];
"User Review" -> "Phase 3: Tech Stack" [label="approved"];
"Phase 3: Tech Stack" -> "Phase 4: API Contracts";
"Phase 4: API Contracts" -> "Phase 5: Data Models";
"Phase 5: Data Models" -> "Phase 6: Scaffold";
"Phase 6: Scaffold" -> "Suite Complete";
}
The architecture must fit the project's actual constraints. This phase gathers those constraints — at a depth matching the engagement mode.
Before asking ANY questions, read in parallel:
Claude-Production-Grade-Suite/polymath/handoff/context-package.md — may contain scale, constraints, decisionsClaude-Production-Grade-Suite/product-manager/BRD/brd.md — user stories, acceptance criteria, business rulesClaude-Production-Grade-Suite/.orchestrator/codebase-context.md — brownfield contextReduce questions to cover ONLY gaps not addressed in existing context. If polymath or PM already established scale targets, do not re-ask.
Adapt depth to engagement mode. Use AskUserQuestion with structured options (never open-ended).
Skip interview entirely. Auto-derive from BRD signals:
✓ Express mode — auto-deriving architecture from BRDIf a critical constraint is completely missing (e.g., BRD mentions "enterprise customers" but no scale number), ask ONE clarifying question maximum.
Round 1 — Scale & Users:
AskUserQuestion(questions=[{
"question": "I need to understand your scale to design the right architecture.\n\n"
"These 3 questions determine whether you need a simple monolith or a distributed system.",
"header": "Scale & Users",
"options": [
{"label": "Small scale — < 1K users, MVP or internal tool", "description": "Simple architecture, minimal infra, fast to build"},
{"label": "Medium scale — 1K-100K users, startup/growth", "description": "Needs to scale but not from day 1. Service extraction plan."},
{"label": "Large scale — 100K+ users, high availability", "description": "Distributed architecture, multi-region, serious infrastructure"},
{"label": "Not sure — help me estimate", "description": "I'll ask a few questions to figure this out"},
{"label": "Chat about this", "description": "Free-form input"}
],
"multiSelect": false
}])
Follow up with:
AskUserQuestion(questions=[{
"question": "What's the primary data pattern?",
"header": "Data Characteristics",
"options": [
{"label": "Read-heavy — dashboards, content, catalogs", "description": "Cache-first, read replicas, CDN"},
{"label": "Write-heavy — logging, IoT, transactions", "description": "Queue-based, event sourcing, eventual consistency"},
{"label": "Balanced — typical CRUD SaaS", "description": "Standard request/response, relational DB"},
{"label": "Real-time — chat, collaboration, live updates", "description": "WebSocket/SSE, pub/sub, in-memory state"},
{"label": "Chat about this", "description": "Free-form input"}
],
"multiSelect": false
}])
Round 2 — Constraints:
AskUserQuestion(questions=[{
"question": "Who will build and maintain this system?",
"header": "Team & Budget",
"options": [
{"label": "Solo or pair — keep it simple", "description": "Monolith, managed services, minimal ops"},
{"label": "Small team (3-5) — some specialization", "description": "Can handle moderate complexity"},
{"label": "Medium team (6-15) — dedicated roles", "description": "Can support microservices if needed"},
{"label": "Large team (15+) — multiple squads", "description": "Service ownership model, independent deploys"},
{"label": "Chat about this", "description": "Free-form input"}
],
"multiSelect": false
}])
AskUserQuestion(questions=[{
"question": "Any hard constraints?",
"header": "Compliance & Deployment",
"options": [
{"label": "No special requirements", "description": "Standard web app, no regulatory burden"},
{"label": "GDPR — EU user data", "description": "Data residency, right to deletion, consent management"},
{"label": "SOC2 / ISO 27001 — enterprise customers", "description": "Audit trails, access controls, security policies"},
{"label": "HIPAA — health data", "description": "BAA required, encryption everywhere, dedicated tenancy"},
{"label": "PCI DSS — payment data", "description": "Tokenization, network segmentation, quarterly scans"},
{"label": "Multiple / Other (specify)", "description": "Select to describe your requirements"},
{"label": "Chat about this", "description": "Free-form input"}
],
"multiSelect": false
}])
Everything in Standard, PLUS two additional rounds:
Round 3 — Technical Requirements:
AskUserQuestion(questions=[{
"question": "Let's get precise about performance and availability requirements.",
"header": "Performance & Availability",
"options": [
{"label": "Standard SaaS — 99.9% uptime, < 500ms API response", "description": "8.7 hours downtime/year. Typical for most web apps."},
{"label": "High availability — 99.99% uptime, < 200ms response", "description": "52 minutes downtime/year. Requires multi-AZ, automated failover."},
{"label": "Mission critical — 99.999% uptime, < 100ms response", "description": "5 minutes downtime/year. Requires multi-region, chaos engineering."},
{"label": "Internal tool — best effort, availability not critical", "description": "Simplest architecture, no redundancy required."},
{"label": "Chat about this", "description": "Free-form input"}
],
"multiSelect": false
}])
AskUserQuestion(questions=[{
"question": "Where are your users?",
"header": "Geographic Distribution",
"options": [
{"label": "Single country", "description": "One region deployment, simplest"},
{"label": "Single continent", "description": "One region with CDN for static assets"},
{"label": "Global — users everywhere", "description": "Multi-region, edge CDN, data replication strategy"},
{"label": "Not sure yet", "description": "I'll design for single-region with a multi-region migration path"},
{"label": "Chat about this", "description": "Free-form input"}
],
"multiSelect": false
}])
AskUserQuestion(questions=[{
"question": "Expected peak concurrent users (CCU)?",
"header": "Peak Load",
"options": [
{"label": "< 100 CCU", "description": "Single instance can handle this"},
{"label": "100-1K CCU", "description": "Horizontal scaling, load balancer needed"},
{"label": "1K-10K CCU", "description": "Auto-scaling, connection pooling, caching layer"},
{"label": "10K+ CCU", "description": "Distributed architecture, queue-buffered writes, edge computing"},
{"label": "Help me estimate", "description": "Typically 5-10% of total users are concurrent at peak"},
{"label": "Chat about this", "description": "Free-form input"}
],
"multiSelect": false
}])
Round 4 — Strategic:
AskUserQuestion(questions=[{
"question": "How do you see this system evolving?",
"header": "Growth & Extensibility",
"options": [
{"label": "Steady linear growth", "description": "Predictable scaling, plan for 10x over 2 years"},
{"label": "Hockey stick — potential viral growth", "description": "Must handle 100x spikes, auto-scaling critical"},
{"label": "Seasonal — predictable traffic spikes", "description": "Scale-to-zero between peaks, burst capacity"},
{"label": "Platform play — third parties will build on this", "description": "Public API, webhooks, rate limiting, developer portal"},
{"label": "Chat about this", "description": "Free-form input"}
],
"multiSelect": false
}])
AskUserQuestion(questions=[{
"question": "Monthly infrastructure budget ceiling?",
"header": "Budget",
"options": [
{"label": "Minimal — under $500/mo", "description": "Serverless, managed DBs, free tiers. Optimize for cost."},
{"label": "Moderate — $500 to $5K/mo", "description": "Managed K8s, dedicated DBs, standard monitoring."},
{"label": "Significant — $5K+/mo", "description": "Dedicated infra, custom observability, multi-region."},
{"label": "Not a constraint", "description": "Optimize for performance and reliability, not cost."},
{"label": "Chat about this", "description": "Free-form input"}
],
"multiSelect": false
}])
AskUserQuestion(questions=[{
"question": "Cloud strategy?",
"header": "Vendor & Portability",
"options": [
{"label": "All-in on AWS (cheapest, most managed services)", "description": "Use AWS-native services. Fast to build, harder to migrate."},
{"label": "All-in on GCP (best for data/ML workloads)", "description": "Use GCP-native services. Strong managed K8s."},
{"label": "All-in on Azure (best for enterprise/Microsoft shops)", "description": "Use Azure-native services. AD integration."},
{"label": "Cloud-agnostic (most portable, higher upfront cost)", "description": "Terraform abstractions, avoid proprietary services."},
{"label": "Not sure — recommend based on my project", "description": "I'll recommend based on your requirements"},
{"label": "Chat about this", "description": "Free-form input"}
],
"multiSelect": false
}])
Everything in Thorough, PLUS:
After gathering inputs, DERIVE the architecture from constraints. The architecture is a FUNCTION of the inputs — not a template.
Architecture Pattern:
| Scale | Team | -> Pattern |
|---|---|---|
| < 1K users | 1-3 people | Monolith or Modular Monolith. Single deploy, single DB. Docker Compose for local dev. |
| 1K-100K users | 3-15 people | Modular Monolith with documented service boundaries. Extract services ONLY when team or scale demands. Include service extraction plan in ADR. |
| 100K+ users | 15+ people | Microservices. Service mesh, distributed data, event-driven communication. Each team owns 1-3 services. |
| Any scale | Solo developer | Whatever is simplest. Serverless or monolith. Managed everything. Minimize operational burden. |
Infrastructure Sizing:
| Budget | -> Infrastructure Strategy |
|---|---|
| < $500/mo | Serverless-first (Lambda/Cloud Run), managed DB (RDS free tier/PlanetScale), no K8s, CloudWatch/basic monitoring |
| $500-5K/mo | Managed K8s (EKS/GKE) or ECS, managed DB with replicas, Redis cache, standard monitoring (Grafana/Datadog) |
| > $5K/mo | Dedicated infrastructure, self-hosted options viable, custom observability stack, multi-region possible |
Data Architecture:
| Data Pattern | -> Strategy |
|---|---|
| Read-heavy (>80% reads) | Cache-first (Redis), read replicas, CDN for static, materialized views |
| Write-heavy | Event sourcing or CQRS, queue-buffered writes (SQS/Kafka), eventual consistency |
| Real-time | WebSocket/SSE infrastructure, pub/sub (Redis Pub/Sub or Kafka), in-memory state |
| Balanced CRUD | Standard relational DB, connection pooling, query optimization |
Compliance Impact:
| Requirement | -> Architecture Changes |
|---|---|
| GDPR | Data residency controls, right-to-deletion pipeline, consent management, PII encryption |
| SOC2 / ISO 27001 | Audit trail on all mutations, RBAC, centralized logging, access review automation |
| HIPAA | Dedicated tenancy, encryption at rest + transit, BAA with all vendors, audit logging, no shared infrastructure |
| PCI DSS | Tokenize card data (use Stripe/Adyen), network segmentation, quarterly vulnerability scans, no raw card storage |
Availability Impact:
| SLA | -> Architecture Changes |
|---|---|
| 99% (3.7 days/yr) | Single instance OK, basic health checks |
| 99.9% (8.7 hrs/yr) | Multi-AZ, load balancer, automated restarts, basic monitoring |
| 99.99% (52 min/yr) | Multi-AZ with automated failover, zero-downtime deploys, chaos engineering, comprehensive monitoring |
| 99.999% (5 min/yr) | Multi-region active-active, global load balancing, circuit breakers everywhere, dedicated SRE |
Growth Model Impact:
| Growth | -> Architecture Changes |
|---|---|
| Linear/steady | Plan for 10x. Vertical scaling first, horizontal when needed. |
| Hockey stick | Horizontal scaling from day 1. Stateless services. Auto-scaling groups. Queue-buffered writes. Feature flags for load shedding. |
| Seasonal | Scale-to-zero capable (serverless/spot instances). Pre-warming automation. Burst capacity planning. |
| Platform/API | API gateway, rate limiting, webhook system, developer portal, backwards-compatible versioning from day 1. |
Present the derived architecture: "Based on your constraints [summary], here's what fits and why..."
For Thorough/Meticulous modes, also present 1-2 alternative architectures:
Each alternative includes a trade-off summary: build time, operational complexity, monthly cost estimate, scaling ceiling, team fit.
Generate architecture documents in docs/architecture/ (or paths.architecture_docs from config):
One ADR per major decision using this template:
# ADR-NNN: [Title]
**Status:** Accepted | Superseded | Deprecated
**Context:** Why this decision is needed
**Decision:** What we chose and why
**Consequences:** Trade-offs accepted
**Alternatives Considered:** What we rejected and why
Required ADRs:
Create Mermaid diagrams in markdown files:
Apply and document these production patterns:
Present architecture to user via AskUserQuestion for approval before proceeding.
Generate docs/architecture/tech-stack.md (or paths.tech_stack from config):
| Layer | Selection | Rationale |
|---|---|---|
| Language(s) | Based on team/requirements | Performance, ecosystem, hiring |
| Framework | Based on language choice | Maturity, community, features |
| Database(s) | Based on data patterns | ACID vs BASE, query patterns |
| Cache | Redis/Memcached | Access patterns, consistency needs |
| Message Broker | Kafka/RabbitMQ/SQS/Pub-Sub | Throughput, ordering, durability |
| API Gateway | Kong/AWS API GW/GCP API GW | Rate limiting, auth, routing |
| Auth | Keycloak/Auth0/Cognito/Firebase Auth | SSO, MFA, compliance |
| Search | Elasticsearch/OpenSearch | Full-text, analytics, scale |
| Object Storage | S3/GCS/Azure Blob | Cost, lifecycle, CDN integration |
| CDN | CloudFront/Cloud CDN/Azure CDN | Edge locations, cost |
Selection criteria: production maturity, multi-cloud portability, team expertise, cost at scale.
Generate API contracts at api/ (or paths.api_contracts from config) at the project root:
Standards enforced:
{code, message, details, trace_id})cursor-based for production, offset only for admin)X-RateLimit-*)X-Request-ID)Generate data models at schemas/ at the project root:
paths.erd from config, default schemas/erd.md)paths.migrations from config, default schemas/migrations/)Standards enforced:
deleted_at timestampsScaffold the project root structure directly. The scaffold IS the project root — there is no separate scaffold directory.
project root/
├── services/
│ └── <service-name>/
│ ├── src/
│ ├── tests/
│ ├── Dockerfile
│ ├── Makefile
│ └── README.md
├── libs/
│ └── shared/ # Shared types, utils, clients
├── docker-compose.yml # Local dev environment
├── Makefile # Root-level commands
└── README.md # Getting started guide
Each service includes:
/healthz, /readyz)docs/architecture/
│ ├── architecture-decision-records/
│ │ ├── ADR-001-architecture-pattern.md
│ │ └── ...
│ ├── system-diagrams/
│ │ ├── c4-context.md
│ │ ├── c4-container.md
│ │ └── sequence-*.md
│ ├── tech-stack.md
│ └── design-principles.md
api/
│ ├── openapi/
│ │ └── *.yaml
│ ├── grpc/
│ │ └── *.proto
│ └── asyncapi/
│ └── *.yaml
schemas/
│ ├── erd.md
│ ├── migrations/
│ │ └── *.sql
│ └── data-flow.md
services/ # Scaffolded service directories
│ └── <service-name>/
│ ├── src/
│ ├── tests/
│ ├── Dockerfile
│ └── Makefile
libs/shared/
docker-compose.yml
Makefile
README.md
Claude-Production-Grade-Suite/solution-architect/)Claude-Production-Grade-Suite/solution-architect/
├── working-notes.md
└── analysis/
└── *.md
| Mistake | Fix |
|---|---|
| Picking architecture before knowing constraints | Run the fitness function FIRST. Scale, team, budget determine the pattern. |
| Microservices for a 2-person team | Start modular monolith, extract services when team/scale demands |
| Kubernetes for < 1K users | Docker Compose or serverless. K8s operational cost > benefit at small scale. |
| Same architecture for $200/mo and $20K/mo | Budget changes everything — serverless vs dedicated, managed vs self-hosted |
| Shared database across services | Each service owns its data, communicate via APIs/events |
| No API versioning strategy | Decide v1 URL path versioning from day one |
| Skipping ADRs | Future-you needs to know WHY, not just WHAT |
| Over-engineering auth | Use managed auth (Auth0/Cognito) unless compliance requires self-hosted |
| Ignoring multi-tenancy from start | Retrofitting tenant isolation is 10x harder than designing it in |
| Skipping scale interview | "Build a SaaS" means nothing without scale context. 100 users vs 10M users is a completely different system. |
| Ignoring engagement mode | Express: auto-derive. Standard: 2 rounds. Thorough: 4 rounds. Meticulous: full walkthrough. Read settings.md. |
| Designing for 10M users when there are 100 | Design for current + 10x. Not 1000x. Over-engineering kills velocity. |
| Not presenting alternatives in Thorough/Meticulous | Users at those engagement levels want to understand trade-offs, not just see one answer. |
npx claudepluginhub nagisanzenin/claude-code-production-grade-pluginDesigns system architecture and high-level technical strategy. Use for new systems or subsystems, major refactors, technology selections, system boundaries, and long-term decisions with broad impact.
Transforms product requirements into executable technical architecture with tech stack selection, system design, deployment strategies, and architecture reviews.
Transforms product requirements into executable technical architecture via phased workflow for system design, tech stack selection, deployment strategy, and review.