Six-pillar well-architected framework for evaluating architecture quality — operational excellence, security, reliability, performance, cost optimization, and developer experience
Evaluates software architecture across six pillars and generates prioritized improvement roadmaps.
Install: `npx claudepluginhub navraj007in/architecture-cowork-plugin`
This skill inherits all available tools. When active, it can use any tool Claude has access to.
A structured framework for evaluating architecture quality across six pillars. Inspired by the AWS Well-Architected Framework, adapted for startups and modern product development. Technology-agnostic — applies to any stack.
Use this skill when generating the Well-Architected Review deliverable in /architect:blueprint or when running /architect:well-architected as a standalone evaluation.
## Pillar 1: Operational Excellence
Question: Can you deploy, monitor, and improve the system with confidence?
| Criteria | What Good Looks Like | Score Guide |
|---|---|---|
| CI/CD pipeline | Automated lint → test → build → deploy on every push | 1 = manual deploys, 5 = full CI/CD with rollback |
| Infrastructure as Code | Deployment config in repo (Vercel config, Dockerfile, Terraform) | 1 = manual setup, 5 = fully reproducible |
| Observability | Structured logs + error tracking + health checks | 1 = console.log only, 5 = full observability stack |
| Incident response | Alerts → runbook → mitigation → post-mortem process | 1 = no plan, 5 = documented runbooks and on-call |
| Change management | Feature flags, staged rollouts, database migrations versioned | 1 = YOLO deploys, 5 = staged rollouts with flags |
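For illustration, a minimal sketch of the observability criterion, assuming the pino logging library (the `withRequestLogger` helper is ours, not part of any existing codebase):

```typescript
import pino from "pino";
import { randomUUID } from "crypto";

// Base logger emits structured JSON lines instead of free-form console.log output.
const logger = pino({ level: process.env.LOG_LEVEL ?? "info" });

// Attach a correlation ID to every request so one user action can be traced
// across log lines (and across services, if the ID is forwarded downstream).
export function withRequestLogger(req: { headers: Record<string, string | undefined> }) {
  const correlationId = req.headers["x-correlation-id"] ?? randomUUID();
  return logger.child({ correlationId });
}

// Usage inside a handler (illustrative):
// const log = withRequestLogger(req);
// log.info({ route: "/api/orders" }, "order created");
```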
Key questions to ask:
## Pillar 2: Security
Question: Is user data protected, and are attack surfaces minimized?
| Criteria | What Good Looks Like | Score Guide |
|---|---|---|
| Authentication | Managed auth provider, MFA for admin, token rotation | 1 = DIY password hashing, 5 = managed auth + MFA |
| Authorization | Role-based or attribute-based access control on every endpoint | 1 = no authorization checks, 5 = RBAC/ABAC middleware |
| Data protection | Encryption at rest and in transit, PII identified and handled | 1 = plaintext everywhere, 5 = encrypted + PII policies |
| API security | Rate limiting, input validation, CORS, security headers | 1 = none, 5 = all OWASP top 10 mitigated |
| Secrets management | No hardcoded secrets, env vars or secrets manager, rotation policy | 1 = secrets in code, 5 = secrets manager + rotation |
| Dependency security | Automated vulnerability scanning in CI | 1 = never audited, 5 = automated audit + auto-fix |
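A minimal sketch of the input-validation slice of API security, assuming the zod library (schema and helper names are illustrative):

```typescript
import { z } from "zod";

// The schema rejects unexpected fields and enforces types before any
// business logic runs; one concrete piece of the "input validation" criterion.
const CreateUserBody = z
  .object({
    email: z.string().email(),
    displayName: z.string().min(1).max(80),
  })
  .strict();

export function parseCreateUser(body: unknown) {
  const result = CreateUserBody.safeParse(body);
  if (!result.success) {
    // Return a 400-style error payload instead of leaking raw validation output.
    return { ok: false as const, errors: result.error.flatten() };
  }
  return { ok: true as const, data: result.data };
}
```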
Key questions to ask:
## Pillar 3: Reliability
Question: Does the system keep working when things go wrong?
| Criteria | What Good Looks Like | Score Guide |
|---|---|---|
| Fault isolation | One service failing doesn't cascade to others | 1 = monolith with no error boundaries, 5 = circuit breakers + fallbacks |
| Recovery | Automated restarts, health checks, self-healing | 1 = manual restart, 5 = auto-restart + health checks + failover |
| Data durability | Automated backups, point-in-time recovery, tested restores | 1 = no backups, 5 = automated backups + tested restores |
| Retry & timeout | Retries with exponential backoff, timeouts on all external calls | 1 = no retries/timeouts, 5 = retry policies on all external calls |
| Graceful degradation | System works (reduced functionality) when a dependency is down | 1 = hard crash, 5 = graceful fallback for each dependency |
| Scaling | Handles 10x current load without architecture changes | 1 = breaks at 2x, 5 = auto-scaling with no code changes |
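A dependency-free sketch of the retry-and-timeout criterion (attempt counts and delays are placeholders to tune against your SLOs):

```typescript
// Retry an async call with exponential backoff and a per-attempt timeout.
async function withRetry<T>(
  fn: () => Promise<T>,
  { attempts = 3, baseDelayMs = 200, timeoutMs = 5_000 } = {}
): Promise<T> {
  let lastError: unknown;
  for (let attempt = 0; attempt < attempts; attempt++) {
    try {
      // Fail the attempt if the dependency hangs past timeoutMs.
      return await Promise.race([
        fn(),
        new Promise<never>((_, reject) =>
          setTimeout(() => reject(new Error("timeout")), timeoutMs)
        ),
      ]);
    } catch (err) {
      lastError = err;
      // Exponential backoff with jitter: roughly 200ms, 400ms, 800ms, ...
      const delay = baseDelayMs * 2 ** attempt * (0.5 + Math.random() / 2);
      await new Promise((resolve) => setTimeout(resolve, delay));
    }
  }
  throw lastError;
}
```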
Key questions to ask:
## Pillar 4: Performance Efficiency
Question: Are resources used efficiently, and is the user experience fast?
| Criteria | What Good Looks Like | Score Guide |
|---|---|---|
| Response time | API p95 < 500ms, page load < 2s, AI response < 5s | 1 = > 3s average, 5 = p95 < 200ms |
| Right-sizing | Resources match actual load (not over/under-provisioned) | 1 = fixed large instances, 5 = auto-scaled to demand |
| Caching | Appropriate caching at each layer (CDN, API, database) | 1 = no caching, 5 = multi-layer caching strategy |
| Async processing | Heavy work offloaded to background jobs / queues | 1 = everything synchronous, 5 = async where appropriate |
| Database efficiency | Indexed queries, connection pooling, no N+1 problems | 1 = unoptimized queries, 5 = optimized + monitored |
| Frontend performance | Code splitting, lazy loading, optimized images, CDN | 1 = single bundle, no CDN, 5 = optimized + CDN + edge |
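A sketch of the caching criterion at its simplest, an in-process TTL cache (a real deployment would usually add a CDN or Redis layer on top of this pattern):

```typescript
// Minimal in-process TTL cache for hot read paths.
class TtlCache<V> {
  private store = new Map<string, { value: V; expiresAt: number }>();

  constructor(private ttlMs: number) {}

  get(key: string): V | undefined {
    const entry = this.store.get(key);
    if (!entry) return undefined;
    if (Date.now() > entry.expiresAt) {
      this.store.delete(key); // lazy eviction on read
      return undefined;
    }
    return entry.value;
  }

  set(key: string, value: V): void {
    this.store.set(key, { value, expiresAt: Date.now() + this.ttlMs });
  }
}

// Usage: cache an expensive query result for 30 seconds.
// const cache = new TtlCache<Product[]>(30_000);
```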
Key questions to ask:
## Pillar 5: Cost Optimization
Question: Are you spending the minimum needed for the current scale?
| Criteria | What Good Looks Like | Score Guide |
|---|---|---|
| Free tier usage | Maximizing free tiers before paying | 1 = paying for everything, 5 = optimized free tier usage |
| Right-sizing | Resources match actual usage, not theoretical maximum | 1 = over-provisioned, 5 = auto-scaled or right-sized |
| Cost awareness | Team knows monthly cost breakdown, alerts on overspend | 1 = no idea of costs, 5 = cost dashboards + budget alerts |
| Service selection | Chosen services fit the scale (not enterprise tools for MVP) | 1 = enterprise tools for 10 users, 5 = appropriate for scale |
| LLM cost control | Token optimization, caching, model selection, rate limits | 1 = GPT-4 for everything, 5 = tiered models + prompt caching |
| Scaling economics | Costs scale sub-linearly with users | 1 = linear cost scaling, 5 = strong economies of scale |
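A sketch of the LLM cost-control criterion, routing each request to the cheapest adequate model (the model names and thresholds below are placeholders, not real provider identifiers):

```typescript
// Route requests to the cheapest model that can handle them; only escalate
// to the expensive model when the task demands it.
type ModelTier = "cheap-fast-model" | "expensive-capable-model"; // placeholders

function pickModel(task: { promptTokens: number; needsReasoning: boolean }): ModelTier {
  // Simple heuristics; production routers often also weigh user tier,
  // past failure rates, and latency budgets.
  if (task.needsReasoning || task.promptTokens > 8_000) {
    return "expensive-capable-model";
  }
  return "cheap-fast-model";
}
```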
Key questions to ask:
## Pillar 6: Developer Experience
Question: Can developers build, test, and ship features quickly and confidently?
| Criteria | What Good Looks Like | Score Guide |
|---|---|---|
| Onboarding | New dev productive in < 1 day (README, setup scripts, seed data) | 1 = tribal knowledge, 5 = automated setup + docs |
| Local development | `docker compose up` or `npm run dev` and everything works | 1 = complex manual setup, 5 = one-command startup |
| Type safety | TypeScript/Python type hints throughout, shared types across services | 1 = untyped, 5 = strict types + shared type packages |
| Testing | Fast unit tests, meaningful integration tests, CI runs all tests | 1 = no tests, 5 = comprehensive test suite < 5 min |
| Code organization | Clear folder structure, consistent patterns, separation of concerns | 1 = spaghetti code, 5 = clear architecture + conventions |
| Documentation | API docs (OpenAPI), architecture docs, runbooks | 1 = no docs, 5 = auto-generated API docs + architecture docs |
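A sketch of the type-safety criterion, sharing one set of API types across services (the package name and file path are illustrative):

```typescript
// packages/shared/src/types.ts (path illustrative): one source of truth for
// API shapes, imported by both the server and the client.
export interface User {
  id: string;
  email: string;
  createdAt: string; // ISO 8601; dates stay strings once serialized over JSON
}

export interface ApiError {
  code: string;
  message: string;
}

// Server and client both import from the shared package, so a field rename
// breaks the build instead of breaking production:
// import type { User } from "@acme/shared"; // package name illustrative
```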
Key questions to ask:
## Scoring
Each pillar is scored 1-5 based on the average of its criteria:
| Score | Label | Meaning |
|---|---|---|
| 1 | Critical | Fundamental gaps. Address immediately before building further. |
| 2 | Needs Work | Significant gaps. Plan to address in next sprint/milestone. |
| 3 | Adequate | Meets minimum bar. Acceptable for MVP, improve for production. |
| 4 | Good | Solid implementation. Minor improvements possible. |
| 5 | Excellent | Best practices followed. Ready for production scale. |
Average of all 6 pillar scores:
| Overall | Rating | Interpretation |
|---|---|---|
| 1.0 - 2.0 | Fragile | Architecture has critical gaps. Not production-ready. |
| 2.1 - 3.0 | Developing | Functional but risky. Acceptable for early MVP with a plan to improve. |
| 3.1 - 3.5 | Solid | Good foundation. Typical for well-planned MVP or early startup. |
| 3.6 - 4.0 | Strong | Production-quality. Ready for real users and growth. |
| 4.1 - 5.0 | Exemplary | Mature architecture. Enterprise-ready. Rare for early-stage products. |
## Stage-Appropriate Targets
Not every project needs a 5/5 on every pillar. Set expectations by stage:
| Stage | Target Score | Acceptable Lows |
|---|---|---|
| Proof of concept | 2.0 overall | Security 2, Reliability 1, DevEx 2 |
| MVP | 2.5 - 3.0 | Reliability 2, Performance 2 |
| Early product (paying users) | 3.0 - 3.5 | Performance 3, Cost 2 |
| Growth stage | 3.5 - 4.0 | None below 3 |
| Production / enterprise | 4.0+ | None below 4 |
## Output Format
IMPORTANT: Be comprehensive and thorough in all sections.
Visualize pillar scores as a horizontal bar table:
Operational Excellence ████░ 4/5 — Good
Security ███░░ 3/5 — Adequate
Reliability ██░░░ 2/5 — Needs Work
Performance Efficiency ████░ 4/5 — Good
Cost Optimization █████ 5/5 — Excellent
Developer Experience ███░░ 3/5 — Adequate
Overall: 3.5/5 — Solid
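If the bars are generated programmatically, a small helper (the function name is ours) that produces the five-character bars shown above:

```typescript
// Render a 1-5 score as a five-character bar.
function scoreBar(score: number): string {
  const filled = Math.max(0, Math.min(5, Math.round(score)));
  return "█".repeat(filled) + "░".repeat(5 - filled);
}

// scoreBar(4) === "████░"; scoreBar(2) === "██░░░"
```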
Score table (also provide as markdown table):
| Pillar | Score | Label | Key Strength | Critical Gap |
|---|---|---|---|---|
| Operational Excellence | X/5 | [Label] | [1 sentence] | [1 sentence or "None"] |
| Security | X/5 | [Label] | [1 sentence] | [1 sentence or "None"] |
| Reliability | X/5 | [Label] | [1 sentence] | [1 sentence or "None"] |
| Performance Efficiency | X/5 | [Label] | [1 sentence] | [1 sentence or "None"] |
| Cost Optimization | X/5 | [Label] | [1 sentence] | [1 sentence or "None"] |
| Developer Experience | X/5 | [Label] | [1 sentence] | [1 sentence or "None"] |
| **Overall** | X.X/5 | [Rating] | | |
For each of the 6 pillars, provide comprehensive analysis:
Format:
## [Pillar Name]: X/5 — [Label]
### Score Breakdown
Evaluated against [number] criteria:
| Criteria | Score | Notes |
|----------|:-----:|-------|
| [Criterion 1] | X/5 | [1 sentence why this score] |
| [Criterion 2] | X/5 | [1 sentence why this score] |
| [Criterion 3] | X/5 | [1 sentence why this score] |
| ... | ... | ... |
**Average: X.X/5 → X/5 ([Label])**
### Strengths (2-4 bullets)
✅ **[Specific strength]**
- What: [Describe what's implemented]
- Why it matters: [Business/technical impact]
- Reference: [Cite specific architecture decision from manifest]
✅ **[Specific strength]**
- What: [Describe what's implemented]
- Why it matters: [Business/technical impact]
- Reference: [Cite specific architecture decision from manifest]
[Continue for all major strengths...]
### Gaps (2-4 bullets if score < 5)
⚠️ **[Specific gap]**
- What's missing: [Describe the missing capability]
- Risk if not addressed: [Specific risk with impact estimation]
- Severity: [Critical/High/Medium/Low]
⚠️ **[Specific gap]**
- What's missing: [Describe the missing capability]
- Risk if not addressed: [Specific risk with impact estimation]
- Severity: [Critical/High/Medium/Low]
[Continue for all significant gaps...]
### Recommendations (3-5 specific actions)
Each recommendation must be:
- **Actionable**: Specific enough to implement immediately
- **Measured**: Include effort estimate and impact level
- **Prioritized**: Show order of implementation
**Format for each recommendation:**
**Recommendation #X: [Short action-oriented title]** — [Impact: High/Medium/Low] — [Effort: X hours/days/weeks]
What to do:
[2-3 sentences describing specific implementation steps]
Why it matters:
[1-2 sentences on business/technical impact]
Implementation notes:
- Tool/service to use: [Specific recommendation]
- Code location: [Where to implement, if applicable]
- Dependencies: [What must be done first]
- Definition of done: [How to verify it's complete]
Cost impact: $[amount]/month or [one-time cost]
Timeline: [When to implement — Pre-launch / Month 1 / Quarter 1 / Future]
**Example full recommendation:**
**Recommendation #1: Implement rate limiting on all API endpoints** — Impact: High — Effort: 4-6 hours
What to do:
Add rate limiting middleware using the @upstash/ratelimit package backed by Vercel KV. Set a default limit of 100 requests per minute per IP, with stricter limits (10 req/min) on authentication endpoints and looser limits (1000 req/min) for authenticated users. Return a 429 status with a Retry-After header when the limit is exceeded.
Why it matters:
Without rate limiting, a single malicious user or misconfigured client can overwhelm the API, causing downtime for all users. This is especially critical for authentication endpoints which are common DDoS targets. Rate limiting is a pre-launch requirement for production deployment.
Implementation notes:
- Tool/service to use: @upstash/ratelimit with Vercel KV (free tier: 10K requests/day)
- Code location: src/middleware/rateLimit.ts, apply in src/app/api/*/route.ts
- Dependencies: Set up Vercel KV store (5 minute setup)
- Definition of done: Rate limiting active on all endpoints, returns 429 when exceeded, logged in monitoring
Cost impact: $0/month (Vercel KV free tier sufficient for 10K users)
Timeline: Pre-launch (P0 — blocks production deployment)
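A sketch of what that middleware might look like, assuming the @upstash/ratelimit and @vercel/kv packages (verify current APIs and limits before relying on this):

```typescript
// src/middleware/rateLimit.ts: sketch of the recommendation above.
import { Ratelimit } from "@upstash/ratelimit";
import { kv } from "@vercel/kv";

const limiter = new Ratelimit({
  redis: kv,
  limiter: Ratelimit.slidingWindow(100, "1 m"), // default: 100 req/min per identifier
});

export async function enforceRateLimit(ip: string): Promise<Response | null> {
  const { success, reset } = await limiter.limit(ip);
  if (!success) {
    // 429 with Retry-After, as the recommendation specifies.
    const retryAfterSeconds = Math.ceil((reset - Date.now()) / 1000);
    return new Response("Too Many Requests", {
      status: 429,
      headers: { "Retry-After": String(retryAfterSeconds) },
    });
  }
  return null; // within limits, let the route handler proceed
}
```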
### Stage-Appropriate Assessment
This architecture is at: **[Stage name]** stage
Expected score range for this stage: **X.X - Y.Y**
Actual score: **Z.Z**
Assessment: **[Above/At/Below]** expectations for this stage
[If below expectations:]
Critical gaps for this stage:
- [Gap 1 that's unacceptable for current stage]
- [Gap 2 that's unacceptable for current stage]
Must address before [milestone/launch].
[If at/above expectations:]
Well-positioned for [next stage]. Consider improving [pillar names] before scaling to [user count/revenue level].
## Improvement Roadmap
Provide a comprehensive prioritized roadmap with P0-P3 priority levels:
| Priority | Pillar | Action | Effort | Impact | Stage | Cost |
|---|---|---|---|---|---|---|
| P0 | Security | Add rate limiting to all API endpoints | 4-6 hours | High | Pre-launch | $0 |
| P0 | Reliability | Implement health checks with dependency verification | 3 hours | High | Pre-launch | $0 |
| P0 | Security | Set up automated dependency scanning in CI | 2 hours | High | Pre-launch | $0 |
| P1 | Operational Excellence | Add structured logging with correlation IDs | 1 day | Medium | Month 1 | $26/mo (Sentry) |
| P1 | Reliability | Set up database backups with tested restore process | 4 hours | High | Month 1 | $0 (included) |
| P1 | Security | Implement secrets rotation for API keys | 2 days | Medium | Month 1 | $0 |
| P2 | Performance | Add Redis caching layer for hot queries | 2-3 days | Medium | Quarter 1 | $7/mo |
| P2 | Operational Excellence | Implement feature flags for gradual rollouts | 1 day | Low | Quarter 1 | $0 (self-hosted) |
| P2 | Developer Experience | Add OpenAPI schema generation for API docs | 1 day | Low | Quarter 1 | $0 |
| P3 | Performance | Implement database query optimization and indexing review | 3-5 days | Medium | Future | $0 |
| P3 | Reliability | Add chaos engineering / failure injection testing | 1 week | Low | Future | $0 |
| P3 | Developer Experience | Set up comprehensive E2E test suite | 1-2 weeks | Medium | Future | $0 |
Priority Definitions:
- P0: blocks production deployment; must complete pre-launch
- P1: critical for stability; complete in Month 1
- P2: improves operational efficiency; complete in Quarter 1
- P3: evaluate based on growth; future work (Month 4+)
Cost Summary: [total recurring $/month and one-time costs, broken down by priority level]
Timeline Visualization:
Pre-Launch (Week 0):
├─ P0 items (total: X hours)
└─ Must complete before production deployment
Month 1 (Weeks 1-4):
├─ P1 items (total: X hours)
└─ Critical for stability
Quarter 1 (Weeks 5-12):
├─ P2 items (total: X hours)
└─ Improves operational efficiency
Future (Month 4+):
├─ P3 items (total: X hours)
└─ Evaluate based on growth
## Quick Wins
Identify 3-5 high-impact, low-effort improvements that can be done in < 1 day each:
Format:
Quick Win #X: [Short title] — [Pillar Name]
What: [1-2 sentences on what to implement]
Effort: [X hours]
Impact: [High/Medium impact on pillar score]
How: [3-5 step implementation checklist]
Cost: $[amount or $0]
ROI: [Pillar score improvement: X/5 → Y/5, or specific metric improvement]
Example:
Quick Win #1: Add security headers to API responses — Security
What: Configure Next.js security headers (CSP, X-Frame-Options, HSTS, etc.) in next.config.js to protect against common web vulnerabilities.
Effort: 30 minutes
Impact: High (improves Security pillar from 3/5 to 3.5/5)
How:
1. Add headers configuration to next.config.js
2. Test with securityheaders.com
3. Verify CSP doesn't break any functionality
4. Deploy to staging and production
Cost: $0
ROI: Security pillar: 3/5 → 3.5/5, protects against XSS and clickjacking with minimal effort
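A sketch of the corresponding next.config.js headers block (the header values are a conservative starting point, not a definitive policy):

```typescript
// next.config.js: minimal security-header configuration. Tighten the CSP
// to your actual asset origins before shipping.
const securityHeaders = [
  { key: "X-Frame-Options", value: "DENY" },
  { key: "X-Content-Type-Options", value: "nosniff" },
  { key: "Strict-Transport-Security", value: "max-age=63072000; includeSubDomains; preload" },
  { key: "Referrer-Policy", value: "strict-origin-when-cross-origin" },
  // CSP is the header most likely to break functionality; test before deploying.
  { key: "Content-Security-Policy", value: "default-src 'self'" },
];

module.exports = {
  async headers() {
    return [{ source: "/(.*)", headers: securityHeaders }];
  },
};
```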
## Critical Blockers
If any pillar scores 1/5 or has critical gaps, call them out explicitly:
🚨 CRITICAL BLOCKER: [Issue name]
Pillar: [Name]
Current state: [What's broken or missing]
Risk: [What bad thing will happen]
Probability: [High/Medium/Low chance of occurrence]
Impact: [Severity if it occurs — data loss, security breach, downtime, etc.]
This blocks: [Production launch / Scaling / User trust / Compliance]
Required action: [Specific fix needed]
Effort: [Realistic time estimate]
Must complete by: [Deadline or stage gate]
Owner: [Who should do this — role/skill level]
Example:
🚨 CRITICAL BLOCKER: No database backups configured
Pillar: Reliability
Current state: Supabase project has default backups (daily, 7-day retention) but no tested restore process. No way to recover from accidental data deletion or corruption.
Risk: Single developer mistake (DROP TABLE, bad migration) or Supabase issue could cause permanent data loss for all users.
Probability: Medium (10-15% chance in first year based on industry data)
Impact: Catastrophic — lose all user data, company trust, potential legal liability
This blocks: Production launch with real users
Required action:
1. Enable Supabase point-in-time recovery (PITR) — provides 7-day recovery window
2. Set up daily automated backup export to S3 (in addition to Supabase backups)
3. Document and TEST restore procedure (actually restore a backup to verify it works)
4. Add backup monitoring (alert if backup fails)
Effort: 4-6 hours (2 hours setup + 2 hours testing + 1 hour documentation)
Must complete by: Before production launch (P0)
Owner: Backend developer or DevOps engineer
## Score Calculation
Show how the overall score was calculated:
Well-Architected Score Calculation:
Operational Excellence: X/5
Security: X/5
Reliability: X/5
Performance Efficiency: X/5
Cost Optimization: X/5
Developer Experience: X/5
─────
Total: XX/30
Average: XX ÷ 6 = X.XX → X.X/5
Overall Rating: X.X/5 — [Rating Label]
Interpretation based on stage:
Stage: [MVP / Early Product / Growth / Enterprise]
Expected score range: [X.X - Y.Y]
Actual score: [Z.Z]
Assessment: [Above/At/Below] expectations
[If below]: Priority focus areas: [list pillars scoring below stage expectations]
[If at/above]: Continue improving [lowest scoring pillars] while scaling
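If the rating is computed programmatically, a sketch that mirrors the calculation above (function and variable names are ours):

```typescript
// Overall rating from the six pillar scores: average, rounded to one
// decimal, then mapped to the rating bands defined earlier.
const ratings: Array<[min: number, label: string]> = [
  [4.1, "Exemplary"],
  [3.6, "Strong"],
  [3.1, "Solid"],
  [2.1, "Developing"],
  [1.0, "Fragile"],
];

function overallRating(pillarScores: number[]): { score: number; label: string } {
  const avg = pillarScores.reduce((sum, s) => sum + s, 0) / pillarScores.length;
  const score = Math.round(avg * 10) / 10; // one decimal, e.g. 3.5
  const label = ratings.find(([min]) => score >= min)?.[1] ?? "Fragile";
  return { score, label };
}

// Matches the example bar chart above:
// overallRating([4, 3, 2, 4, 5, 3]) → { score: 3.5, label: "Solid" }
```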
## Standalone Evaluation Mode
When used with /architect:well-architected on an existing architecture (not a new blueprint):