Skill

nw-production-readiness

Monitoring, observability, operational procedures, CI/CD lessons learned, and quality gate definitions. Load when assessing production readiness or validating operational excellence.

From nw
Install
1
Run in your terminal
$
npx claudepluginhub nwave-ai/nwave --plugin nw
Tool Access

This skill uses the workspace's default tool permissions.

Skill Content

Production Readiness

Monitoring and Observability

Application Monitoring

  • Performance: response time | throughput | latency percentiles (P50, P95, P99)
  • Resources: CPU | memory | database connections | cache hit rates
  • Errors: exception tracking | error rate trends | integration failure detection
  • Business: KPI tracking | conversion funnels | feature usage | revenue impact

Infrastructure Monitoring

Server/container health and resource utilization | Network performance and connectivity | Storage capacity and I/O performance | Security event detection.

Alerting Tiers

TierConditionResponse
PageService down, data loss risk, security breachImmediate response
UrgentError rate >2x baseline, latency SLA breachResponse within 15 min
WarningCapacity >80%, error rate trending upResponse within 1 hour
InfoDeployment complete, metric threshold crossedReview next business day

Operational Procedures

Incident Response

  1. Detect: automated alerting identifies issue
  2. Triage: classify severity, assign responder
  3. Communicate: notify stakeholders per severity level
  4. Resolve: apply fix or rollback
  5. Review: post-incident review within 48 hours
  6. Improve: update runbooks and monitoring based on findings

Maintenance Procedures

Regular update and patching schedule | Backup verification (test restores quarterly) | Security vulnerability scanning (automated, weekly) | Performance baseline recalibration (after major changes).

Knowledge Transfer

Operational runbooks for common procedures | Architecture documentation with system diagrams | Deployment procedures and configuration management | Troubleshooting guides for known failure modes.

Quality Gates for Production Readiness

Before declaring production-ready, all must pass:

  • All acceptance tests passing
  • Unit coverage meets project standard (default: >= 80%)
  • Integration tests validated
  • Performance validated under realistic load
  • Security scan completed (0 critical, 0 high)
  • Monitoring and alerting configured
  • Logging structured and searchable
  • Rollback procedure documented and tested
  • Runbook created for operational procedures
  • On-call team trained on new feature

For CI/CD architecture lessons and measurement coupling pitfalls, see cicd-and-deployment skill.

Stats
Parent Repo Stars299
Parent Repo Forks37
Last CommitMar 20, 2026