Skill

deployment-patterns

This skill should be used when implementing deployment strategies, configuring autoscaling, setting up probes, managing disruption budgets, or designing zero-downtime deployments.

npx claudepluginhub jsamuelsen11/claude-config --plugin ccfg-kubernetes

Tool Access

This skill uses the workspace's default tool permissions.

Preview

This skill provides comprehensive patterns for implementing production-grade deployments with

SKILL.md

Similar Skills

kubernetes-deployment-patterns

Provides Kubernetes deployment strategies (rolling, blue-green, canary), workload types (Deployment, StatefulSet), config management, resource limits, and autoscaling (HPA, VPA) for production apps.

5 files

nickcrew-claude-ctx-plugin

deploy-to-kubernetes

Deploys containerized apps to Kubernetes clusters via kubectl manifests for Deployments, Services, ConfigMaps, Secrets, Ingress. Adds health checks, resource limits, rolling updates, Helm charts for EKS, GKE, AKS, Docker Compose migrations.

1 file1 tool

agent-almanac

creating-kubernetes-deployments

1.9k

Generates production-ready Kubernetes manifests for Deployments, Services, Ingress, HPA, ConfigMaps, Secrets with health checks, resource limits, auto-scaling, TLS.

20 files6 tools

kubernetes-deployment-creator

Stats

Parent Repo Stars0

Parent Repo Forks0

Last CommitFeb 10, 2026

Actions

View Source View Plugin View on GitHub View README

Help us improve

Share bugs, ideas, or general feedback.

Kubernetes Deployment Patterns

This skill provides comprehensive patterns for implementing production-grade deployments with zero-downtime updates, intelligent autoscaling, health monitoring, and resilience against failures. Apply these patterns when designing workload deployments, planning rollout strategies, or troubleshooting deployment issues.

Existing Repository Compatibility

Before applying deployment patterns, analyze existing deployment strategies:

Rollout Configuration - Review existing deployment strategies (RollingUpdate vs Recreate, maxSurge, maxUnavailable)
Probe Configuration - Identify established probe patterns (HTTP endpoints, timing values, failure thresholds)
Autoscaling Setup - Check existing HPA/VPA configurations and scaling metrics
Disruption Budgets - Review PDB policies and availability requirements
Update History - Examine past rollout issues and lessons learned
Monitoring Integration - Understand how deployments integrate with observability stack
Graceful Shutdown - Check existing preStop hooks and termination grace periods

When refactoring deployments, preserve patterns that work well in your environment. Validate changes in staging before applying to production. Document the reasoning behind deviation from existing patterns.

Rollout Strategies

Choose the appropriate deployment strategy based on workload characteristics and downtime tolerance.

RollingUpdate Strategy

Default strategy for zero-downtime deployments.

# CORRECT: Zero-downtime rolling update
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web-api
  namespace: production
spec:
  replicas: 10
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 25% # Can have 12-13 pods during update (10 + 2.5)
      maxUnavailable: 0 # Never reduce below 10 healthy pods
  selector:
    matchLabels:
      app.kubernetes.io/name: web-api
  template:
    metadata:
      labels:
        app.kubernetes.io/name: web-api
        app.kubernetes.io/version: '2.1.0'
    spec:
      containers:
        - name: api
          image: web-api:2.1.0
          ports:
            - containerPort: 8080
          readinessProbe:
            httpGet:
              path: /health/ready
              port: 8080
            initialDelaySeconds: 10
            periodSeconds: 5
            failureThreshold: 3

# CORRECT: Conservative rolling update for critical service
apiVersion: apps/v1
kind: Deployment
metadata:
  name: payment-service
spec:
  replicas: 20
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 1 # Add only 1 pod at a time
      maxUnavailable: 0 # Keep all 20 pods available
  minReadySeconds: 30 # Wait 30s after ready before considering healthy
  progressDeadlineSeconds: 600 # Fail deployment after 10 minutes

# WRONG: Aggressive settings risk downtime
spec:
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 100%
      maxUnavailable: 50% # Half your pods can be down - risky!

RollingUpdate Parameter Guide

Parameter	Recommended	Use Case	Risk
maxSurge: 25%, maxUnavailable: 0	General use	Most services	Low
maxSurge: 1, maxUnavailable: 0	Critical services	Payment, auth	Lowest
maxSurge: 50%, maxUnavailable: 0	Fast rollout	Non-critical	Medium
maxSurge: 0, maxUnavailable: 25%	Resource-constrained	Limited capacity	Medium

Recreate Strategy

Terminate all pods before creating new ones. Use only when required.

# CORRECT: Recreate for exclusive resource access
apiVersion: apps/v1
kind: Deployment
metadata:
  name: database-migration
spec:
  replicas: 1
  strategy:
    type: Recreate # Ensures only one version runs at a time
  template:
    spec:
      containers:
        - name: migration
          image: db-migration:1.5.0

When to use Recreate

Database migrations requiring exclusive access
Stateful workloads without multi-version support
Development environments where downtime is acceptable
Applications that cannot run multiple versions simultaneously

Never use Recreate for

Production stateless services
High-availability requirements
Customer-facing applications

Rollout Management

# Monitor rollout progress
kubectl rollout status deployment/web-api -n production

# View rollout history
kubectl rollout history deployment/web-api -n production

# View specific revision
kubectl rollout history deployment/web-api -n production --revision=3

# Pause rollout (stop mid-deployment)
kubectl rollout pause deployment/web-api -n production

# Resume paused rollout
kubectl rollout resume deployment/web-api -n production

# Rollback to previous version
kubectl rollout undo deployment/web-api -n production

# Rollback to specific revision
kubectl rollout undo deployment/web-api -n production --to-revision=2

# Restart deployment (rolling restart)
kubectl rollout restart deployment/web-api -n production

Health Probes

Configure probes to detect failures and manage traffic appropriately.

Probe Types and Purposes

Probe Type	Purpose	Failure Action	Use When
Liveness	Detect deadlocks/crashes	Restart container	App can hang
Readiness	Detect startup/overload	Remove from service	Slow startup or traffic sensitive
Startup	Handle slow initialization	Delay other probes	Very slow startup (>1 min)

Liveness Probe

Detects containers that are running but unhealthy (deadlocked, crashed internal state).

# CORRECT: HTTP liveness probe
apiVersion: v1
kind: Pod
metadata:
  name: web-api
spec:
  containers:
    - name: api
      image: web-api:1.0.0
      ports:
        - containerPort: 8080
      livenessProbe:
        httpGet:
          path: /health/live
          port: 8080
          httpHeaders:
            - name: X-Health-Check
              value: liveness
        initialDelaySeconds: 60 # Wait for app to start
        periodSeconds: 10 # Check every 10 seconds
        timeoutSeconds: 5 # 5 second timeout
        failureThreshold: 3 # Restart after 3 consecutive failures
        successThreshold: 1 # Back to healthy after 1 success

# CORRECT: TCP liveness probe for non-HTTP services
apiVersion: v1
kind: Pod
metadata:
  name: redis
spec:
  containers:
    - name: redis
      image: redis:7
      ports:
        - containerPort: 6379
      livenessProbe:
        tcpSocket:
          port: 6379
        initialDelaySeconds: 30
        periodSeconds: 10
        timeoutSeconds: 5
        failureThreshold: 3

# CORRECT: Exec liveness probe with custom script
apiVersion: v1
kind: Pod
metadata:
  name: custom-app
spec:
  containers:
    - name: app
      image: custom-app:1.0.0
      livenessProbe:
        exec:
          command:
            - /bin/sh
            - -c
            - /app/healthcheck.sh --liveness
        initialDelaySeconds: 45
        periodSeconds: 15
        timeoutSeconds: 10
        failureThreshold: 3

# WRONG: Aggressive liveness probe causes restart loops
livenessProbe:
  httpGet:
    path: /health
    port: 8080
  initialDelaySeconds: 5 # Too short, app not ready
  periodSeconds: 5
  timeoutSeconds: 1 # Too short, false positives
  failureThreshold: 1 # Too aggressive, restarts on transient issues

Readiness Probe

Determines when container can receive traffic. Failed readiness removes pod from service endpoints.

# CORRECT: Readiness probe for gradual startup
apiVersion: v1
kind: Pod
metadata:
  name: web-api
spec:
  containers:
    - name: api
      image: web-api:1.0.0
      ports:
        - containerPort: 8080
      readinessProbe:
        httpGet:
          path: /health/ready
          port: 8080
        initialDelaySeconds: 10 # Start checking after 10s
        periodSeconds: 5 # Check frequently during startup
        timeoutSeconds: 3
        failureThreshold: 3
        successThreshold: 1

Readiness endpoint should check

Application initialization complete
Database connections established
Required dependencies available
Cache warmed up (if applicable)
Not overloaded (can accept more traffic)

Readiness endpoint should NOT check

External service availability (unless required for every request)
Downstream service health (unless critical dependency)

# CORRECT: Readiness during overload conditions
# Application code in /health/ready endpoint
# Returns 503 when queue depth > threshold
apiVersion: v1
kind: Pod
metadata:
  name: worker
spec:
  containers:
    - name: worker
      image: worker:1.0.0
      readinessProbe:
        httpGet:
          path: /health/ready # Returns 503 if queue > 1000
          port: 8080
        periodSeconds: 5
        failureThreshold: 2 # Fail quickly when overloaded
        successThreshold: 2 # Require stability before returning

Startup Probe

Delays liveness/readiness checks for slow-starting containers.

# CORRECT: Startup probe for slow initialization
apiVersion: v1
kind: Pod
metadata:
  name: java-app
spec:
  containers:
    - name: app
      image: java-app:1.0.0
      ports:
        - containerPort: 8080
      startupProbe:
        httpGet:
          path: /health/startup
          port: 8080
        initialDelaySeconds: 0
        periodSeconds: 10
        failureThreshold: 30 # 30 * 10s = 5 minutes max startup time
        successThreshold: 1
      livenessProbe:
        httpGet:
          path: /health/live
          port: 8080
        periodSeconds: 10
        failureThreshold: 3
      readinessProbe:
        httpGet:
          path: /health/ready
          port: 8080
        periodSeconds: 5
        failureThreshold: 3

Startup probe flow

Container starts
Startup probe runs (liveness/readiness disabled)
Once startup succeeds, liveness/readiness probes begin
If startup never succeeds, container restarts after failureThreshold

Probe Timing Guidelines

Application Type	initialDelaySeconds	periodSeconds	timeoutSeconds	failureThreshold
Fast startup (Node.js, Go)	10-15	10	5	3
Medium startup (Python, Ruby)	30-45	10	5	3
Slow startup (Java, .NET)	Use startup probe	10	5	3
Database	30-60	10	5	3

Common Probe Pitfalls

# WRONG: Liveness and readiness check different endpoints
livenessProbe:
  httpGet:
    path: /health/live # Checks app liveness
readinessProbe:
  httpGet:
    path: /metrics # Wrong - checks Prometheus endpoint

# WRONG: Readiness checks external dependencies
# GET /health/ready implementation
# if (!canConnectToPaymentGateway()) return 503
# This removes pod from rotation when external service is down
# Pod should stay ready and return errors to clients instead

# WRONG: Too aggressive timing causes flapping
readinessProbe:
  httpGet:
    path: /health/ready
  periodSeconds: 1 # Too frequent, wastes CPU
  failureThreshold: 1 # Removes from service on first blip
  successThreshold: 1 # Returns immediately, causes flapping

Autoscaling

Automatically adjust replica count based on resource utilization or custom metrics.

Horizontal Pod Autoscaler (HPA)

Scale number of pods based on metrics.

# CORRECT: HPA based on CPU and memory
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: web-api-hpa
  namespace: production
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: web-api
  minReplicas: 5
  maxReplicas: 50
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70 # Scale when average CPU > 70%
    - type: Resource
      resource:
        name: memory
        target:
          type: Utilization
          averageUtilization: 80 # Scale when average memory > 80%
  behavior:
    scaleDown:
      stabilizationWindowSeconds: 300 # Wait 5 minutes before scaling down
      policies:
        - type: Percent
          value: 50 # Scale down max 50% of pods
          periodSeconds: 60 # Every 60 seconds
        - type: Pods
          value: 2 # Or max 2 pods
          periodSeconds: 60 # Every 60 seconds
      selectPolicy: Min # Use most conservative policy
    scaleUp:
      stabilizationWindowSeconds: 0 # Scale up immediately
      policies:
        - type: Percent
          value: 100 # Scale up max 100% of pods (double)
          periodSeconds: 15 # Every 15 seconds
        - type: Pods
          value: 4 # Or max 4 pods
          periodSeconds: 15 # Every 15 seconds
      selectPolicy: Max # Use most aggressive policy

# CORRECT: HPA based on custom metrics (requests per second)
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: web-api-hpa-custom
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: web-api
  minReplicas: 3
  maxReplicas: 30
  metrics:
    - type: Pods
      pods:
        metric:
          name: http_requests_per_second
        target:
          type: AverageValue
          averageValue: '1000' # Scale when RPS > 1000 per pod

# CORRECT: HPA with external metrics (SQS queue depth)
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: worker-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: worker
  minReplicas: 2
  maxReplicas: 100
  metrics:
    - type: External
      external:
        metric:
          name: sqs_queue_depth
          selector:
            matchLabels:
              queue: processing
        target:
          type: AverageValue
          averageValue: '30' # 30 messages per pod

HPA Best Practices

Set Conservative Targets - 70-80% utilization gives headroom for spikes
Prevent Thrashing - Use stabilizationWindowSeconds for scale-down
Gradual Scaling - Limit scale-up/down rates with behavior policies
Resource Requests Required - HPA needs accurate resource requests to calculate utilization
Monitor HPA Decisions - Check HPA events and metrics

# View HPA status
kubectl get hpa -n production

# Describe HPA to see events and decisions
kubectl describe hpa web-api-hpa -n production

# View HPA metrics
kubectl get hpa web-api-hpa -n production -o yaml

# Test HPA by generating load
kubectl run -i --tty load-generator --rm --image=busybox --restart=Never -- /bin/sh -c "while sleep 0.01; do wget -q -O- http://web-api; done"

Vertical Pod Autoscaler (VPA)

Adjust resource requests/limits based on actual usage.

# CORRECT: VPA in recommend-only mode
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: web-api-vpa
  namespace: production
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: web-api
  updateMode: 'Off' # Recommend only, don't auto-apply
  resourcePolicy:
    containerPolicies:
      - containerName: api
        minAllowed:
          cpu: 100m
          memory: 128Mi
        maxAllowed:
          cpu: 4000m
          memory: 8Gi
        controlledResources:
          - cpu
          - memory

# CORRECT: VPA in auto mode for non-critical workload
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: background-worker-vpa
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: background-worker
  updateMode: 'Auto' # Automatically apply recommendations
  updatePolicy:
    updateMode: 'Auto'

VPA Update Modes

Mode	Behavior	Use Case
Off	Recommend only	Production (review before applying)
Initial	Set on pod creation only	New deployments
Recreate	Update by recreating pods	Non-critical workloads
Auto	Update by recreating pods	Development environments

Important VPA Limitations

Do NOT use HPA and VPA on the same metric (CPU/memory)
VPA requires pod restarts to apply changes
Recommendations take 24-48 hours to stabilize

KEDA (Event-Driven Autoscaling)

Scale based on event sources (queues, streams, schedules).

# CORRECT: KEDA ScaledObject for RabbitMQ queue
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: rabbitmq-consumer-scaler
  namespace: production
spec:
  scaleTargetRef:
    name: rabbitmq-consumer
  minReplicaCount: 2
  maxReplicaCount: 30
  pollingInterval: 30
  cooldownPeriod: 300
  triggers:
    - type: rabbitmq
      metadata:
        queueName: processing
        mode: QueueLength
        value: '20' # Maintain 20 messages per pod
        host: amqp://rabbitmq:5672

# CORRECT: KEDA ScaledObject for scheduled scaling
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: scheduled-scaler
spec:
  scaleTargetRef:
    name: web-api
  minReplicaCount: 5
  maxReplicaCount: 50
  triggers:
    - type: cron
      metadata:
        timezone: America/New_York
        start: 0 8 * * 1-5 # Scale up Mon-Fri at 8 AM
        end: 0 18 * * 1-5 # Scale down Mon-Fri at 6 PM
        desiredReplicas: '20'

Pod Disruption Budgets

Protect availability during voluntary disruptions (node drains, cluster upgrades).

# CORRECT: PDB ensuring minimum availability
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: web-api-pdb
  namespace: production
spec:
  minAvailable: 3 # Always keep at least 3 pods running
  selector:
    matchLabels:
      app.kubernetes.io/name: web-api

# CORRECT: PDB with percentage
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: web-api-pdb
spec:
  minAvailable: 80% # Keep at least 80% of pods available
  selector:
    matchLabels:
      app.kubernetes.io/name: web-api

# CORRECT: PDB with maxUnavailable
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: web-api-pdb
spec:
  maxUnavailable: 1 # Allow max 1 pod to be unavailable
  selector:
    matchLabels:
      app.kubernetes.io/name: web-api

PDB Selection Guidelines

Scenario	Recommendation	Reasoning
High-availability service	minAvailable: N-1 or 80%	Maintain capacity during disruptions
Single replica	Don't use PDB	Would block all disruptions
2-3 replicas	maxUnavailable: 1	Allow disruptions while maintaining service
Large deployment (10+)	minAvailable: 75-80%	Balance availability and disruption speed

# WRONG: PDB blocks all disruptions
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: bad-pdb
spec:
  minAvailable: 3
  selector:
    matchLabels:
      app: myapp
# Deployment only has 3 replicas - PDB prevents any voluntary disruptions!

Topology Spread Constraints

Distribute pods across zones and nodes for resilience.

# CORRECT: Spread across availability zones
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web-api
spec:
  replicas: 9
  template:
    spec:
      topologySpreadConstraints:
        - maxSkew: 1
          topologyKey: topology.kubernetes.io/zone
          whenUnsatisfiable: DoNotSchedule
          labelSelector:
            matchLabels:
              app.kubernetes.io/name: web-api
        - maxSkew: 2
          topologyKey: kubernetes.io/hostname
          whenUnsatisfiable: ScheduleAnyway
          labelSelector:
            matchLabels:
              app.kubernetes.io/name: web-api

Topology Spread Parameters

maxSkew - Maximum difference in pod count between topology domains
topologyKey - Node label key defining domains (zone, hostname, etc.)
whenUnsatisfiable - DoNotSchedule (hard) or ScheduleAnyway (soft)
labelSelector - Which pods to consider for spreading

Pod Anti-Affinity

Avoid scheduling pods on same node/zone.

# CORRECT: Prefer different nodes, require different zones
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web-api
spec:
  template:
    spec:
      affinity:
        podAntiAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            - labelSelector:
                matchLabels:
                  app.kubernetes.io/name: web-api
              topologyKey: topology.kubernetes.io/zone
          preferredDuringSchedulingIgnoredDuringExecution:
            - weight: 100
              podAffinityTerm:
                labelSelector:
                  matchLabels:
                    app.kubernetes.io/name: web-api
                topologyKey: kubernetes.io/hostname

Zero-Downtime Deployments

Achieve true zero-downtime with proper graceful shutdown and load balancer integration.

Graceful Shutdown Pattern

# CORRECT: Complete zero-downtime configuration
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web-api
spec:
  replicas: 10
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 25%
      maxUnavailable: 0
  template:
    spec:
      containers:
        - name: api
          image: web-api:2.0.0
          ports:
            - containerPort: 8080
              name: http
          lifecycle:
            preStop:
              exec:
                command:
                  - /bin/sh
                  - -c
                  - sleep 15 # Wait for load balancer to deregister
          readinessProbe:
            httpGet:
              path: /health/ready
              port: 8080
            initialDelaySeconds: 10
            periodSeconds: 5
            failureThreshold: 1 # Fail immediately on shutdown
          livenessProbe:
            httpGet:
              path: /health/live
              port: 8080
            initialDelaySeconds: 60
            periodSeconds: 10
      terminationGracePeriodSeconds: 60

Zero-Downtime Deployment Flow

New Pod Created - Scheduler assigns to node
Pod Starting - Container starts, startup probe runs
Pod Ready - Readiness succeeds, added to service endpoints
Load Balancer Updated - Takes 5-15 seconds depending on provider
Old Pod SIGTERM - Kubernetes sends termination signal
PreStop Hook - Sleep to wait for LB deregistration
Readiness Fails - App stops accepting new requests, removed from endpoints
Existing Requests Finish - App completes in-flight requests
SIGKILL - After terminationGracePeriodSeconds if still running

Application Code for Graceful Shutdown

// CORRECT: Go application graceful shutdown
package main

import (
    "context"
    "net/http"
    "os"
    "os/signal"
    "syscall"
    "time"
)

func main() {
    srv := &http.Server{Addr: ":8080"}

    // Start server
    go func() {
        if err := srv.ListenAndServe(); err != http.ErrServerClosed {
            log.Fatal(err)
        }
    }()

    // Wait for SIGTERM
    quit := make(chan os.Signal, 1)
    signal.Notify(quit, syscall.SIGTERM, syscall.SIGINT)
    <-quit

    // Fail readiness immediately
    readinessProbe.Fail()

    // Graceful shutdown with 30s timeout
    ctx, cancel := context.WithTimeout(context.Background(), 30*time.Second)
    defer cancel()

    if err := srv.Shutdown(ctx); err != nil {
        log.Fatal(err)
    }
}

Timing Configuration

Setting	Value	Purpose
preStop sleep	15s	Wait for LB deregistration
terminationGracePeriodSeconds	60s	Time to finish requests
readiness failureThreshold	1	Remove from service quickly
Max request duration	< 45s	Must complete before SIGKILL

# WRONG: Missing graceful shutdown
spec:
  terminationGracePeriodSeconds: 30
  containers:
    - name: api
      # No preStop hook - app receives traffic during shutdown
      # No readiness probe - stays in endpoints during shutdown
      # Requests are dropped mid-flight

Progressive Delivery

Advanced deployment patterns for reduced risk.

Blue-Green Deployment

# Blue deployment (current production)
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web-api-blue
  labels:
    version: blue
spec:
  replicas: 10
  selector:
    matchLabels:
      app: web-api
      version: blue
  template:
    metadata:
      labels:
        app: web-api
        version: blue
    spec:
      containers:
        - name: api
          image: web-api:1.0.0

---
# Green deployment (new version)
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web-api-green
  labels:
    version: green
spec:
  replicas: 10
  selector:
    matchLabels:
      app: web-api
      version: green
  template:
    metadata:
      labels:
        app: web-api
        version: green
    spec:
      containers:
        - name: api
          image: web-api:2.0.0

---
# Service (switch by changing selector)
apiVersion: v1
kind: Service
metadata:
  name: web-api
spec:
  selector:
    app: web-api
    version: blue # Change to 'green' to switch versions
  ports:
    - port: 80
      targetPort: 8080

Canary Deployment

Use service mesh or ingress weights for gradual rollout.

# 90% traffic to stable, 10% to canary
apiVersion: v1
kind: Service
metadata:
  name: web-api-stable
spec:
  selector:
    app: web-api
    version: stable
  ports:
    - port: 80
      targetPort: 8080

---
apiVersion: v1
kind: Service
metadata:
  name: web-api-canary
spec:
  selector:
    app: web-api
    version: canary
  ports:
    - port: 80
      targetPort: 8080

---
# Nginx Ingress with traffic splitting
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: web-api
  annotations:
    nginx.ingress.kubernetes.io/canary: 'true'
    nginx.ingress.kubernetes.io/canary-weight: '10'
spec:
  rules:
    - host: api.example.com
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: web-api-canary
                port:
                  number: 80

Deployment Checklist

Before deploying to production, verify:

Apply these patterns consistently to achieve reliable, zero-downtime deployments.