Kubernetes Cost Optimization

Executive Summary

Production-grade Kubernetes cost management covering resource optimization, autoscaling, and FinOps practices. This skill provides deep expertise in achieving 30-50% cost reduction while maintaining performance and reliability.

Core Competencies

1. Resource Right-Sizing

Vertical Pod Autoscaler

apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: api-server-vpa
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: api-server
  updatePolicy:
    updateMode: "Auto"  # or "Off" for recommendations only
  resourcePolicy:
    containerPolicies:
    - containerName: api-server
      minAllowed:
        cpu: 100m
        memory: 128Mi
      maxAllowed:
        cpu: 4
        memory: 8Gi
      controlledResources: ["cpu", "memory"]

Resource Recommendations Analysis

# Get VPA recommendations
kubectl describe vpa api-server-vpa

# Check current vs recommended
kubectl get vpa api-server-vpa -o jsonpath='{.status.recommendation}'

# Goldilocks for all deployments
kubectl apply -f https://github.com/FairwindsOps/goldilocks/releases/latest/download/goldilocks.yaml
kubectl label namespace production goldilocks.fairwinds.com/enabled=true

2. Cost Visibility

Kubecost Installation

helm repo add kubecost https://kubecost.github.io/cost-analyzer/
helm install kubecost kubecost/cost-analyzer \
  --namespace kubecost --create-namespace \
  --set kubecostToken="YOUR_TOKEN" \
  --set prometheus.nodeExporter.enabled=false \
  --set prometheus.serviceAccounts.nodeExporter.create=false

Cost Allocation Labels

apiVersion: apps/v1
kind: Deployment
metadata:
  name: api-server
  labels:
    # Cost allocation labels
    team: backend
    environment: production
    product: ecommerce
    cost-center: engineering
spec:
  template:
    metadata:
      labels:
        team: backend
        cost-center: engineering

3. Intelligent Autoscaling

HPA with Cost Awareness

apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: api-server
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: api-server
  minReplicas: 2
  maxReplicas: 20
  behavior:
    scaleDown:
      stabilizationWindowSeconds: 300
      policies:
      - type: Percent
        value: 10
        periodSeconds: 60
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 70

KEDA for Event-Driven Scaling

apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: api-server
spec:
  scaleTargetRef:
    name: api-server
  minReplicaCount: 0  # Scale to zero!
  maxReplicaCount: 50
  triggers:
  - type: prometheus
    metadata:
      serverAddress: http://prometheus:9090
      metricName: http_requests_total
      query: sum(rate(http_requests_total{app="api-server"}[1m]))
      threshold: "100"
  - type: cron
    metadata:
      timezone: America/New_York
      start: 0 8 * * 1-5
      end: 0 20 * * 1-5
      desiredReplicas: "5"

4. Spot/Preemptible Nodes

Mixed Node Pool Strategy

# Spot-tolerant workloads
apiVersion: apps/v1
kind: Deployment
metadata:
  name: batch-processor
spec:
  template:
    spec:
      nodeSelector:
        kubernetes.io/capacity-type: spot
      tolerations:
      - key: kubernetes.io/capacity-type
        value: spot
        effect: NoSchedule
      affinity:
        nodeAffinity:
          preferredDuringSchedulingIgnoredDuringExecution:
          - weight: 100
            preference:
              matchExpressions:
              - key: kubernetes.io/capacity-type
                operator: In
                values:
                - spot

Cluster Autoscaler with Mixed Pools

apiVersion: apps/v1
kind: Deployment
metadata:
  name: cluster-autoscaler
spec:
  template:
    spec:
      containers:
      - name: cluster-autoscaler
        command:
        - ./cluster-autoscaler
        - --expander=priority
        - --node-group-auto-discovery=asg:tag=k8s.io/cluster-autoscaler/enabled
        - --balance-similar-node-groups=true
        - --skip-nodes-with-local-storage=false

5. Waste Elimination

Idle Resource Detection

# Find oversized deployments
kubectl get deployments -A -o json | jq '
  .items[] |
  select(.spec.replicas > 0) |
  {
    namespace: .metadata.namespace,
    name: .metadata.name,
    replicas: .spec.replicas,
    cpu_request: .spec.template.spec.containers[0].resources.requests.cpu,
    memory_request: .spec.template.spec.containers[0].resources.requests.memory
  }
'

# Find unused PVCs
kubectl get pvc -A --no-headers | while read ns name _; do
  used=$(kubectl get pods -n $ns -o json | jq --arg pvc "$name" '.items[] | select(.spec.volumes[]?.persistentVolumeClaim.claimName == $pvc)')
  [ -z "$used" ] && echo "Unused PVC: $ns/$name"
done

Resource Cleanup Policy

apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
  name: cleanup-stale-pods
spec:
  rules:
  - name: delete-completed-jobs
    match:
      resources:
        kinds:
        - Job
    preconditions:
      all:
      - key: "{{ request.object.status.succeeded }}"
        operator: Equals
        value: 1
      - key: "{{ time_since('', '{{ request.object.status.completionTime }}', '') }}"
        operator: GreaterThan
        value: "24h"
    mutate:
      patchStrategicMerge:
        metadata:
          deletionTimestamp: "{{ time_now() }}"

Integration Patterns

Uses skill: cluster-admin

Node pool management
Cluster autoscaling

Coordinates with skill: monitoring

Resource metrics
Cost dashboards

Works with skill: deployments

HPA configuration
Resource requests

Troubleshooting Guide

Decision Tree: Cost Issues

High Costs?
│
├── Over-provisioned
│   ├── Check VPA recommendations
│   ├── Right-size requests
│   └── Enable HPA
│
├── Idle resources
│   ├── Find unused PVCs
│   ├── Check scale-to-zero
│   └── Clean up stale jobs
│
└── Wrong instance types
    ├── Use spot for batch
    ├── Review node pools
    └── Check reserved coverage

Debug Commands

# Cost analysis
kubectl top pods -A --sort-by=cpu
kubectl top pods -A --sort-by=memory

# Resource efficiency
kubectl get pods -A -o json | jq '[.items[].spec.containers[].resources] | add'

# Kubecost API
curl http://kubecost:9090/model/allocation?window=7d&aggregate=namespace

Common Challenges & Solutions

Challenge	Solution
Overprovisioning	VPA, right-sizing
Idle resources	Scale-to-zero, cleanup
Spot interruptions	PDB, spreading
Cost attribution	Labels, Kubecost

Success Criteria

Metric	Target
Cost reduction	30-50%
Resource utilization	>60%
Waste identification	<10% idle
Budget compliance	100%

cost-optimization