Kubernetes cost management, resource optimization, and FinOps practices
Optimizes Kubernetes costs through resource right-sizing, intelligent autoscaling, and waste elimination.
npx claudepluginhub pluginagentmarketplace/custom-plugin-kubernetes
This skill inherits all available tools. When active, it can use any tool Claude has access to.
Production-grade Kubernetes cost management covering resource optimization, autoscaling, and FinOps practices. This skill targets a 30-50% cost reduction while maintaining performance and reliability.
Vertical Pod Autoscaler
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: api-server-vpa
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: api-server
  updatePolicy:
    updateMode: "Auto"  # or "Off" for recommendations only
  resourcePolicy:
    containerPolicies:
      - containerName: api-server
        minAllowed:
          cpu: 100m
          memory: 128Mi
        maxAllowed:
          cpu: 4
          memory: 8Gi
        controlledResources: ["cpu", "memory"]
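The HPA shown later in this document scales the same `api-server` Deployment on CPU utilization, and the VPA project advises against letting VPA and HPA act on the same resource metric. One common arrangement, sketched here as an assumption rather than the author's configuration, limits the VPA to memory and leaves CPU to the HPA. The object name `api-server-vpa-memory` is hypothetical.

```yaml
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: api-server-vpa-memory  # hypothetical name
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: api-server
  updatePolicy:
    updateMode: "Auto"
  resourcePolicy:
    containerPolicies:
      - containerName: api-server
        controlledResources: ["memory"]  # CPU is left to the HPA
        minAllowed:
          memory: 128Mi
        maxAllowed:
          memory: 8Gi
```

Only one VPA object should target a given workload, so this variant would replace `api-server-vpa` rather than run alongside it.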
Resource Recommendations Analysis
# Get VPA recommendations
kubectl describe vpa api-server-vpa
# Check current vs recommended
kubectl get vpa api-server-vpa -o jsonpath='{.status.recommendation}'
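# Goldilocks for all deployments (installed from the Fairwinds Helm chart)
helm repo add fairwinds-stable https://charts.fairwinds.com/stable
helm install goldilocks fairwinds-stable/goldilocks --namespace goldilocks --create-namespace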
kubectl label namespace production goldilocks.fairwinds.com/enabled=true
Kubecost Installation
helm repo add kubecost https://kubecost.github.io/cost-analyzer/
helm install kubecost kubecost/cost-analyzer \
  --namespace kubecost --create-namespace \
  --set kubecostToken="YOUR_TOKEN" \
  --set prometheus.nodeExporter.enabled=false \
  --set prometheus.serviceAccounts.nodeExporter.create=false
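The same `--set` flags can be kept in version control as a Helm values file. The sketch below is a direct translation of the three flags above and adds nothing else; the file name `values.yaml` is an assumption.

```yaml
# values.yaml (hypothetical file name), equivalent to the --set flags above
kubecostToken: "YOUR_TOKEN"
prometheus:
  nodeExporter:
    enabled: false
  serviceAccounts:
    nodeExporter:
      create: false
```

It would be passed with `-f values.yaml` in place of the individual `--set` arguments.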
Cost Allocation Labels
apiVersion: apps/v1
kind: Deployment
metadata:
  name: api-server
  labels:
    # Cost allocation labels
    team: backend
    environment: production
    product: ecommerce
    cost-center: engineering
spec:
  template:
    metadata:
      labels:
        team: backend
        cost-center: engineering
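Cost attribution only works if the labels are actually present. Since this document already uses Kyverno, one way to surface unlabeled workloads is a validation policy in audit mode. This is a minimal sketch: the policy and rule names are hypothetical, and it checks only the `team` and `cost-center` labels from the example above.

```yaml
apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
  name: require-cost-labels  # hypothetical name
spec:
  # Audit reports violations without blocking; newer Kyverno releases
  # prefer a per-rule failureAction field instead of this spec-level one.
  validationFailureAction: Audit
  background: true
  rules:
    - name: check-cost-labels  # hypothetical name
      match:
        any:
          - resources:
              kinds:
                - Deployment
      validate:
        message: "Deployments must carry team and cost-center labels."
        pattern:
          metadata:
            labels:
              team: "?*"         # any non-empty value
              cost-center: "?*"
```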
HPA with Cost Awareness
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: api-server
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: api-server
  minReplicas: 2
  maxReplicas: 20
  behavior:
    scaleDown:
      stabilizationWindowSeconds: 300
      policies:
        - type: Percent
          value: 10
          periodSeconds: 60
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70
KEDA for Event-Driven Scaling
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: api-server
spec:
  scaleTargetRef:
    name: api-server
  minReplicaCount: 0  # Scale to zero!
  maxReplicaCount: 50
  triggers:
    - type: prometheus
      metadata:
        serverAddress: http://prometheus:9090
        metricName: http_requests_total
        query: sum(rate(http_requests_total{app="api-server"}[1m]))
        threshold: "100"
    - type: cron
      metadata:
        timezone: America/New_York
        start: 0 8 * * 1-5
        end: 0 20 * * 1-5
        desiredReplicas: "5"
Mixed Node Pool Strategy
# Spot-tolerant workloads
# The capacity-type label key varies by provider (e.g. karpenter.sh/capacity-type, eks.amazonaws.com/capacityType)
apiVersion: apps/v1
kind: Deployment
metadata:
  name: batch-processor
spec:
  template:
    spec:
      nodeSelector:
        kubernetes.io/capacity-type: spot
      tolerations:
        - key: kubernetes.io/capacity-type
          value: spot
          effect: NoSchedule
      affinity:
        nodeAffinity:
          preferredDuringSchedulingIgnoredDuringExecution:
            - weight: 100
              preference:
                matchExpressions:
                  - key: kubernetes.io/capacity-type
                    operator: In
                    values:
                      - spot
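Workloads on spot nodes are usually paired with a PodDisruptionBudget so that node drains triggered by an interruption handler do not evict every replica at once. A minimal sketch follows; the PDB name and the `app: batch-processor` pod label are assumptions, since the Deployment above does not show its pod labels.

```yaml
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: batch-processor-pdb  # hypothetical name
spec:
  maxUnavailable: 1          # evict at most one pod at a time during drains
  selector:
    matchLabels:
      app: batch-processor   # assumed pod label, not shown in the Deployment above
```

A PDB only governs voluntary evictions such as drains. It cannot stop the cloud provider from reclaiming a spot node, so it limits the blast radius rather than preventing interruptions.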
Cluster Autoscaler with Mixed Pools
apiVersion: apps/v1
kind: Deployment
metadata:
  name: cluster-autoscaler
spec:
  template:
    spec:
      containers:
        - name: cluster-autoscaler
          command:
            - ./cluster-autoscaler
            - --expander=priority
            - --node-group-auto-discovery=asg:tag=k8s.io/cluster-autoscaler/enabled
            - --balance-similar-node-groups=true
            - --skip-nodes-with-local-storage=false
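The `--expander=priority` flag reads its priorities from a ConfigMap named `cluster-autoscaler-priority-expander` in the namespace where Cluster Autoscaler runs. The sketch below prefers spot node groups and falls back to on-demand. The `kube-system` namespace and both regular expressions are assumptions and would need to match the real node group names.

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: cluster-autoscaler-priority-expander
  namespace: kube-system  # assumed: the namespace Cluster Autoscaler runs in
data:
  priorities: |-
    # Higher number wins; patterns are matched against node group names.
    50:
      - .*spot.*          # hypothetical naming pattern
    10:
      - .*on-demand.*     # hypothetical naming pattern
```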
Idle Resource Detection
# List CPU/memory requests per deployment to spot oversized ones
kubectl get deployments -A -o json | jq '
  .items[] |
  select(.spec.replicas > 0) |
  {
    namespace: .metadata.namespace,
    name: .metadata.name,
    replicas: .spec.replicas,
    cpu_request: .spec.template.spec.containers[0].resources.requests.cpu,
    memory_request: .spec.template.spec.containers[0].resources.requests.memory
  }
'
# Find PVCs that no pod in the namespace mounts
kubectl get pvc -A --no-headers | while read -r ns name _; do
  used=$(kubectl get pods -n "$ns" -o json | jq --arg pvc "$name" '.items[] | select(.spec.volumes[]?.persistentVolumeClaim.claimName == $pvc)')
  [ -z "$used" ] && echo "Unused PVC: $ns/$name"
done
Resource Cleanup Policy
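# A mutate rule cannot delete resources (deletionTimestamp is set by the API server),
# so this uses a Kyverno cleanup policy instead. Assumes Kyverno 1.12+, where
# ClusterCleanupPolicy is served as kyverno.io/v2 (older releases use v2beta1/v2alpha1),
# and that Kyverno's cleanup controller has RBAC permission to delete Jobs.
apiVersion: kyverno.io/v2
kind: ClusterCleanupPolicy
metadata:
  name: cleanup-stale-pods
spec:
  schedule: "0 * * * *"  # evaluate hourly
  match:
    any:
      - resources:
          kinds:
            - Job
  conditions:
    all:
      - key: "{{ target.status.succeeded || `0` }}"
        operator: GreaterThanOrEquals
        value: 1
      - key: "{{ time_since('', '{{ target.status.completionTime }}', '') }}"
        operator: GreaterThan
        value: 24h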
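Independently of any policy engine, Kubernetes can delete finished Jobs itself through the `ttlSecondsAfterFinished` field on the Job spec. The sketch below applies the same 24-hour retention at the workload level; the Job name, image, and command are placeholders.

```yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: nightly-report  # hypothetical name
spec:
  ttlSecondsAfterFinished: 86400  # delete the Job and its pods 24h after it finishes
  template:
    spec:
      restartPolicy: Never
      containers:
        - name: report
          image: busybox:1.36               # placeholder image
          command: ["sh", "-c", "echo done"]  # placeholder workload
```

This covers Jobs whose manifests you control. A cluster-wide policy is still useful for Jobs created by third-party charts that omit the field.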
High Costs?
│
├── Over-provisioned
│ ├── Check VPA recommendations
│ ├── Right-size requests
│ └── Enable HPA
│
├── Idle resources
│ ├── Find unused PVCs
│ ├── Check scale-to-zero
│ └── Clean up stale jobs
│
└── Wrong instance types
    ├── Use spot for batch
    ├── Review node pools
    └── Check reserved coverage
# Cost analysis
kubectl top pods -A --sort-by=cpu
kubectl top pods -A --sort-by=memory
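# Resource efficiency: list CPU/memory requests per container
kubectl get pods -A -o json | jq -r '.items[] | .metadata.namespace as $ns | .metadata.name as $pod | .spec.containers[] | [$ns, $pod, .name, (.resources.requests.cpu // "-"), (.resources.requests.memory // "-")] | @tsv'
# Kubecost API (quote the URL so the shell does not treat & as a background operator)
curl "http://kubecost:9090/model/allocation?window=7d&aggregate=namespace"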
| Challenge | Solution |
|---|---|
| Overprovisioning | VPA, right-sizing |
| Idle resources | Scale-to-zero, cleanup |
| Spot interruptions | PodDisruptionBudgets (PDB), spreading pods across nodes |
| Cost attribution | Labels, Kubecost |

| Metric | Target |
|---|---|
| Cost reduction | 30-50% |
| Resource utilization | >60% |
| Waste identification | <10% idle |
| Budget compliance | 100% |