From kubernetes-assistant
Kubernetes cost management, resource optimization, and FinOps practices
npx claudepluginhub pluginagentmarketplace/custom-plugin-kubernetes --plugin kubernetes-assistant
This skill uses the workspace's default tool permissions.
Production-grade Kubernetes cost management covering resource optimization, autoscaling, and FinOps practices. The skill targets a 30-50% cost reduction while maintaining performance and reliability.
Vertical Pod Autoscaler
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: api-server-vpa
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: api-server
  updatePolicy:
    # "Auto" applies recommendations; use "Off" for recommendations only.
    # Avoid "Auto" on a workload whose HPA scales on the same CPU or memory metric.
    updateMode: "Auto"
  resourcePolicy:
    containerPolicies:
    - containerName: api-server
      minAllowed:
        cpu: 100m
        memory: 128Mi
      maxAllowed:
        cpu: 4
        memory: 8Gi
      controlledResources: ["cpu", "memory"]
Resource Recommendations Analysis
# Get VPA recommendations
kubectl describe vpa api-server-vpa
# Check current vs recommended
kubectl get vpa api-server-vpa -o jsonpath='{.status.recommendation}'
# Goldilocks for all deployments (installed from the Fairwinds Helm chart)
helm repo add fairwinds-stable https://charts.fairwinds.com/stable
helm install goldilocks fairwinds-stable/goldilocks --namespace goldilocks --create-namespace
kubectl label namespace production goldilocks.fairwinds.com/enabled=true
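Comparing the current request with the recommended value is simple arithmetic on Kubernetes CPU quantities. The sketch below is only an illustration: `to_millicores` and `overprovision_pct` are hypothetical helper names, not part of this skill, and it assumes CPU quantities are written either as millicores (`250m`) or as plain core counts (`2`, `0.5`).

```shell
#!/bin/sh
# Hypothetical helpers for illustration; not part of kubectl, VPA, or this skill.

# Convert a CPU quantity such as "250m", "2" or "0.5" to whole millicores.
to_millicores() {
  case "$1" in
    *m) printf '%s\n' "${1%m}" ;;
    *)  awk -v cores="$1" 'BEGIN { printf "%d\n", cores * 1000 + 0.5 }' ;;
  esac
}

# Share of the current request that exceeds the recommended target, in percent.
# Positive means over-provisioned, negative means under-provisioned.
overprovision_pct() (
  request=$(to_millicores "$1")
  target=$(to_millicores "$2")
  awk -v r="$request" -v t="$target" 'BEGIN {
    if (r <= 0) { print "n/a"; exit }
    printf "%d\n", (r - t) * 100 / r
  }'
)

# Request of 1 core against a recommended target of 250m: prints 75
overprovision_pct 1 250m
```

In practice the first argument would come from the container's `resources.requests.cpu` and the second from the `target.cpu` field under `.status.recommendation.containerRecommendations`.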
Kubecost Installation
helm repo add kubecost https://kubecost.github.io/cost-analyzer/
helm install kubecost kubecost/cost-analyzer \
  --namespace kubecost --create-namespace \
  --set kubecostToken="YOUR_TOKEN" \
  --set prometheus.nodeExporter.enabled=false \
  --set prometheus.serviceAccounts.nodeExporter.create=false
Cost Allocation Labels
apiVersion: apps/v1
kind: Deployment
metadata:
  name: api-server
  labels:
    # Cost allocation labels
    team: backend
    environment: production
    product: ecommerce
    cost-center: engineering
spec:
  template:
    metadata:
      labels:
        team: backend
        cost-center: engineering
HPA with Cost Awareness
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: api-server
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: api-server
  minReplicas: 2
  maxReplicas: 20
  behavior:
    scaleDown:
      stabilizationWindowSeconds: 300
      policies:
      - type: Percent
        value: 10
        periodSeconds: 60
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 70
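The replica count an HPA aims for follows the ratio formula in the Kubernetes documentation: desired = ceil(currentReplicas × currentMetric / targetMetric), clamped to the min/max bounds. The sketch below works through that arithmetic for the 70% CPU target above. `hpa_desired_replicas` is a hypothetical helper name, and the sketch deliberately ignores the controller's tolerance band, the stabilization window, and the scale-down policies.

```shell
#!/bin/sh
# Hypothetical helper illustrating the HPA ratio formula; not a kubectl feature.
# usage: hpa_desired_replicas CURRENT_REPLICAS CURRENT_UTIL TARGET_UTIL MIN MAX
hpa_desired_replicas() {
  awk -v cur="$1" -v util="$2" -v target="$3" -v min="$4" -v max="$5" 'BEGIN {
    raw = cur * util / target
    desired = int(raw)
    if (desired < raw) desired++       # round up (ceil)
    if (desired < min) desired = min   # clamp to minReplicas
    if (desired > max) desired = max   # clamp to maxReplicas
    print desired
  }'
}

# 4 pods averaging 90% CPU against the 70% target: prints 6
hpa_desired_replicas 4 90 70 2 20
```

A lower target utilization leaves more headroom but keeps more replicas running, so the target is itself a cost lever.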
KEDA for Event-Driven Scaling
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: api-server
spec:
  scaleTargetRef:
    name: api-server
  minReplicaCount: 0  # Scale to zero!
  maxReplicaCount: 50
  triggers:
  - type: prometheus
    metadata:
      serverAddress: http://prometheus:9090
      metricName: http_requests_total
      query: sum(rate(http_requests_total{app="api-server"}[1m]))
      threshold: "100"
  - type: cron
    metadata:
      timezone: America/New_York
      start: 0 8 * * 1-5
      end: 0 20 * * 1-5
      desiredReplicas: "5"
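To get a feel for how the two triggers interact, the sketch below estimates the replica count for a given request rate. It rests on assumptions rather than on anything stated in this skill: that the Prometheus query result is divided by `threshold` and rounded up, that the largest proposal across triggers wins, and that the cron trigger contributes nothing outside its window. `keda_replicas` is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical estimate of the replica count for the ScaledObject above.
# Assumption: replicas ~ ceil(query value / threshold), and the highest
# proposal across triggers wins. This is a sketch, not KEDA's actual code.
# usage: keda_replicas QUERY_VALUE THRESHOLD CRON_REPLICAS MAX
#        (pass CRON_REPLICAS=0 outside the cron window)
keda_replicas() {
  awk -v value="$1" -v threshold="$2" -v cron="$3" -v max="$4" 'BEGIN {
    raw = value / threshold
    desired = int(raw)
    if (desired < raw) desired++         # round up (ceil)
    if (cron > desired) desired = cron   # cron trigger sets a floor in its window
    if (desired > max) desired = max     # clamp to maxReplicaCount
    print desired
  }'
}

keda_replicas 1250 100 5 50   # busy weekday, 1250 req/s: prints 13
keda_replicas 120 100 5 50    # quiet weekday, cron floor applies: prints 5
keda_replicas 0 100 0 50      # no traffic outside the window: prints 0
```

The last case is where the savings come from: with no active trigger, the workload can drop to `minReplicaCount: 0`.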
Mixed Node Pool Strategy
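# Spot-tolerant workloads
# kubernetes.io/capacity-type is a placeholder for the label and taint your
# provisioner sets on spot nodes (for example karpenter.sh/capacity-type on
# Karpenter or eks.amazonaws.com/capacityType on EKS managed node groups).
apiVersion: apps/v1
kind: Deployment
metadata:
  name: batch-processor
spec:
  template:
    spec:
      nodeSelector:
        kubernetes.io/capacity-type: spot
      tolerations:
      - key: kubernetes.io/capacity-type
        operator: Equal
        value: spot
        effect: NoSchedule
      # The nodeSelector above is already a hard requirement; the preferred
      # affinity only takes effect if the nodeSelector is removed.
      affinity:
        nodeAffinity:
          preferredDuringSchedulingIgnoredDuringExecution:
          - weight: 100
            preference:
              matchExpressions:
              - key: kubernetes.io/capacity-type
                operator: In
                values:
                - spot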
Cluster Autoscaler with Mixed Pools
apiVersion: apps/v1
kind: Deployment
metadata:
  name: cluster-autoscaler
spec:
  template:
    spec:
      containers:
      - name: cluster-autoscaler
        command:
        - ./cluster-autoscaler
        - --expander=priority
        - --node-group-auto-discovery=asg:tag=k8s.io/cluster-autoscaler/enabled
        - --balance-similar-node-groups=true
        - --skip-nodes-with-local-storage=false
Idle Resource Detection
# Find oversized deployments (lists requests of the first container only)
kubectl get deployments -A -o json | jq '
  .items[] |
  select(.spec.replicas > 0) |
  {
    namespace: .metadata.namespace,
    name: .metadata.name,
    replicas: .spec.replicas,
    cpu_request: .spec.template.spec.containers[0].resources.requests.cpu,
    memory_request: .spec.template.spec.containers[0].resources.requests.memory
  }
'
# Find unused PVCs (not mounted by any pod in the same namespace)
kubectl get pvc -A --no-headers | while read -r ns name _; do
  used=$(kubectl get pods -n "$ns" -o json | jq --arg pvc "$name" '.items[] | select(.spec.volumes[]?.persistentVolumeClaim.claimName == $pvc)')
  [ -z "$used" ] && echo "Unused PVC: $ns/$name"
done
Resource Cleanup Policy
# A mutate rule cannot delete objects (deletionTimestamp is set by the API
# server), so this uses a Kyverno cleanup policy to delete completed Jobs.
# The apiVersion depends on the Kyverno release (kyverno.io/v2beta1 on older versions).
apiVersion: kyverno.io/v2
kind: ClusterCleanupPolicy
metadata:
  name: cleanup-stale-pods
spec:
  match:
    any:
    - resources:
        kinds:
        - Job
  conditions:
    all:
    - key: "{{ target.status.succeeded }}"
      operator: Equals
      value: 1
    - key: "{{ time_since('', '{{ target.status.completionTime }}', '') }}"
      operator: GreaterThan
      value: "24h"
  schedule: "0 * * * *"  # assumed hourly run; adjust as needed
High Costs?
│
├── Over-provisioned
│   ├── Check VPA recommendations
│   ├── Right-size requests
│   └── Enable HPA
│
├── Idle resources
│   ├── Find unused PVCs
│   ├── Check scale-to-zero
│   └── Clean up stale jobs
│
└── Wrong instance types
    ├── Use spot for batch
    ├── Review node pools
    └── Check reserved coverage
# Cost analysis
kubectl top pods -A --sort-by=cpu
kubectl top pods -A --sort-by=memory
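# Resource efficiency: CPU and memory requests per container
kubectl get pods -A -o json | jq -r '.items[] | .metadata.namespace as $ns | .metadata.name as $pod | .spec.containers[] | [$ns, $pod, .name, (.resources.requests.cpu // "-"), (.resources.requests.memory // "-")] | @tsv'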
# Kubecost API (URL quoted so the shell does not treat & as a background operator)
curl 'http://kubecost:9090/model/allocation?window=7d&aggregate=namespace'
| Challenge | Solution |
|---|---|
| Overprovisioning | VPA, right-sizing |
| Idle resources | Scale-to-zero, cleanup |
| Spot interruptions | PDB, spreading |
| Cost attribution | Labels, Kubecost |
| Metric | Target |
|---|---|
| Cost reduction | 30-50% |
| Resource utilization | >60% |
| Waste identification | <10% idle |
| Budget compliance | 100% |
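
The utilization target in the table can be checked with simple arithmetic: utilization is actual usage divided by requested capacity. The sketch below assumes usage and requests have already been collected as millicores, one workload per line. `utilization_report` is a hypothetical helper name and the sample figures are made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: compare usage with requests against a utilization target.
# Input on stdin, one workload per line: NAME USAGE_MILLICORES REQUEST_MILLICORES
# usage: utilization_report [TARGET_PERCENT]   (defaults to 60)
utilization_report() {
  awk -v target="${1:-60}" '
    $3 > 0 {
      pct = $2 * 100 / $3
      used += $2
      requested += $3
      printf "%s %d%% %s\n", $1, pct, (pct > target ? "ok" : "below-target")
    }
    END { if (requested > 0) printf "cluster %d%%\n", used * 100 / requested }
  '
}

# Made-up sample data, not measurements from a real cluster.
printf '%s\n' 'api-server 350 500' 'batch-processor 100 1000' | utilization_report 60
```

For the sample input this prints `api-server 70% ok`, `batch-processor 10% below-target`, and `cluster 30%`, which shows how one oversized workload can pull the whole cluster under the target.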