Skill

castai-core-workflow-b

Configures CAST AI Workload Autoscaler for Kubernetes pod right-sizing, VPA, resource recommendations, and CPU/memory tuning via annotations and API.

Kubernetes

Bash

devops

infrastructure

Popularity

Parent stars

2,526

Parent forks

365

Invocation

How this skill is triggered — by the user, by Claude, or both

Slash command

/castai-pack:castai-core-workflow-b

User invocable

Model invocable

Inline context

Default effort

Tool Access

This skill is limited to the following tools:

Read, Write, Edit, Bash(curl:*), Bash(kubectl:*), Grep

Context Preview

The summary Claude sees in its skill listing — used to decide when to auto-load this skill

CAST AI Workload Autoscaler right-sizes pod resource requests based on actual usage, reducing over-provisioning without manual VPA tuning. This skill covers enabling the workload autoscaler, configuring scaling policies per workload, and using annotations for fine-grained control.

SKILL.md

160 lines · ~1.2k tokens

Stats

Language: Python

Maintenance: Good

Last Commit: Jul 17, 2026

CAST AI Core Workflow: Workload Autoscaler

Overview

Prerequisites

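Completed castai-core-workflow-a (cluster-level policies)
CAST AI agent v1.60+ installed
Workload Autoscaler enabled in CAST AI console
helm, kubectl, curl, and jq available on PATH
CASTAI_API_KEY and CASTAI_CLUSTER_ID exported in the shell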
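
Before running the steps below, it can help to confirm that the shell environment is ready. The following is a minimal sketch: the helper names `missing_env` and `missing_tools` are hypothetical, and the tool and variable lists are inferred from the commands used in this workflow rather than taken from CAST AI documentation.

```shell
# Print each named environment variable that is unset or empty, one per line.
missing_env() {
  for name in "$@"; do
    eval "value=\${$name:-}"
    [ -n "$value" ] || printf '%s\n' "$name"
  done
}

# Print each named command that cannot be found on PATH, one per line.
missing_tools() {
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
  done
}
```

Running `missing_env CASTAI_API_KEY CASTAI_CLUSTER_ID` and `missing_tools helm kubectl curl jq` should print nothing when everything is in place; any name that is printed needs attention first.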

Instructions

Step 1: Install Workload Autoscaler Components

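# Add the CAST AI chart repository (skip if already added in castai-core-workflow-a)
helm repo add castai-helm https://castai.github.io/helm-charts
helm repo update

# Install or upgrade the workload autoscaler in the agent namespace
helm upgrade --install castai-workload-autoscaler \
  castai-helm/castai-workload-autoscaler \
  -n castai-agent \
  --set castai.apiKey="${CASTAI_API_KEY}" \
  --set castai.clusterID="${CASTAI_CLUSTER_ID}"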

Step 2: Query Workload Recommendations

# List current vs recommended resource requests for every workload in the cluster
curl -s -H "X-API-Key: ${CASTAI_API_KEY}" \
  "https://api.cast.ai/v1/workload-autoscaling/clusters/${CASTAI_CLUSTER_ID}/workloads" \
  | jq '.items[] | {
    name: .workloadName,
    namespace: .namespace,
    currentCpu: .currentCpuRequest,
    recommendedCpu: .recommendedCpuRequest,
    currentMemory: .currentMemoryRequest,
    recommendedMemory: .recommendedMemoryRequest,
    savingsPercent: .estimatedSavingsPercent
  }'
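
With many workloads, the raw listing is easier to act on when it is filtered and ranked. The sketch below assumes the response fields shown above (`namespace`, `workloadName`, `estimatedSavingsPercent`) and that the rows have first been flattened to tab-separated text with jq; the helper name `top_savings` and the default threshold of 20 are hypothetical.

```shell
# Read "namespace<TAB>workload<TAB>savingsPercent" rows on stdin and print the
# rows whose savings are at least $1 percent (default 20), largest first.
top_savings() {
  tab=$(printf '\t')
  awk -F '\t' -v min="${1:-20}" '$3 + 0 >= min + 0' | sort -t "$tab" -k3,3nr
}

# Example pipeline (requires the API call from this step):
#   curl -s -H "X-API-Key: ${CASTAI_API_KEY}" "<workloads URL from above>" \
#     | jq -r '.items[] | [.namespace, .workloadName, .estimatedSavingsPercent] | @tsv' \
#     | top_savings 20
```

Keeping the ranking separate from the API call means it can be tried on saved output without hitting the API again.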

Step 3: Configure Per-Workload Policies via Annotations

# Deployment fragment showing the CAST AI workload autoscaler annotations
# (selector, labels, and image are omitted for brevity)
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-api
  annotations:
    # Enable workload autoscaling
    autoscaling.cast.ai/enabled: "true"
    # CPU configuration
    autoscaling.cast.ai/cpu-min: "100m"
    autoscaling.cast.ai/cpu-max: "4000m"
    autoscaling.cast.ai/cpu-headroom: "15"
    # Memory configuration
    autoscaling.cast.ai/memory-min: "128Mi"
    autoscaling.cast.ai/memory-max: "8Gi"
    autoscaling.cast.ai/memory-headroom: "20"
    # Apply changes automatically vs recommendation-only
    autoscaling.cast.ai/apply-type: "immediate"
spec:
  template:
    spec:
      containers:
        - name: api
          resources:
            requests:
              cpu: "500m"      # Will be auto-adjusted by CAST AI
              memory: "512Mi"  # Will be auto-adjusted by CAST AI
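
To make the min, max, and headroom annotations concrete, the sketch below shows one plausible way they could combine. It assumes headroom is a percentage added on top of observed usage and that the result is then clamped to the min/max bounds. The function name `apply_headroom` is hypothetical, and CAST AI's actual algorithm may differ.

```shell
# Illustrative arithmetic only; not CAST AI's actual implementation.
# Usage: apply_headroom USAGE HEADROOM_PCT MIN MAX
# All quantities are integers in one unit (millicores for CPU, Mi for memory).
apply_headroom() {
  usage=$1; pct=$2; min=$3; max=$4
  # Add the headroom percentage, rounding up to the next whole unit.
  req=$(( (usage * (100 + pct) + 99) / 100 ))
  # Clamp the result to the configured bounds.
  if [ "$req" -lt "$min" ]; then req=$min; fi
  if [ "$req" -gt "$max" ]; then req=$max; fi
  echo "$req"
}
```

Under these assumptions, `apply_headroom 400 15 100 4000` prints `460` for the CPU settings above, and an observed usage of 50m would be lifted to the 100m floor.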

Step 4: Create a Scaling Policy via API

curl -X POST -H "X-API-Key: ${CASTAI_API_KEY}" \
  -H "Content-Type: application/json" \
  "https://api.cast.ai/v1/workload-autoscaling/clusters/${CASTAI_CLUSTER_ID}/policies" \
  -d '{
    "name": "cost-optimized",
    "applyType": "IMMEDIATE",
    "management": {
      "cpu": {
        "function": "QUANTILE",
        "args": { "quantile": 0.95 },
        "overhead": 0.15,
        "min": 50,
        "max": 8000
      },
      "memory": {
        "function": "MAX",
        "overhead": 0.20,
        "min": 64,
        "max": 16384
      }
    },
    "antiShrink": {
      "enabled": true,
      "cooldownSeconds": 300
    }
  }'
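
The `QUANTILE` and `MAX` functions in this policy choose which point of the observed usage distribution serves as the baseline before overhead is added. As a rough illustration only, the sketch below computes a nearest-rank quantile over raw usage samples. The helper name `quantile` is hypothetical, and the nearest-rank method is an assumption: how CAST AI actually computes these values is not stated in this document.

```shell
# Nearest-rank quantile of newline-separated numeric samples on stdin.
# $1 is the quantile as a whole percentage (95 for 0.95, 100 for MAX).
quantile() {
  sort -n | awk -v q="$1" '
    { v[NR] = $1 }
    END {
      if (NR == 0) exit 1
      rank = int((q * NR + 99) / 100)   # ceil(q/100 * NR) for whole-number q
      if (rank < 1) rank = 1
      print v[rank]
    }'
}
```

For example, `printf '%s\n' 300 100 1000 200 500 400 700 600 900 800 | quantile 50` prints `500`. With only ten samples, the 95th percentile lands on the largest sample, which is one reason short observation windows make `QUANTILE` behave much like `MAX`.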

Step 5: Monitor Workload Scaling Events

# Check scaling events
kubectl get events -n default --field-selector reason=CastAIWorkloadAutoscaled

# Show the five most recent scaling events for one workload (set WORKLOAD_ID first)
curl -s -H "X-API-Key: ${CASTAI_API_KEY}" \
  "https://api.cast.ai/v1/workload-autoscaling/clusters/${CASTAI_CLUSTER_ID}/workloads/${WORKLOAD_ID}" \
  | jq '.scalingEvents[-5:]'

Error Handling

| Error | Cause | Solution |
|-------|-------|----------|
| Workload not appearing | Missing annotation | Add `autoscaling.cast.ai/enabled: "true"` to the Deployment |
| OOMKilled after scaling | Memory headroom too low | Increase `memory-headroom` to 25 or higher |
| CPU throttling | CPU recommendation too aggressive | Increase `cpu-headroom` or raise `cpu-min` |
| No recommendations yet | Insufficient data | Wait 24h for usage data collection |
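
Most of the fixes above amount to changing one annotation on a running Deployment. The sketch below wraps `kubectl annotate ... --overwrite` for that purpose. The function name `set_castai_annotation` and the `DRY_RUN` switch are hypothetical, and the `autoscaling.cast.ai/` prefix is taken from Step 3 of this document.

```shell
# Set one autoscaling.cast.ai/* annotation on a Deployment.
# Usage: set_castai_annotation NAMESPACE DEPLOYMENT KEY VALUE
# With DRY_RUN=1 the kubectl command is printed instead of executed.
set_castai_annotation() {
  set -- kubectl annotate deployment "$2" -n "$1" \
    "autoscaling.cast.ai/$3=$4" --overwrite
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}
```

For an OOMKilled workload, `DRY_RUN=1 set_castai_annotation default my-api memory-headroom 25` previews the change; dropping `DRY_RUN=1` applies it.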

Resources

Next Steps

For troubleshooting CAST AI errors, see castai-common-errors.
