Builds scalable cloud infrastructure across AWS, Azure, GCP, and Cloudflare using containers, Kubernetes, and Infrastructure as Code (IaC).

/plugin marketplace add pluginagentmarketplace/custom-plugin-cloudflare

/plugin install custom-plugin-cloudflare@pluginagentmarketplace-cloudflare

| Attribute | Value |
|---|---|
| Role | Cloud architecture and infrastructure automation |
| DO | Cloud platforms, containers, K8s, IaC, CI/CD |
| DON'T | Application code (→ core-developer), Security compliance (→ system-architect) |

| Platform | Market Share (approx.) | Best For | Pricing Model |
|---|---|---|---|
| AWS | 32% | Broadest service catalog, general-purpose workloads | Pay-per-use |
| Azure | 24% | Microsoft stack, enterprise | Enterprise agreements |
| GCP | 11% | Analytics, ML, data | Sustained use discounts |
| Cloudflare | N/A (edge network) | CDN, edge compute, Workers | Simple pricing, no egress fees |

AWS learning path: IAM Basics (1 wk) → EC2/VPC (2 wk) → S3/RDS (2 wk) → Lambda (1 wk) → ECS/EKS (3 wk) → CloudWatch (1 wk) → Cost Optimization (ongoing)

Essential Services: EC2, S3, RDS, Lambda, IAM, VPC, CloudFront, CloudWatch

Cloudflare learning path: DNS/CDN (1 wk) → Workers (2 wk) → D1/KV (1 wk) → R2 Storage (1 wk) → Pages (1 wk) → Zero Trust (2 wk)

Advantages: No egress fees, global edge, simple pricing

┌──────────────────────────────────────────────────────────────────┐
│                         DOCKER WORKFLOW                          │
├──────────────────────────────────────────────────────────────────┤
│  Dockerfile → Build Image → Push to Registry → Run Container     │
│                                                                  │
│  Best Practices:                                                 │
│    • Multi-stage builds (reduce image size)                      │
│    • Non-root user (security)                                    │
│    • .dockerignore (exclude unnecessary files)                   │
│    • Layer caching (faster builds)                               │
└──────────────────────────────────────────────────────────────────┘
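
# docker-compose.yml
services:
  app:
    build: .
    ports: ["3000:3000"]
    depends_on: [db, redis]
  db:
    image: postgres:16
    environment:
      POSTGRES_PASSWORD: change-me  # placeholder; the official image refuses to initialize without one
    volumes: [pg_data:/var/lib/postgresql/data]
  redis:
    image: redis:alpine
# Named volumes must be declared at the top level
volumes:
  pg_data:
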
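
The short list form of `depends_on` only controls start order; it does not wait for the database to accept connections. The following is a minimal sketch of gating the app on a health check, assuming Docker Compose v2 and the `pg_isready` tool that ships in the official `postgres` image. The password value and the `postgres` user are placeholders, not values from this project.

```yaml
# docker-compose.yml (sketch): start the app only after the database is healthy
services:
  app:
    build: .
    ports: ["3000:3000"]
    depends_on:
      db:
        condition: service_healthy   # wait for the healthcheck below to pass
      redis:
        condition: service_started   # start order only
  db:
    image: postgres:16
    environment:
      POSTGRES_PASSWORD: change-me   # placeholder; load from a secret or .env file in practice
    volumes: [pg_data:/var/lib/postgresql/data]
    healthcheck:
      test: ["CMD-SHELL", "pg_isready -U postgres"]
      interval: 5s
      timeout: 3s
      retries: 10
  redis:
    image: redis:alpine
volumes:
  pg_data:
```

`docker compose config` prints the resolved file and is a quick way to catch indentation or schema mistakes before running `docker compose up`.
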
| Resource | Purpose |
|---|---|
| Pod | Smallest deployable unit |
| Deployment | Manages pod replicas |
| Service | Network access to pods |
| Ingress | HTTP routing |
| ConfigMap/Secret | Non-sensitive configuration / sensitive values (credentials, tokens) |
| StatefulSet | Stateful applications |
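
To make the table concrete, here is a minimal sketch of a Deployment that manages three pod replicas and a Service that gives them a stable network address. The name `web`, the image `registry.example.com/web:1.0.0`, and port 3000 are hypothetical placeholders.

```yaml
# web.yaml (sketch): a Deployment plus the Service that exposes its pods
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web
spec:
  replicas: 3                 # the Deployment keeps three pods running
  selector:
    matchLabels:
      app: web                # must match the pod template labels below
  template:
    metadata:
      labels:
        app: web
    spec:
      containers:
        - name: web
          image: registry.example.com/web:1.0.0   # hypothetical image
          ports:
            - containerPort: 3000
---
apiVersion: v1
kind: Service
metadata:
  name: web
spec:
  selector:
    app: web                  # routes to any pod carrying this label
  ports:
    - port: 80                # port exposed inside the cluster
      targetPort: 3000        # container port that receives the traffic
```

Applying it with `kubectl apply -f web.yaml` creates both objects; `kubectl get pods -l app=web` should then list the three replicas.
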

Kubernetes learning path: Pods/Deployments (2 wk) → Services (1 wk) → ConfigMaps/Secrets (1 wk) → Helm (2 wk) → Production Patterns (ongoing)

project/
├── main.tf              # Primary resources
├── variables.tf         # Input variables
├── outputs.tf           # Output values
├── providers.tf         # Provider config
├── versions.tf          # Version constraints
├── modules/             # Reusable components
│   ├── vpc/
│   ├── eks/
│   └── rds/
└── environments/        # Per-env configs
    ├── dev.tfvars
    ├── staging.tfvars
    └── prod.tfvars

Always run terraform plan before terraform apply.

name: CI/CD
on:
  push:
    branches: [main]
jobs:
  build:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Build
        run: docker build -t app .
      - name: Test
        run: docker run app pytest
      - name: Deploy
        if: github.ref == 'refs/heads/main'
        run: ./deploy.sh

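
The plan-before-apply rule can also be enforced in CI. The following is a sketch of a separate workflow that runs a plan on every pull request, assuming the `hashicorp/setup-terraform` action and the `environments/dev.tfvars` file from the layout above. Cloud credentials and backend configuration are deliberately omitted and would need to be supplied as repository secrets.

```yaml
# .github/workflows/terraform.yml (sketch): produce a reviewable plan on pull requests
name: Terraform
on:
  pull_request:
    branches: [main]
jobs:
  plan:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: hashicorp/setup-terraform@v3
      - name: Init
        run: terraform init -input=false
      - name: Validate
        run: terraform validate
      - name: Plan
        # -out saves the plan so a later apply can run exactly what was reviewed
        run: terraform plan -input=false -var-file=environments/dev.tfvars -out=tfplan
```

Saving the plan with `-out=tfplan` and later running `terraform apply tfplan` applies exactly the changes that were reviewed, rather than recomputing them.
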
| Strategy | Risk | Rollback | Use Case |
|---|---|---|---|
| Rolling | Low | Slow | Default |
| Blue-Green | Low | Fast | Critical apps |
| Canary | Very Low | Fast | High traffic |
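
One common way to get the fast rollback of a blue-green deployment on Kubernetes is to run two Deployments side by side and switch a Service selector between them. The sketch below assumes two hypothetical Deployments whose pods are labelled `version: blue` and `version: green`; the name `web` and the ports are placeholders.

```yaml
# service.yaml (sketch): the Service decides which colour receives live traffic
apiVersion: v1
kind: Service
metadata:
  name: web
spec:
  selector:
    app: web
    version: blue     # change to "green" to cut over; change back to roll back
  ports:
    - port: 80
      targetPort: 3000
```

Because only the selector changes, the cutover and the rollback are both a single `kubectl apply`, and the idle colour stays running until it is explicitly scaled down.
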
┌─────────────────────────────────────────────────────────────────┐
│                       OBSERVABILITY STACK                       │
├─────────────────────────────────────────────────────────────────┤
│  Metrics: Prometheus → Grafana                                  │
│  Logs:    Loki / ELK Stack                                      │
│  Traces:  Jaeger / Tempo                                        │
│  Alerts:  Prometheus Alertmanager → PagerDuty/Slack             │
└─────────────────────────────────────────────────────────────────┘
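
As an illustration of the alerting path, here is a minimal Prometheus alerting rule. It is a sketch: the group name, the `severity: page` label, and the five-minute window are assumptions, and Alertmanager would still need a route that maps that label to PagerDuty or Slack.

```yaml
# alerts.yml (sketch): fire when a scrape target has been down for five minutes
groups:
  - name: availability
    rules:
      - alert: InstanceDown
        expr: up == 0          # "up" is set to 0 by Prometheus when a scrape fails
        for: 5m                # must stay true this long before the alert fires
        labels:
          severity: page       # hypothetical label used for Alertmanager routing
        annotations:
          summary: "Instance {{ $labels.instance }} is down"
          description: "A target of job {{ $labels.job }} has been unreachable for more than 5 minutes."
```

The file can be checked with `promtool check rules alerts.yml` before it is referenced from `rule_files` in the Prometheus configuration.
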
Container not starting?
├─► Check: Image exists? → docker pull / build
├─► Check: Port conflicts? → Use different ports
├─► Check: Logs? → docker logs <container>
└─► Check: Resources? → Increase memory/CPU limits
Kubernetes pod failing?
├─► kubectl describe pod <name> → Check events
├─► kubectl logs <pod> → Check application logs
├─► kubectl get events → Cluster-wide issues
└─► Check: Resource limits, probes, image pull
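
The last check covers settings that live in the pod spec. Here is a sketch of a pod with explicit resource requests and limits plus readiness and liveness probes; the `/healthz` path, port 3000, the image name, and all numeric values are hypothetical and should be tuned to the application.

```yaml
# pod.yaml (sketch): resource limits and probes, the usual suspects when a pod keeps restarting
apiVersion: v1
kind: Pod
metadata:
  name: web
spec:
  containers:
    - name: web
      image: registry.example.com/web:1.0.0   # hypothetical image
      ports:
        - containerPort: 3000
      resources:
        requests:              # what the scheduler reserves for the container
          cpu: 100m
          memory: 128Mi
        limits:                # exceeding the memory limit gets the container OOM-killed
          cpu: 500m
          memory: 256Mi
      readinessProbe:          # failing: pod is removed from Service endpoints
        httpGet:
          path: /healthz
          port: 3000
        initialDelaySeconds: 5
        periodSeconds: 10
      livenessProbe:           # failing: container is restarted
        httpGet:
          path: /healthz
          port: 3000
        initialDelaySeconds: 15
        periodSeconds: 20
```

A liveness probe that starts too early for a slow-booting app is a frequent cause of restart loops, so `initialDelaySeconds` is worth checking first.
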
Terraform state issues?
├─► State locked? → terraform force-unlock <LOCK_ID>
├─► Drift detected? → terraform plan → apply
├─► Import existing? → terraform import
└─► State corrupted? → Restore from backup
| Symptom | Root Cause | Recovery |
|---|---|---|
| "Pod CrashLoopBackOff" | App error or resource limits too low | Check logs, increase limits |
| "ImagePullBackOff" | Wrong image name/tag or missing registry credentials | Verify image, check image pull secrets |
| "Terraform apply fails" | State out of sync with real resources | Run terraform plan, then import the resource or recreate it |
| "High cloud bill" | Unused or oversized resources | Enable cost alerts, right-size |
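
For the "ImagePullBackOff" row, the secrets check usually means confirming that the pod references a registry credential. This is a sketch assuming a secret named `regcred` (a hypothetical name) that was created beforehand with `kubectl create secret docker-registry regcred --docker-server=<registry> --docker-username=<user> --docker-password=<password>`.

```yaml
# pod.yaml (sketch): pull a private image using a docker-registry secret
apiVersion: v1
kind: Pod
metadata:
  name: web
spec:
  imagePullSecrets:
    - name: regcred            # hypothetical secret; must exist in the pod's namespace
  containers:
    - name: web
      image: registry.example.com/web:1.0.0   # hypothetical private image
```

If the pull still fails, `kubectl describe pod web` shows the exact registry error in the Events section, which distinguishes a wrong tag from an authentication failure.
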

/learn kubernetes