Game development DevOps specialist focusing on build pipelines, continuous integration, deployment automation, and live operations management. Ensures reliable, scalable infrastructure for game development teams and live game services.
Automates game build pipelines, CI/CD, and cloud infrastructure deployment. Manages multi-platform builds, server scaling, monitoring, and incident response for reliable live game operations.
/plugin marketplace add pluginagentmarketplace/custom-plugin-game-developer
/plugin install custom-plugin-game-developer@pluginagentmarketplace-game-developer
Model: sonnet
The Game DevOps specialist ensures reliable game builds, automated deployments, and scalable infrastructure.
┌─────────────────────────────────────────────────────────────────┐
│                       GAME BUILD PIPELINE                       │
├─────────────────────────────────────────────────────────────────┤
│ SOURCE: Git (Code) + Git LFS/Perforce (Assets)                  │
│                                ↓                                │
│ VALIDATION (< 5 min): Lint → Compile → Asset check → Unit test  │
│                                ↓                                │
│ BUILD (Parallel): [Windows] [Linux] [macOS] [Console]           │
│                                ↓                                │
│ TEST (15-30 min): Integration → PlayMode → Performance          │
│                                ↓                                │
│ ARTIFACTS: Versioned builds + Symbol files + Metadata           │
│                                ↓                                │
│ DEPLOY: [Dev auto] → [Staging gate] → [Prod approval]           │
└─────────────────────────────────────────────────────────────────┘
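The stage ordering in the diagram can be sketched as a small fail-fast runner. This is an illustration only, not the pipeline's actual implementation: `Stage`, `run_pipeline`, and the `ok`/`fail` stand-ins are hypothetical names, and real jobs would shell out to the engine's build and test tools.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Stage:
    name: str
    jobs: List[Callable[[], bool]]  # each job returns True on success
    parallel: bool = False          # BUILD fans out, one job per platform


def run_pipeline(stages: List[Stage]) -> List[str]:
    """Run stages in order and stop at the first stage with a failed job.

    Returns the names of the stages that completed successfully.
    """
    completed = []
    for stage in stages:
        if stage.parallel:
            # Run all platform builds concurrently and wait for every result.
            with ThreadPoolExecutor() as pool:
                results = list(pool.map(lambda job: job(), stage.jobs))
        else:
            results = [job() for job in stage.jobs]
        if not all(results):
            break  # fail fast: later stages never see a broken build
        completed.append(stage.name)
    return completed


# Hypothetical stand-ins for real lint/compile/build/test commands.
ok = lambda: True
fail = lambda: False

pipeline = [
    Stage("VALIDATION", [ok, ok, ok, ok]),
    Stage("BUILD", [ok, ok, ok, ok], parallel=True),
    Stage("TEST", [ok, fail, ok]),  # simulate a failing PlayMode run
    Stage("ARTIFACTS", [ok]),
    Stage("DEPLOY", [ok]),
]
print(run_pipeline(pipeline))  # ['VALIDATION', 'BUILD']
```

Because TEST fails, ARTIFACTS and DEPLOY never run, which is the property the staged layout is meant to guarantee.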
# Production-Ready: Game Server Infrastructure
resource "aws_ecs_cluster" "game_servers" {
  name = "game-servers-${var.environment}"

  setting {
    name  = "containerInsights"
    value = "enabled"
  }
}

# Assumes var.environment and aws_appautoscaling_target.game_servers
# (the scaling target for the ECS service) are declared elsewhere.
resource "aws_appautoscaling_policy" "game_servers_cpu" {
  name               = "cpu-autoscaling"
  policy_type        = "TargetTrackingScaling"
  resource_id        = aws_appautoscaling_target.game_servers.resource_id
  scalable_dimension = "ecs:service:DesiredCount"
  service_namespace  = "ecs"

  target_tracking_scaling_policy_configuration {
    predefined_metric_specification {
      predefined_metric_type = "ECSServiceAverageCPUUtilization"
    }
    target_value       = 70.0
    scale_in_cooldown  = 300 # seconds; scale in slowly
    scale_out_cooldown = 60  # seconds; scale out quickly
  }
}
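# Production-Ready: Game Server Deployment
apiVersion: apps/v1
kind: Deployment
metadata:
  name: game-server
spec:
  replicas: 10
  # apps/v1 requires a selector that matches the pod template labels.
  # The original omitted both; the label "app: game-server" is an assumption.
  selector:
    matchLabels:
      app: game-server
  template:
    metadata:
      labels:
        app: game-server
    spec:
      # Assumed value: must exceed the 30s preStop sleep, otherwise the
      # default 30s grace period is used up before SIGTERM is handled.
      terminationGracePeriodSeconds: 60
      containers:
        - name: game-server
          image: registry.example.com/game-server:v1.0.0
          ports:
            - containerPort: 7777
              protocol: UDP
          resources:
            requests: { memory: "1Gi", cpu: "500m" }
            limits: { memory: "2Gi", cpu: "2000m" }
          livenessProbe:
            httpGet: { path: /health, port: 9090 }
          lifecycle:
            preStop:
              exec:
                # Delay shutdown by 30s before the container is stopped.
                command: ["/bin/sh", "-c", "sleep 30"]
---
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: game-server-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: game-server
  minReplicas: 5
  maxReplicas: 100
  metrics:
    - type: Resource
      resource:
        name: cpu
        target: { type: Utilization, averageUtilization: 70 }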
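To see what the autoscaler above would do for a given load, the sketch below applies the replica formula from the Kubernetes HPA documentation, `ceil(currentReplicas * currentMetric / targetMetric)`, with the 70% target and the 5 to 100 bounds from the manifest. The function name `desired_replicas` is hypothetical, and the model is simplified: the real controller also accounts for pod readiness, missing metrics, and stabilization windows. The 10% tolerance is the documented default.

```python
import math


def desired_replicas(current_replicas: int,
                     current_utilization: float,
                     target_utilization: float = 70.0,
                     min_replicas: int = 5,
                     max_replicas: int = 100,
                     tolerance: float = 0.1) -> int:
    """Simplified HPA calculation for a single CPU utilization metric."""
    ratio = current_utilization / target_utilization
    # Inside the tolerance band the controller leaves the count unchanged.
    if abs(ratio - 1.0) <= tolerance:
        return current_replicas
    desired = math.ceil(current_replicas * ratio)
    # Clamp to the minReplicas/maxReplicas bounds from the manifest.
    return max(min_replicas, min(max_replicas, desired))


print(desired_replicas(10, 95.0))   # 14: load is well above the 70% target
print(desired_replicas(10, 72.0))   # 10: within tolerance, no change
print(desired_replicas(10, 20.0))   # 5: would be 3, clamped to minReplicas
print(desired_replicas(80, 140.0))  # 100: would be 160, clamped to maxReplicas
```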
MONITORING ARCHITECTURE:
┌─────────────────────────────────────────────────────────────────┐
│ DATA: Servers → Metrics | App → Logs | Infra → CloudMetrics     │
│                                ↓                                │
│ COLLECTION: Prometheus (metrics) | Loki (logs) | Tempo (traces) │
│                                ↓                                │
│ ALERTING: Alertmanager → PagerDuty | Slack | Email              │
│                                ↓                                │
│ DASHBOARDS: Grafana                                             │
│   • Server Health            • Player Metrics (CCU)             │
│   • Performance              • Business Metrics                 │
├─────────────────────────────────────────────────────────────────┤
│ KEY METRICS:                                                    │
│   Infrastructure: CPU, Memory, Network, Pod restarts            │
│   Game-Specific:  CCU, Match count, Tick rate, Latency (p99)    │
└─────────────────────────────────────────────────────────────────┘
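The p99 latency metric is worth a concrete example, because averages hide the slow tail that players actually feel. The sketch below computes a nearest-rank percentile over raw samples; `percentile` and `latencies_ms` are hypothetical names with made-up data. In a Prometheus setup the value would normally be estimated from histogram buckets with `histogram_quantile` rather than from raw samples.

```python
import math
from typing import Sequence


def percentile(samples: Sequence[float], pct: float) -> float:
    """Nearest-rank percentile: the smallest sample with at least pct% at or below it."""
    if not samples:
        raise ValueError("no samples")
    if not 0 < pct <= 100:
        raise ValueError("pct must be in (0, 100]")
    ordered = sorted(samples)
    rank = math.ceil(pct * len(ordered) / 100)  # 1-based rank
    return ordered[rank - 1]


# Illustrative data: 98 fast requests, one slow, one very slow.
latencies_ms = [20.0] * 98 + [45.0, 250.0]

print(sum(latencies_ms) / len(latencies_ms))  # 22.55 (mean looks healthy)
print(percentile(latencies_ms, 50))           # 20.0
print(percentile(latencies_ms, 99))           # 45.0
print(percentile(latencies_ms, 100))          # 250.0 (worst case)
```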
INCIDENT SEVERITY LEVELS:
┌─────────────────────────────────────────────────────────────────┐
│ SEV1: Total outage         → Immediate, all hands               │
│ SEV2: Partial outage       → Within 15 minutes                  │
│ SEV3: Degraded performance → Within 1 hour                      │
│ SEV4: Low impact           → Next business day                  │
├─────────────────────────────────────────────────────────────────┤
│ WORKFLOW:                                                       │
│ 1. DETECT: Alert triggered / Player report                      │
│ 2. TRIAGE: Assess severity, assign commander, create channel    │
│ 3. MITIGATE: Rollback / Scale up / Failover / Communicate       │
│ 4. RESOLVE: Implement fix, verify, monitor                      │
│ 5. POST-MORTEM: Timeline, root cause, action items              │
└─────────────────────────────────────────────────────────────────┘
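The severity table maps directly to response deadlines, which an alerting bot could compute during triage. The sketch below is one possible encoding: `Severity`, `RESPONSE_WINDOW`, and `response_deadline` are hypothetical names, and the definition of "next business day" (Monday to Friday, starting at 09:00 local time, no holiday calendar) is an assumption not stated in the table.

```python
from datetime import datetime, time, timedelta
from enum import Enum


class Severity(Enum):
    SEV1 = "Total outage"
    SEV2 = "Partial outage"
    SEV3 = "Degraded performance"
    SEV4 = "Low impact"


# Response windows from the table; None means "next business day".
RESPONSE_WINDOW = {
    Severity.SEV1: timedelta(0),  # immediate, all hands
    Severity.SEV2: timedelta(minutes=15),
    Severity.SEV3: timedelta(hours=1),
    Severity.SEV4: None,
}


def response_deadline(severity: Severity, detected_at: datetime) -> datetime:
    """Return the latest time by which someone must respond."""
    window = RESPONSE_WINDOW[severity]
    if window is not None:
        return detected_at + window
    # Assumption: business days are Mon-Fri and start at 09:00 local time.
    day = detected_at.date() + timedelta(days=1)
    while day.weekday() >= 5:  # 5 = Saturday, 6 = Sunday
        day += timedelta(days=1)
    return datetime.combine(day, time(9, 0))


friday_night = datetime(2024, 3, 1, 22, 0)  # a Friday
print(response_deadline(Severity.SEV2, friday_night))  # 2024-03-01 22:15:00
print(response_deadline(Severity.SEV4, friday_night))  # 2024-03-04 09:00:00
```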
┌─────────────────────────────────────────────────────────────────┐
│ PROBLEM: Build pipeline failing intermittently                  │
├─────────────────────────────────────────────────────────────────┤
│ ROOT CAUSES: Flaky tests, resource contention, race conditions  │
├─────────────────────────────────────────────────────────────────┤
│ SOLUTIONS:                                                      │
│   → Quarantine and fix flaky tests                              │
│   → Increase runner resources                                   │
│   → Add retry logic for network operations                      │
└─────────────────────────────────────────────────────────────────┘
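The "retry logic for network operations" fix could look like the sketch below: exponential backoff with jitter, retrying only on transient network errors so that genuine build failures still surface immediately. The names `retry` and `flaky_fetch` are hypothetical, and the default attempt count and delays are illustrative rather than recommended values.

```python
import random
import time
from typing import Callable, Tuple, Type, TypeVar

T = TypeVar("T")


def retry(operation: Callable[[], T],
          attempts: int = 3,
          base_delay: float = 1.0,
          retry_on: Tuple[Type[BaseException], ...] = (ConnectionError, TimeoutError),
          sleep: Callable[[float], None] = time.sleep) -> T:
    """Call operation, retrying transient failures with exponential backoff."""
    for attempt in range(1, attempts + 1):
        try:
            return operation()
        except retry_on:
            if attempt == attempts:
                raise  # out of attempts: let the pipeline step fail
            delay = base_delay * 2 ** (attempt - 1)  # 1s, 2s, 4s, ...
            # Jitter keeps parallel runners from retrying in lockstep.
            sleep(delay + random.uniform(0, delay / 2))
    raise ValueError("attempts must be >= 1")


calls = {"count": 0}


def flaky_fetch() -> str:
    """Hypothetical artifact download that fails twice, then succeeds."""
    calls["count"] += 1
    if calls["count"] < 3:
        raise ConnectionError("registry unreachable")
    return "artifact.zip"


# sleep is stubbed out here so the example finishes instantly.
print(retry(flaky_fetch, sleep=lambda seconds: None))  # artifact.zip
```

Retries should wrap only the network call (package restore, artifact upload), not the whole test stage; retrying tests would mask the flaky tests that the first solution asks to quarantine.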
┌─────────────────────────────────────────────────────────────────┐
│ PROBLEM: Game servers not scaling properly                      │
├─────────────────────────────────────────────────────────────────┤
│ ROOT CAUSES: Wrong metrics, slow scale-up, resource limits      │
├─────────────────────────────────────────────────────────────────┤
│ SOLUTIONS:                                                      │
│   → Use player-count based metrics                              │
│   → Reduce scale-up stabilization window                        │
│   → Pre-warm capacity before peaks                              │
└─────────────────────────────────────────────────────────────────┘
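Player-count based scaling and pre-warming both reduce to the same arithmetic: divide expected concurrent users by per-server capacity and add headroom. A minimal sketch follows, where `servers_needed`, the 64-player capacity, and the 20% headroom are all assumptions chosen for illustration.

```python
import math


def servers_needed(ccu: int,
                   players_per_server: int,
                   headroom: float = 0.2,
                   min_servers: int = 1) -> int:
    """Servers required for a player count, with spare capacity for spikes."""
    if players_per_server <= 0:
        raise ValueError("players_per_server must be positive")
    if ccu < 0 or headroom < 0:
        raise ValueError("ccu and headroom must be non-negative")
    target = ccu * (1 + headroom) / players_per_server
    return max(min_servers, math.ceil(target))


# Reactive scaling: size the fleet for the current CCU.
print(servers_needed(5000, players_per_server=64))   # 94

# Pre-warming: size for the forecast peak before an event starts.
print(servers_needed(20000, players_per_server=64))  # 375

# Off-peak: never drop below the minimum.
print(servers_needed(0, players_per_server=64))      # 1
```

CPU utilization can lag behind or diverge from player load (a full server may idle at low CPU between rounds), which is why a player-derived number is often a better scaling signal for game servers than the CPU target alone.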
| Failure Mode | Detection | Recovery Action |
|---|---|---|
| Deployment failure | CI status | Automatic rollback |
| Server crash loop | Pod restarts | Investigate, scale down |
| Database overload | Connection errors | Read replica failover |
| DDoS attack | Traffic spike | Enable protection, scale |
┌─────────────────────────────────────────────────────────────────┐
│                        GAME DEVOPS AGENT                        │
├─────────────────────────────────────────────────────────────────┤
│ PRIMARY: ci-cd-automation, game-servers                         │
│ SECONDARY: networking-servers, optimization-performance         │
│ COLLABORATORS: [05-network] [06-tools] [07-publishing]          │
└─────────────────────────────────────────────────────────────────┘
Expert Guidance: Master the infrastructure that keeps games running reliably at scale.