Multi-cluster Kubernetes management, federation, and hybrid deployments
Manages multi-cluster Kubernetes operations using ArgoCD ApplicationSets for GitOps, Cilium Cluster Mesh for cross-cluster networking, and Velero for disaster recovery. Use this when you need to deploy applications across multiple clusters, connect services globally, or set up active-active failover configurations.
/plugin marketplace add pluginagentmarketplace/custom-plugin-kubernetes
/plugin install kubernetes-assistant@pluginagentmarketplace-kubernetes
This skill inherits all available tools. When active, it can use any tool Claude has access to.
assets/config.yaml
assets/schema.json
references/GUIDE.md
references/PATTERNS.md
scripts/validate.py
Production-grade multi-cluster Kubernetes management covering federation, cross-cluster networking, and disaster recovery patterns. This skill provides deep expertise in designing and operating globally distributed Kubernetes infrastructure.
Topology Patterns
Hub-Spoke:
                ┌─────────┐
                │   Hub   │
                │ Cluster │
                └────┬────┘
     ┌───────────────┼───────────────┐
     │               │               │
┌────▼────┐     ┌────▼────┐     ┌────▼────┐
│ Spoke 1 │     │ Spoke 2 │     │ Spoke 3 │
│  (Dev)  │     │ (Stage) │     │ (Prod)  │
└─────────┘     └─────────┘     └─────────┘
Mesh:
┌─────────┐          ┌─────────┐
│Cluster 1│◄────────►│Cluster 2│
│  (US)   │          │  (EU)   │
└────┬────┘          └────┬────┘
     │                    │
     └────────┬───────────┘
          ┌───▼───┐
          │Cluster│
          │3 (AP) │
          └───────┘
ApplicationSet Generator
apiVersion: argoproj.io/v1alpha1
kind: ApplicationSet
metadata:
  name: api-server
  namespace: argocd
spec:
  generators:
    - clusters:
        selector:
          matchLabels:
            env: production
  template:
    metadata:
      name: 'api-server-{{name}}'
    spec:
      project: default
      source:
        repoURL: https://github.com/org/api-server
        targetRevision: HEAD
        path: k8s/overlays/production
      destination:
        server: '{{server}}'
        namespace: production
      syncPolicy:
        automated:
          prune: true
          selfHeal: true
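To make the generator's behavior concrete, here is a minimal Python sketch of how the cluster generator above is expected to expand: each registered cluster whose labels match `env: production` yields one Application, with `{{name}}` and `{{server}}` filled in from that cluster. This is a simplified model for reasoning about the template, not Argo CD's implementation; the cluster names and server URLs are hypothetical.

```python
import re

def matches(labels: dict, selector: dict) -> bool:
    """True when every selector key/value pair is present in the labels."""
    return all(labels.get(key) == value for key, value in selector.items())

def render(template: str, params: dict) -> str:
    """Replace {{param}} placeholders with values from params."""
    return re.sub(r"\{\{\s*(\w+)\s*\}\}", lambda m: params[m.group(1)], template)

def expand(clusters: list, selector: dict, name_tpl: str, server_tpl: str) -> list:
    """Produce one rendered Application stub per matching cluster."""
    apps = []
    for cluster in clusters:
        if matches(cluster["labels"], selector):
            params = {"name": cluster["name"], "server": cluster["server"]}
            apps.append({
                "name": render(name_tpl, params),
                "server": render(server_tpl, params),
            })
    return apps

# Hypothetical registered clusters (in Argo CD, labels live on the cluster Secret).
CLUSTERS = [
    {"name": "prod-us", "server": "https://prod-us.example.com", "labels": {"env": "production"}},
    {"name": "prod-eu", "server": "https://prod-eu.example.com", "labels": {"env": "production"}},
    {"name": "stage", "server": "https://stage.example.com", "labels": {"env": "staging"}},
]

apps = expand(CLUSTERS, {"env": "production"}, "api-server-{{name}}", "{{server}}")
for app in apps:
    print(app["name"], "->", app["server"])
```

Under these assumptions the staging cluster is skipped, so a cluster that is registered without the `env: production` label will not receive the application.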
Register External Cluster
# Add cluster to ArgoCD
argocd cluster add prod-cluster --name prod --kubeconfig ~/.kube/prod.yaml
# List clusters
argocd cluster list
# Verify connectivity
argocd cluster get prod
Cilium Cluster Mesh
# Enable cluster mesh on each cluster
cilium clustermesh enable --context cluster1
cilium clustermesh enable --context cluster2
# Connect clusters
cilium clustermesh connect --context cluster1 --destination-context cluster2
# Verify
cilium clustermesh status
Global Service
apiVersion: v1
kind: Service
metadata:
  name: api-server
  annotations:
    service.cilium.io/global: "true"
spec:
  type: ClusterIP
  ports:
    - port: 80
      targetPort: 8080
  selector:
    app: api-server
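A global service pools the backends of same-named services across the connected clusters. The Python sketch below is a rough model of that backend selection, including the optional `service.cilium.io/affinity` annotation from the Cilium documentation (not part of the manifest above). It is an illustration under simplifying assumptions, not Cilium's datapath logic; the backend records are hypothetical.

```python
def pick_backends(backends: list, local_cluster: str, affinity: str = "none") -> list:
    """Return the backends a request may be load-balanced to.

    affinity="local"  prefers healthy backends in the local cluster,
    affinity="remote" prefers healthy backends in other clusters,
    affinity="none"   uses every healthy backend in the mesh.
    A preferred set that is empty falls back to all healthy backends.
    """
    healthy = [b for b in backends if b["healthy"]]
    if affinity == "local":
        preferred = [b for b in healthy if b["cluster"] == local_cluster]
        return preferred or healthy
    if affinity == "remote":
        preferred = [b for b in healthy if b["cluster"] != local_cluster]
        return preferred or healthy
    return healthy

# Hypothetical backends for the api-server service in two clusters.
BACKENDS = [
    {"cluster": "cluster1", "ip": "10.1.0.10", "healthy": False},
    {"cluster": "cluster2", "ip": "10.2.0.10", "healthy": True},
]

# With no healthy local backend, traffic from cluster1 goes to cluster2.
print(pick_backends(BACKENDS, "cluster1", affinity="local"))
```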
Active-Active Configuration
# External DNS for GSLB
apiVersion: externaldns.k8s.io/v1alpha1
kind: DNSEndpoint
metadata:
  name: api-global
spec:
  endpoints:
    - dnsName: api.example.com
      recordType: A
      targets:
        - 52.1.1.1  # US cluster
        - 35.2.2.2  # EU cluster
      setIdentifier: us-east
      recordTTL: 60
---
# Each cluster has identical deployment
apiVersion: apps/v1
kind: Deployment
metadata:
  name: api-server
spec:
  replicas: 3
  # ... same configuration in both clusters
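With DNS-based failover, the `recordTTL` above puts a floor on how quickly clients move away from a failed cluster. The following back-of-the-envelope sketch estimates worst-case failover time as detection time plus TTL. The health-check interval and failure threshold are assumed values (the `DNSEndpoint` itself performs no health checks, so something else, such as the DNS provider, must withdraw the unhealthy target).

```python
def worst_case_failover_seconds(check_interval_s: int,
                                failure_threshold: int,
                                record_ttl_s: int) -> int:
    """Detection time (interval x consecutive failures) plus one full DNS TTL."""
    detection_s = check_interval_s * failure_threshold
    return detection_s + record_ttl_s

RECORD_TTL_S = 60        # recordTTL from the DNSEndpoint above
CHECK_INTERVAL_S = 10    # assumed health-check interval
FAILURE_THRESHOLD = 3    # assumed consecutive failures before a target is withdrawn

estimate = worst_case_failover_seconds(CHECK_INTERVAL_S, FAILURE_THRESHOLD, RECORD_TTL_S)
print(f"worst-case failover: {estimate}s")  # prints "worst-case failover: 90s"
```

Under these assumptions the estimate is 90 seconds. Resolvers that cache records beyond their TTL would lengthen it, so treat the figure as a lower bound for planning.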
Velero Cross-Cluster Backup
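# Install Velero in both clusters, pointing both at the same bucket
# <version> is the AWS plugin release matching your Velero version;
# ./credentials-velero is a placeholder path to your AWS credentials file
velero install \
  --provider aws \
  --plugins velero/velero-plugin-for-aws:<version> \
  --bucket velero-backups \
  --secret-file ./credentials-velero \
  --backup-location-config region=us-east-1
# Create backup
velero backup create prod-backup \
  --include-namespaces production \
  --snapshot-volumes
# Restore in DR cluster
velero restore create --from-backup prod-backup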
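Before restoring in the DR cluster, it is worth confirming that the backup actually finished. The sketch below checks the `status.phase` field of a Backup object, such as the JSON that `velero backup get prod-backup -o json` returns. The sample documents are hand-written for illustration, and treating only `Completed` as restorable is a conservative assumption, not a Velero requirement.

```python
import json

# Assumption: only fully completed backups are considered safe to restore.
RESTORABLE_PHASES = {"Completed"}

def is_restorable(backup: dict) -> bool:
    """True when the backup's status.phase is in RESTORABLE_PHASES."""
    return backup.get("status", {}).get("phase") in RESTORABLE_PHASES

# Hand-written sample objects; real output contains many more fields.
completed = json.loads(
    '{"metadata": {"name": "prod-backup"}, "status": {"phase": "Completed"}}'
)
partial = json.loads(
    '{"metadata": {"name": "prod-backup"}, "status": {"phase": "PartiallyFailed"}}'
)

for backup in (completed, partial):
    verdict = "restore" if is_restorable(backup) else "investigate first"
    print(backup["status"]["phase"], "->", verdict)
```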
Rancher Fleet
apiVersion: fleet.cattle.io/v1alpha1
kind: GitRepo
metadata:
  name: api-server
  namespace: fleet-default
spec:
  repo: https://github.com/org/api-server
  branch: main
  paths:
    - k8s/
  targets:
    - clusterSelector:
        matchLabels:
          env: production
      name: production
    - clusterSelector:
        matchLabels:
          env: staging
      name: staging
Multi-Cluster Issue?
│
├── Cluster unreachable
│   ├── Check network connectivity
│   ├── Verify kubeconfig
│   └── Check cluster health
│
├── Sync failures
│   ├── Check ArgoCD logs
│   ├── Verify RBAC permissions
│   └── Check resource conflicts
│
└── Service discovery fails
    ├── Check mesh connectivity
    ├── Verify DNS configuration
    └── Check NetworkPolicies
# ArgoCD cluster status
argocd cluster list
argocd app list --cluster <server>
# Cilium mesh status
cilium clustermesh status
cilium connectivity test
# Cross-cluster DNS (clusterset.local resolves only where the Multi-Cluster Services API is enabled)
kubectl run debug --rm -it --image=nicolaka/netshoot -- \
  nslookup <service>.default.svc.clusterset.local
| Challenge | Solution |
|---|---|
| Network latency | Use regional clusters |
| State sync | Eventually consistent design |
| Failover delay | Health checks, DNS TTL |
| Config drift | GitOps, policy enforcement |
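Config drift can also be measured directly. The sketch below fingerprints each resource manifest per cluster and reports the resources that differ or are missing somewhere, along with a consistency ratio. It assumes the manifests have already been exported as dictionaries and stripped of cluster-specific fields; the sample data is hypothetical.

```python
import hashlib
import json

def digest(manifest: dict) -> str:
    """Stable fingerprint of a manifest, independent of key order."""
    canonical = json.dumps(manifest, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()

def find_drift(clusters: dict) -> list:
    """Names of resources that differ, or are missing, in at least one cluster."""
    names = set().union(*(resources.keys() for resources in clusters.values()))
    drifted = []
    for name in sorted(names):
        digests = {
            digest(resources[name]) if name in resources else None
            for resources in clusters.values()
        }
        if len(digests) > 1:
            drifted.append(name)
    return drifted

def consistency(clusters: dict) -> float:
    """Fraction of resources that are identical in every cluster."""
    names = set().union(*(resources.keys() for resources in clusters.values()))
    if not names:
        return 1.0
    return 1.0 - len(find_drift(clusters)) / len(names)

# Hypothetical exported manifests, keyed by cluster and then by resource name.
STATE = {
    "us": {"deployment/api-server": {"replicas": 3}, "service/api-server": {"port": 80}},
    "eu": {"deployment/api-server": {"replicas": 2}, "service/api-server": {"port": 80}},
}

print(find_drift(STATE))   # ['deployment/api-server']
print(consistency(STATE))  # 0.5
```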
| Metric | Target |
|---|---|
| Cross-cluster latency | <50ms (regional) |
| Failover time | <2 minutes |
| Config consistency | 100% |
| Cluster availability | 99.99% |
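To put the availability target in context, the following worked calculation shows how active-active clusters combine, and what 99.99% means in downtime. It assumes cluster failures are independent and that any single cluster can carry the full load; both are simplifications that real deployments rarely meet exactly.

```python
def combined_availability(per_cluster: float, clusters: int) -> float:
    """Availability when the service is up if at least one cluster is up."""
    return 1.0 - (1.0 - per_cluster) ** clusters

def downtime_minutes_per_year(availability: float) -> float:
    """Expected yearly downtime for a given availability (365-day year)."""
    return (1.0 - availability) * 365 * 24 * 60

# Two clusters at 99% each reach roughly the 99.99% target together.
pair = combined_availability(0.99, 2)
print(f"two clusters at 99%: {pair:.4%}")

# 99.99% leaves a budget of about 52.6 minutes of downtime per year.
print(f"yearly budget at 99.99%: {downtime_minutes_per_year(0.9999):.1f} min")
```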