From agent-almanac
Deploys and configures Istio or Linkerd service mesh in Kubernetes for secure mTLS communication, traffic management, observability, circuit breaking, and policy enforcement. Use for microservices needing canary deployments or service-level observability.
npx claudepluginhub pjt222/agent-almanac
Deploy and configure a service mesh for secure service-to-service communication and advanced traffic management.
See Extended Examples for complete configuration files and templates.
Choose and install the service mesh control plane.
For Istio:
curl -L https://istio.io/downloadIstio | ISTIO_VERSION=1.20.2 sh -
istioctl install --set profile=default -y   # "default" is the recommended base profile for production
kubectl get pods -n istio-system
For Linkerd:
curl -sL https://run.linkerd.io/install | sh
linkerd check --pre
linkerd install --crds | kubectl apply -f -   # CRDs must be installed first on Linkerd 2.12+
linkerd install --ha | kubectl apply -f -
linkerd check
Create a service mesh configuration with resource limits and tracing:
# service-mesh-config.yaml (abbreviated)
spec:
  profile: default   # recommended base profile for production
  meshConfig:
    enableTracing: true
  components:
    pilot:
      k8s:
        resources: { requests: { cpu: 500m, memory: 2Gi } }
# See EXAMPLES.md Step 1 for complete configuration
Expected: Control plane pods running in istio-system (Istio) or linkerd (Linkerd) namespace. istioctl version or linkerd version shows matching client and server versions.
On failure:
kubectl logs -n istio-system -l app=istiod (Istio) or kubectl logs -n linkerd -l linkerd.io/control-plane-component=controller (Linkerd)
kubectl get crd | grep istio or kubectl get crd | grep linkerd
Configure namespaces for automatic sidecar proxy injection.
For Istio:
# Label namespace for automatic injection
kubectl label namespace default istio-injection=enabled
kubectl get namespace -L istio-injection
For Linkerd:
# Annotate namespace for injection
kubectl annotate namespace default linkerd.io/inject=enabled
Test sidecar injection with a sample deployment:
# test-deployment.yaml (abbreviated)
apiVersion: apps/v1
kind: Deployment
spec:
  replicas: 2
  template:
    spec:
      containers:
      - name: app
        image: nginx:alpine
# See EXAMPLES.md Step 2 for complete test deployment
Apply and verify:
kubectl apply -f test-deployment.yaml
kubectl get pods -n default
# Expect 2/2 containers (app + proxy)
Expected: New pods show 2/2 containers (application + sidecar proxy). Describe output shows istio-proxy or linkerd-proxy container. Logs show successful proxy startup.
On failure:
kubectl get ns default -o yaml
kubectl get mutatingwebhookconfiguration
kubectl logs -n istio-system -l app=istiod (Istio; the sidecar injector runs inside istiod)
kubectl get deploy test-app -o yaml | istioctl kube-inject -f - | kubectl apply -f -
Enable mutual TLS for secure service-to-service communication.
For Istio:
# mtls-policy.yaml (abbreviated)
apiVersion: security.istio.io/v1beta1
kind: PeerAuthentication
metadata:
  name: default
  namespace: istio-system
spec:
  mtls:
    mode: STRICT
# See EXAMPLES.md Step 3 for per-namespace and permissive mode examples
For Linkerd:
# Linkerd enforces mTLS by default for meshed pods
linkerd viz tap deploy/test-app -n default
# Look for tls=true in the tap output (shown as a lock icon in the dashboard)
Apply and verify:
kubectl apply -f mtls-policy.yaml
# Istio: verify mTLS status ("istioctl authn tls-check" was removed in recent Istio releases)
istioctl x describe pod $(kubectl get pod -n default -l app=test-app -o jsonpath='{.items[0].metadata.name}') -n default
Expected: All connections between meshed services show mTLS enabled. istioctl x describe reports the effective PeerAuthentication policy in STRICT mode. Linkerd tap output shows tls=true for all connections. Service logs show no TLS errors.
On failure:
kubectl get certificates -A (if using cert-manager)
kubectl logs -n istio-system -l app=istiod | grep -i cert
kubectl get pods --all-namespaces -o json | jq '.items[] | select(.spec.containers | length == 1) | .metadata.name'
Configure intelligent traffic routing, retries, and circuit breaking.
Create traffic management policies:
# traffic-management.yaml (abbreviated)
apiVersion: networking.istio.io/v1beta1
kind: VirtualService
spec:
  http:
  - match:
    - uri: { prefix: /api/v2 }
    route:
    - destination: { host: api-service, subset: v2 }
      weight: 10
    - destination: { host: api-service, subset: v1 }
      weight: 90
    retries: { attempts: 3, perTryTimeout: 2s }
# See EXAMPLES.md Step 4 for complete routing, circuit breaker, and gateway configs
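The subsets and circuit breaking referenced above are defined in a DestinationRule. EXAMPLES.md has the complete version; a minimal sketch, assuming the same api-service host and version labels as the VirtualService example, might look like:

```
# destination-rule.yaml (sketch, not the complete EXAMPLES.md version)
apiVersion: networking.istio.io/v1beta1
kind: DestinationRule
metadata:
  name: api-service
spec:
  host: api-service
  subsets:
  - name: v1
    labels: { version: v1 }
  - name: v2
    labels: { version: v2 }
  trafficPolicy:
    outlierDetection:
      consecutive5xxErrors: 5     # eject an endpoint after 5 consecutive 5xx responses
      interval: 30s               # how often hosts are scanned
      baseEjectionTime: 30s       # minimum ejection duration
      maxEjectionPercent: 50      # never eject more than half the pool
```

Without a DestinationRule defining the v1/v2 subsets, the weighted routes above cannot resolve.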
For Linkerd traffic splitting:
apiVersion: split.smi-spec.io/v1alpha2
kind: TrafficSplit
spec:
  service: api-service
  backends:
  - service: api-service-v1
    weight: 900
  - service: api-service-v2
    weight: 100
Apply and test:
kubectl apply -f traffic-management.yaml
# Test traffic distribution
for i in {1..100}; do curl -s http://api.example.com/api/v2 | grep version; done | sort | uniq -c
# Monitor: istioctl dashboard kiali or linkerd viz dashboard
Expected: Traffic splits according to defined weights. Circuit breaker trips after consecutive errors. Retries occur for transient failures. Kiali/Linkerd dashboard shows traffic flow visualization.
On failure:
kubectl get svc -n production
kubectl get pods -n production --show-labels
kubectl logs -n istio-system -l app=istiod
istioctl analyze -n production to check configuration
Connect service mesh telemetry to monitoring and tracing systems.
Install observability addons:
# Istio: Prometheus, Grafana, Kiali, Jaeger
kubectl apply -f https://raw.githubusercontent.com/istio/istio/release-1.20/samples/addons/prometheus.yaml
kubectl apply -f https://raw.githubusercontent.com/istio/istio/release-1.20/samples/addons/grafana.yaml
kubectl apply -f https://raw.githubusercontent.com/istio/istio/release-1.20/samples/addons/kiali.yaml
kubectl apply -f https://raw.githubusercontent.com/istio/istio/release-1.20/samples/addons/jaeger.yaml
# Linkerd
linkerd viz install | kubectl apply -f -
linkerd jaeger install | kubectl apply -f -
Configure custom metrics and dashboards:
# service-monitor.yaml (abbreviated)
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: istio-mesh-metrics
spec:
  selector: { matchLabels: { app: istiod } }
  endpoints:
  - port: http-monitoring
    interval: 30s
# See EXAMPLES.md Step 5 for Grafana dashboards and telemetry config
Access dashboards:
istioctl dashboard grafana # or: linkerd viz dashboard
istioctl dashboard kiali
istioctl dashboard jaeger
Expected: Dashboards show service topology, request rates, latency percentiles, error rates. Distributed traces available in Jaeger. Prometheus scraping mesh metrics successfully. Custom metrics appear in queries.
On failure:
kubectl get servicemonitor -A
kubectl get pods -n istio-system
istioctl proxy-config log <pod-name> -n <namespace>
kubectl get configmap istio -n istio-system -o yaml | grep -A 5 enableTracing
Perform comprehensive health checks and set up ongoing monitoring.
# Istio validation
istioctl analyze --all-namespaces
istioctl verify-install
istioctl proxy-status
# Linkerd validation
linkerd check
linkerd viz check
linkerd diagnostics policy
# Check proxy sync status
kubectl get pods -n production -o json | \
jq '.items[] | {name: .metadata.name, proxy: .status.containerStatuses[] | select(.name=="istio-proxy").ready}'
# Monitor control plane health
kubectl get pods -n istio-system -w
kubectl top pods -n istio-system
Create health check script and alerts:
#!/bin/bash
# mesh-health-check.sh (abbreviated)
echo "=== Service Mesh Health Check ==="
kubectl get pods -n istio-system
istioctl analyze --all-namespaces
# See EXAMPLES.md Step 6 for complete health check script and alert configs
Expected: All analysis checks pass with no warnings. Proxy-status shows all proxies synced. mTLS check confirms encryption. Metrics show traffic flowing. Control plane pods stable with low resource usage.
On failure:
Review the istioctl analyze output
kubectl logs <pod> -c istio-proxy -n <namespace>
kubectl logs -n istio-system deploy/istiod --tail=100
kubectl rollout restart deploy/<deployment> -n <namespace>
Resource Exhaustion: Service mesh sidecars add roughly 100-200MB of memory per pod. Ensure the cluster has sufficient capacity and set appropriate resource limits in the injection config.
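With Istio, the injected proxy's resources can be overridden per workload via pod-template annotations; a sketch with illustrative values:

```
# Pod template annotations overriding the injected istio-proxy's resources
metadata:
  annotations:
    sidecar.istio.io/proxyCPU: "100m"
    sidecar.istio.io/proxyMemory: "128Mi"
    sidecar.istio.io/proxyCPULimit: "500m"
    sidecar.istio.io/proxyMemoryLimit: "256Mi"
```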
Configuration Conflicts: Multiple VirtualServices for same host cause undefined behavior. Use single VirtualService per host with multiple match conditions instead.
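Consolidating into one VirtualService per host can be sketched like this (path prefixes and names are illustrative):

```
# One VirtualService per host, with multiple match rules instead of multiple VirtualServices
apiVersion: networking.istio.io/v1beta1
kind: VirtualService
metadata:
  name: api-service
spec:
  hosts: [api-service]
  http:
  - match: [{ uri: { prefix: /api/v2 } }]     # most specific rules first
    route: [{ destination: { host: api-service, subset: v2 } }]
  - match: [{ uri: { prefix: /api } }]        # catch-all for the same host
    route: [{ destination: { host: api-service, subset: v1 } }]
```

Rules are evaluated in order, so place the most specific matches first.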
Certificate Expiration: mTLS certificates auto-rotate but CA root must be managed. Monitor certificate expiry with: kubectl get certificate -A and set up alerts.
Sidecar Not Injected: Pods created before namespace labeling won't have sidecars. Must recreate: kubectl rollout restart deploy/<name> -n <namespace>.
DNS Resolution Issues: Service mesh intercepts DNS. Use fully qualified names (service.namespace.svc.cluster.local) for cross-namespace calls.
Port Naming Requirement: Istio requires named ports following protocol-name pattern (e.g., http-web, tcp-db). Unnamed ports default to TCP passthrough.
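A Service following the naming convention might look like this (port names and numbers are illustrative):

```
# Named ports so Istio can detect each port's protocol
apiVersion: v1
kind: Service
metadata:
  name: api-service
spec:
  ports:
  - name: http-web    # treated as HTTP
    port: 80
  - name: grpc-api    # treated as gRPC
    port: 9090
  - name: tcp-db      # opaque TCP passthrough
    port: 5432
```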
Gradual Rollout Required: Don't enable STRICT mTLS immediately in production. Use PERMISSIVE mode during migration, verify all services meshed, then switch to STRICT.
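A mesh-wide migration stance mirrors the STRICT policy from Step 3, with the mode relaxed:

```
# Migration window: accept both mTLS and plaintext while services are onboarded
apiVersion: security.istio.io/v1beta1
kind: PeerAuthentication
metadata:
  name: default
  namespace: istio-system
spec:
  mtls:
    mode: PERMISSIVE
```

Switch the mode to STRICT only after verifying every service in the path is meshed.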
Observability Overhead: Tracing 100% of requests causes performance issues. Use 1-10% sampling in production; note that Istio's sampling value is a percentage, so sampling: 1.0 means 1%.
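In Istio the sampling rate can be set through the mesh config; a sketch (the newer Telemetry API exposes the equivalent as randomSamplingPercentage):

```
# IstioOperator fragment: sample ~1% of requests for tracing
spec:
  meshConfig:
    enableTracing: true
    defaultConfig:
      tracing:
        sampling: 1.0   # percentage, not a fraction: 1.0 = 1%
```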
Gateway vs VirtualService Confusion: Gateway configures ingress (load balancer), VirtualService configures routing. Both required for external traffic.
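The pairing can be sketched as follows (api.example.com and the resource names are illustrative):

```
# Gateway: where external traffic enters the mesh
apiVersion: networking.istio.io/v1beta1
kind: Gateway
metadata:
  name: api-gateway
spec:
  selector:
    istio: ingressgateway   # binds to the default ingress gateway pods
  servers:
  - port: { number: 80, name: http, protocol: HTTP }
    hosts: ["api.example.com"]
---
# VirtualService: how that traffic is routed once inside
apiVersion: networking.istio.io/v1beta1
kind: VirtualService
metadata:
  name: api-routes
spec:
  hosts: ["api.example.com"]
  gateways: [api-gateway]   # without this binding, the route applies only to mesh-internal traffic
  http:
  - route:
    - destination: { host: api-service }
```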
Version Compatibility: Ensure mesh version compatible with Kubernetes version. Istio supports n-1 minor versions, Linkerd typically supports last 3 Kubernetes versions.
configure-ingress-networking - Gateway configuration complements mesh ingress
deploy-to-kubernetes - Application deployment patterns that work with service mesh
setup-prometheus-monitoring - Prometheus integration for mesh metrics
manage-kubernetes-secrets - Certificate management for mTLS
enforce-policy-as-code - OPA policies that work alongside mesh authorization