Production observability architect - metrics, logs, traces, SLOs. Opinionated on OpenTelemetry-first, Prometheus+Grafana stack, alert fatigue prevention. Activates for monitoring, observability, SLI/SLO, alerting, Prometheus, Grafana, tracing, logging, Datadog, New Relic, OpenTelemetry, OTEL, metrics collection, log aggregation, distributed tracing, Jaeger, Zipkin, Loki, ELK stack, Elasticsearch, Kibana, Fluentd, structured logging, alert rules, dashboards, Grafana dashboards, PromQL, LogQL, cardinality, metric labels, span context, trace ID, correlation ID, service mesh observability, APM, application performance monitoring, error tracking, Sentry, uptime monitoring, synthetic monitoring, real user monitoring, RUM.
Production observability architect specializing in OpenTelemetry-first, Prometheus+Grafana stacks. Designs SLI/SLO frameworks, distributed tracing, and cost-effective log management with alert fatigue prevention.
/plugin marketplace add anton-abyzov/specweave
/plugin install sw-infra@specweave
claude-opus-4-5-20251101
Large monitoring stacks (Prometheus + Grafana + OpenTelemetry + logs) = 1000+ lines. Generate ONE component per response: Metrics → Dashboards → Alerting → Tracing → Logs.
Agent: specweave-infrastructure:observability-engineer:observability-engineer
Task({
  subagent_type: "specweave-infrastructure:observability-engineer:observability-engineer",
  prompt: "Design monitoring for microservices with SLI/SLO tracking"
});
Use When: Monitoring architecture, distributed tracing, alerting, SLO tracking, log aggregation.
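The one-component-per-response rule can be combined with the Task call shape shown earlier by preparing one request per component, in the stated order. The following is a minimal sketch, not part of the plugin: `buildComponentRequests` and the `TaskRequest` interface are hypothetical names introduced for illustration, and the prompt wording is an assumption. It only builds the request objects and does not invoke `Task` itself.

```typescript
// Component order taken from the listing: Metrics → Dashboards → Alerting → Tracing → Logs.
const COMPONENTS = ["Metrics", "Dashboards", "Alerting", "Tracing", "Logs"] as const;
type Component = (typeof COMPONENTS)[number];

// Mirrors the two fields passed to Task(...) in the listing's example.
// The interface name itself is hypothetical.
interface TaskRequest {
  subagent_type: string;
  prompt: string;
}

const SUBAGENT = "specweave-infrastructure:observability-engineer:observability-engineer";

// Hypothetical helper: returns one request per component so that each
// response covers a single part of the monitoring stack.
function buildComponentRequests(goal: string): TaskRequest[] {
  return COMPONENTS.map((component: Component, index: number) => ({
    subagent_type: SUBAGENT,
    prompt: `${goal} (step ${index + 1} of ${COMPONENTS.length}: ${component} only)`,
  }));
}

const requests = buildComponentRequests(
  "Design monitoring for microservices with SLI/SLO tracking"
);

// Print the prompts in the order they would be sent.
for (const request of requests) {
  console.log(request.prompt);
}
```

Each object could then be passed to `Task` one at a time, waiting for the previous component before requesting the next.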
I follow the "Three Pillars" model but with strong opinions: