Monitoring and observability with Prometheus, Grafana, ELK Stack, and distributed tracing.
Sets up Prometheus metrics, Grafana dashboards, and ELK logging with alerting rules and golden signals. Use when building monitoring for services or troubleshooting production issues with metrics, logs, and traces.
/plugin marketplace add pluginagentmarketplace/custom-plugin-devops
/plugin install devops-automation-plugin@pluginagentmarketplace-devops
This skill inherits all available tools. When active, it can use any tool Claude has access to.
assets/monitoring_config.yaml
assets/prometheus-config.yml
references/MONITORING_GUIDE.md
references/PROMQL.md
scripts/check-endpoints.sh
scripts/metrics_checker.py
Master the three pillars of observability: metrics, logs, and traces.
| Name | Type | Required | Default | Description |
|---|---|---|---|---|
| pillar | string | No | all | Observability pillar |
| tool | string | No | prometheus | Tool focus |
# PromQL
sum(rate(http_requests_total[5m])) by (service)
histogram_quantile(0.99, rate(http_request_duration_seconds_bucket[5m]))
100 * sum(rate(http_requests_total{status=~"5.."}[5m])) / sum(rate(http_requests_total[5m]))
# Prometheus API
curl http://localhost:9090/api/v1/targets
curl 'http://localhost:9090/api/v1/query?query=up'
# Reload config (requires Prometheus to be started with --web.enable-lifecycle)
curl -X POST http://localhost:9090/-/reload
# Alertmanager
amtool silence add alertname="HighLatency" --duration=2h
amtool alert
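The Prometheus API calls above can also be scripted. The following is a minimal Python sketch, not part of the skill's own scripts: it builds the instant-query URL for a PromQL expression and parses a response body in the JSON shape the HTTP API returns for instant vectors. The helper names `build_query_url` and `parse_instant_vector` and the `SAMPLE_BODY` payload are hypothetical, and the base URL is assumed to match the curl examples.

```python
import json
from urllib.parse import urlencode

# Assumption: Prometheus listens on the same local address as the curl examples.
PROMETHEUS_URL = "http://localhost:9090"


def build_query_url(promql, base_url=PROMETHEUS_URL):
    """Return the instant-query URL for a PromQL expression (URL-encoded)."""
    return f"{base_url}/api/v1/query?{urlencode({'query': promql})}"


def parse_instant_vector(body):
    """Return a list of (labels, value) pairs from an instant-query JSON body."""
    payload = json.loads(body)
    if payload.get("status") != "success":
        raise RuntimeError(f"query failed: {payload.get('error', 'unknown error')}")
    data = payload["data"]
    if data["resultType"] != "vector":
        raise ValueError(f"expected an instant vector, got {data['resultType']}")
    # Each sample is {"metric": {...labels...}, "value": [timestamp, "value-as-string"]}.
    return [(sample["metric"], float(sample["value"][1])) for sample in data["result"]]


# Hypothetical response body for the query `up`, used here instead of a live server.
SAMPLE_BODY = json.dumps({
    "status": "success",
    "data": {
        "resultType": "vector",
        "result": [
            {
                "metric": {"__name__": "up", "job": "prometheus", "instance": "localhost:9090"},
                "value": [1700000000.0, "1"],
            }
        ],
    },
})

if __name__ == "__main__":
    print(build_query_url("sum(rate(http_requests_total[5m])) by (service)"))
    for labels, value in parse_instant_vector(SAMPLE_BODY):
        print(labels, value)
```

Against a running server, the body could be fetched with `urllib.request.urlopen(build_query_url("up")).read()` and passed to `parse_instant_vector` unchanged.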
| Signal | Metric |
|---|---|
| Latency | histogram_quantile(0.99, ...) |
| Traffic | sum(rate(requests_total[5m])) |
| Errors | rate(errors_total[5m]) |
| Saturation | node_memory_MemAvailable_bytes |
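To make the latency row more concrete, here is a simplified Python sketch of how a quantile can be estimated from cumulative histogram buckets: find the bucket the target rank falls into and interpolate linearly inside it. This follows the behavior Prometheus documents for `histogram_quantile` on classic histograms, but it is an illustration, not the actual implementation. The function name `estimate_quantile` and the bucket counts are hypothetical, and the sketch assumes 0 < q <= 1, non-negative observations, at least one finite bucket, and a final `+Inf` bucket.

```python
import math


def estimate_quantile(q, buckets):
    """Estimate the q-quantile from cumulative (upper_bound, count) buckets.

    Assumes 0 < q <= 1, a lowest bucket starting at 0, and a final +Inf bucket.
    """
    buckets = sorted(buckets)
    total = buckets[-1][1]
    if total == 0:
        return math.nan
    rank = q * total
    lower_bound, lower_count = 0.0, 0.0
    for upper_bound, count in buckets:
        if count >= rank:
            if math.isinf(upper_bound):
                # Rank falls in the +Inf bucket: report the highest finite bound.
                return lower_bound
            if count == lower_count:
                return upper_bound
            # Linear interpolation inside the bucket that contains the rank.
            fraction = (rank - lower_count) / (count - lower_count)
            return lower_bound + (upper_bound - lower_bound) * fraction
        lower_bound, lower_count = upper_bound, count
    return math.nan


# Hypothetical cumulative counts for buckets le=0.1, 0.5, 1.0 and +Inf (seconds).
LATENCY_BUCKETS = [(0.1, 50), (0.5, 90), (1.0, 99), (math.inf, 100)]

if __name__ == "__main__":
    for q in (0.5, 0.95, 0.999):
        print(q, estimate_quantile(q, LATENCY_BUCKETS))
```

Because the estimate is interpolated, its accuracy depends on how closely the bucket boundaries bracket the latencies of interest.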
| Symptom | Root Cause | Solution |
|---|---|---|
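| No data | Scrape failing | Check the targets page for scrape errors |
| Alert not firing | PromQL error | Test the alert expression in the Prometheus UI |
| High cardinality | Too many labels or label values | Reduce labels, especially unbounded ones |
| Slow queries | Too much data scanned | Add aggregation or narrow the selectors |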
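For the high-cardinality row, it can help to see which label contributes the most distinct values. The sketch below is a hypothetical helper, not one of the skill's scripts: `label_cardinality` takes a list of label sets (one dict per series) and ranks labels by their number of distinct values. The sample series, including the `user_id` label, are made up for illustration.

```python
from collections import defaultdict


def label_cardinality(series):
    """Return (label, distinct_value_count) pairs, highest count first."""
    values = defaultdict(set)
    for labels in series:
        for name, value in labels.items():
            values[name].add(value)
    # Sort by descending count, then by label name for a stable order.
    return sorted(
        ((name, len(seen)) for name, seen in values.items()),
        key=lambda item: (-item[1], item[0]),
    )


# Hypothetical label sets for one metric; user_id is the unbounded label here.
SAMPLE_SERIES = [
    {"service": "api", "status": "200", "user_id": "u1"},
    {"service": "api", "status": "500", "user_id": "u2"},
    {"service": "web", "status": "200", "user_id": "u3"},
    {"service": "web", "status": "200", "user_id": "u4"},
]

if __name__ == "__main__":
    for name, count in label_cardinality(SAMPLE_SERIES):
        print(f"{name}: {count} distinct values")
```

A label whose count grows with the number of users or requests is usually the first candidate to drop or aggregate away.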
/targets
journalctl -u prometheus