Help us improve
Share bugs, ideas, or general feedback.
Share bugs, ideas, or general feedback.
Share bugs, ideas, or general feedback.
By ahmidbbc
Claude Code plugin that bundles the official Datadog MCP configuration and adds intelligent workflows for incident response, performance investigation, and deployment validation.
npx claudepluginhub ahmidbbc/datadops --plugin datadopsAutomated post-deployment health verification using Datadog metrics, logs, and APM data. Compares service health before/after deployment, detects regressions, and provides rollback recommendations. Use after deployments to validate release success and catch issues early.
Comprehensive incident response workflow using Datadog observability data. Automatically investigates service issues by correlating logs, metrics, APM traces, and infrastructure data. Creates incident timeline and actionable recommendations. Use when investigating production issues, service outages, or performance degradations.
Deep performance analysis using APM traces, metrics, and logs to identify bottlenecks and optimization opportunities. Correlates database queries, external API calls, and resource usage to pinpoint root causes of slow performance. Use when investigating latency issues, throughput problems, or resource optimization needs.
Quick service health assessment providing a comprehensive overview of current service status using key metrics, active alerts, and recent events. Perfect for daily health checks, incident triage, or getting rapid service insights. Use when you need fast service status or as a starting point for deeper investigation.
External network access
Connects to servers outside your machine
Share bugs, ideas, or general feedback.
Own this plugin?
Verify ownership to unlock analytics, metadata editing, and a verified badge.
Sign in to claimOwn this plugin?
Verify ownership to unlock analytics, metadata editing, and a verified badge.
Sign in to claimBased on adoption, maintenance, documentation, and repository signals. Not a security audit or endorsement.
Collect comprehensive infrastructure performance metrics
Editorial "Observability & Monitoring" bundle for Claude Code from Antigravity Awesome Skills.
DevsForge site reliability engineering specialist for building resilient and scalable systems
Observability & reliability engineer โ monitoring, alerting, SRE, incident response, SLOs
๐ Specialised SRE skills for outage investigations, monitoring graphs, and post-mortems on Google Cloud Platform.
Production reliability and observability across all environments. Master Datadog, CloudWatch, monitoring, incident response, SRE practices, and audit logging for enterprise compliance.
Advanced Datadog workflows for Claude Code - Transforms raw monitoring data into actionable insights through intelligent DatadOps automation.
https://github.com/user-attachments/assets/d7403866-67db-448b-8fee-f58beeed79fc
A lightweight Claude Code plugin that bundles the official Datadog MCP server configuration and adds intelligent workflows on top of it. The official server provides access to raw Datadog data; DatadOps orchestrates that data into complete operational workflows.
| Aspect | Official Datadog MCP | DatadOps |
|---|---|---|
| Purpose | Raw data access | Intelligent workflows |
| Usage | search_datadog_logs, get_datadog_metric | "Investigate payment service issues" |
| Output | JSON data | Actionable insights + recommendations |
| Expertise | Requires Datadog knowledge | Guides non-experts |
| Scope | Individual queries | End-to-end scenarios |
User: "Payment service is returning 500 errors. Help me investigate."
DatadOps Response:
๐จ INCIDENT ANALYSIS - Payment Service
SEVERITY: HIGH (Customer Impact Detected)
DURATION: 23 minutes (since 14:37 UTC)
ROOT CAUSE ANALYSIS:
โข Database connection pool exhaustion (95% utilization)
โข Spike in payment processing requests (+300% vs baseline)
โข Payment gateway API latency increased to 2.4s (baseline: 200ms)
IMMEDIATE ACTIONS:
1. Scale payment service to 6 instances (currently 3)
2. Increase DB connection pool from 20โ40
3. Enable circuit breaker for payment gateway
IMPACT:
โข ~1,200 affected transactions
โข Payment success rate: 45% (baseline: 99.2%)
โข Estimated revenue impact: $15K/hour
MONITORING:
โข payment.success_rate < 95%: Page oncall
โข db.connections > 90%: Auto-scale
User: "Checkout service is slow. What's the bottleneck?"
DatadOps Response:
๐ PERFORMANCE ANALYSIS - Checkout Service
PERFORMANCE DEGRADATION: 340ms โ 1.2s (P95 latency)
BOTTLENECK BREAKDOWN:
1. ๐ฅ Database queries: 420ms (35% of total)
- Order validation: 180ms (missing index on user_id)
- Inventory check: 240ms (N+1 query pattern)
2. ๐ฅ External API calls: 380ms (32% of total)
- Payment gateway: 280ms (network latency)
- Inventory service: 100ms (within SLA)
3. ๐ฅ Application logic: 260ms (22% of total)
- Price calculation: 180ms (inefficient algorithm)
OPTIMIZATION RECOMMENDATIONS:
โข HIGH IMPACT: Add index on orders(user_id) โ -180ms
โข MEDIUM IMPACT: Batch inventory queries โ -150ms
โข LOW IMPACT: Cache price calculations โ -100ms
ESTIMATED IMPROVEMENT: 1.2s โ 590ms (51% faster)
User: "Validate the checkout-service v2.1.4 deployment."
DatadOps Response:
๐ DEPLOYMENT HEALTH - Checkout Service v2.1.4
OVERALL HEALTH SCORE: 92/100 โ
HEALTHY