Provides AWS CloudFormation templates for CloudWatch metrics, alarms, dashboards, log groups, anomaly detection, synthesized canaries, and Application Signals for production infrastructure monitoring.
From developer-kit-awsnpx claudepluginhub giuseppe-trisciuoglio/developer-kit --plugin developer-kit-awsThis skill is limited to using the following tools:
references/alarms.mdreferences/constraints.mdreferences/dashboards.mdreferences/examples.mdreferences/logs.mdreferences/reference.mdGuides Next.js Cache Components and Partial Prerendering (PPR) with cacheComponents enabled. Implements 'use cache', cacheLife(), cacheTag(), revalidateTag(), static/dynamic optimization, and cache debugging.
Migrates code, prompts, and API calls from Claude Sonnet 4.0/4.5 or Opus 4.1 to Opus 4.5, updating model strings on Anthropic, AWS, GCP, Azure platforms.
Optimizes cloud costs on AWS, Azure, GCP via rightsizing, tagging strategies, reserved instances, spot usage, and spending analysis. Use for expense reduction and governance.
Creates CloudWatch monitoring infrastructure using CloudFormation templates: metrics, alarms, dashboards, log groups, anomaly detection, synthesized canaries, and Application Signals.
Follow these steps to create CloudWatch monitoring infrastructure with CloudFormation:
Specify metric namespaces, dimensions, and threshold values:
Parameters:
ErrorRateThreshold:
Type: Number
Default: 5
Description: Error rate threshold for alarms (percentage)
LatencyThreshold:
Type: Number
Default: 1000
Description: Latency threshold in milliseconds
CpuUtilizationThreshold:
Type: Number
Default: 80
Description: CPU utilization threshold (percentage)
LogRetentionDays:
Type: Number
Default: 30
AllowedValues:
- 1
- 3
- 7
- 14
- 30
- 60
- 90
- 120
- 365
Description: Number of days to retain log events
Set up alarms for CPU, memory, disk, and custom metrics:
Resources:
HighCpuAlarm:
Type: AWS::CloudWatch::Alarm
Properties:
AlarmName: !Sub "${AWS::StackName}-high-cpu"
AlarmDescription: Trigger when CPU utilization exceeds threshold
MetricName: CPUUtilization
Namespace: AWS/EC2
Dimensions:
- Name: InstanceId
Value: !Ref InstanceId
Statistic: Average
Period: 60
EvaluationPeriods: 3
Threshold: !Ref CpuUtilizationThreshold
ComparisonOperator: GreaterThanThreshold
AlarmActions:
- !Ref AlarmTopic
ErrorRateAlarm:
Type: AWS::CloudWatch::Alarm
Properties:
AlarmName: !Sub "${AWS::StackName}-error-rate"
MetricName: ErrorRate
Namespace: !Ref CustomNamespace
Dimensions:
- Name: Service
Value: !Ref ServiceName
Statistic: Average
Period: 60
EvaluationPeriods: 5
Threshold: !Ref ErrorRateThreshold
ComparisonOperator: GreaterThanThreshold
Define SNS topics for notification delivery:
Resources:
AlarmNotificationTopic:
Type: AWS::SNS::Topic
Properties:
DisplayName: !Sub "${AWS::StackName}-alarms"
TopicName: !Sub "${AWS::StackName}-alarms"
AlarmTopicPolicy:
Type: AWS::SNS::TopicPolicy
Properties:
PolicyDocument:
Statement:
- Effect: Allow
Principal:
Service: cloudwatch.amazonaws.com
Action: sns:Publish
Resource: !Ref AlarmNotificationTopic
Topics:
- !Ref AlarmNotificationTopic
Build visualization widgets for metrics across resources:
Resources:
MonitoringDashboard:
Type: AWS::CloudWatch::Dashboard
Properties:
DashboardName: !Sub "${AWS::StackName}-dashboard"
DashboardBody: !Sub |
{
"widgets": [
{
"type": "metric",
"x": 0,
"y": 0,
"width": 12,
"height": 6,
"properties": {
"title": "CPU Utilization",
"metrics": [["AWS/EC2", "CPUUtilization", "InstanceId", "${InstanceId}"]],
"period": 300,
"stat": "Average",
"region": "${AWS::Region}"
}
}
]
}
Configure retention policies and encryption settings:
Resources:
ApplicationLogGroup:
Type: AWS::Logs::LogGroup
Properties:
LogGroupName: !Sub "/aws/applications/${Environment}/${ApplicationName}"
RetentionInDays: !Ref LogRetentionDays
KmsKeyId: !Ref LogEncryptionKey
Create metrics from log data:
Resources:
ErrorMetricFilter:
Type: AWS::Logs::MetricFilter
Properties:
LogGroupName: !Ref ApplicationLogGroup
FilterPattern: '[level="ERROR", msg]'
MetricTransformations:
- MetricValue: "1"
MetricNamespace: !Sub "${AWS::StackName}/Application"
MetricName: ErrorCount
Build multi-condition alarm logic:
Resources:
SystemHealthComposite:
Type: AWS::CloudWatch::CompositeAlarm
Properties:
AlarmName: !Sub "${AWS::StackName}-system-health"
AlarmRule: !Or
- !Ref HighCpuAlarm
- !Ref ErrorRateAlarm
AlarmActions:
- !Ref AlarmTopic
Create saved queries for log analysis:
Resources:
ErrorAnalysisQuery:
Type: AWS::Logs::QueryDefinition
Properties:
Name: !Sub "${AWS::StackName}-errors"
LogGroupNames:
- !Ref ApplicationLogGroup
QueryString: |
fields @timestamp, @message
| filter @message like /ERROR/
| sort @timestamp desc
| limit 100
Before deploying, validate the CloudFormation template:
aws cloudformation validate-template --template-body file://template.yaml
For parameterized templates, test with sample values:
aws cloudformation validate-template \
--template-body file://monitoring.yaml \
--capabilities CAPABILITY_IAM
Deploy the stack and verify resources:
# Deploy stack
aws cloudformation create-stack \
--stack-name my-monitoring-stack \
--template-body file://monitoring.yaml \
--parameters file://parameters.json \
--capabilities CAPABILITY_IAM
# Wait for completion
aws cloudformation wait stack-create-complete \
--stack-name my-monitoring-stack
# Verify alarms are in OK state
aws cloudwatch describe-alarms --stack-name my-monitoring-stack
# Check dashboard accessibility
aws cloudwatch get-dashboard --dashboard-name my-monitoring-stack-dashboard
Test alarm actions before production:
# Set alarm to testing state
aws cloudwatch set-alarm-state \
--alarm-name my-alarm \
--state-value ALARM \
--state-reason "Testing alarm action"
set-alarm-state before productionAWSTemplateFormatVersion: '2010-09-09'
Description: Complete CloudWatch monitoring setup
Parameters:
Environment:
Type: String
Default: prod
AllowedValues: [dev, staging, prod]
Resources:
# SNS Topic for notifications
AlarmTopic:
Type: AWS::SNS::Topic
Properties:
DisplayName: !Sub "${Environment}-alarms"
# Alarm for high CPU
CpuAlarm:
Type: AWS::CloudWatch::Alarm
Properties:
AlarmName: !Sub "${AWS::StackName}-cpu-high"
MetricName: CPUUtilization
Namespace: AWS/EC2
Statistic: Average
Period: 300
EvaluationPeriods: 2
Threshold: 80
ComparisonOperator: GreaterThanThreshold
AlarmActions:
- !Ref AlarmTopic
# Dashboard
Dashboard:
Type: AWS::CloudWatch::Dashboard
Properties:
DashboardName: !Ref AWS::StackName
DashboardBody: !Sub |
{
"widgets": [{
"type": "metric",
"properties": {
"metrics": [["AWS/EC2", "CPUUtilization"]],
"period": 300,
"stat": "Average"
}
}]
}
Resources:
AppLogGroup:
Type: AWS::Logs::LogGroup
Properties:
LogGroupName: !Sub "/app/${Environment}"
RetentionInDays: 30
ErrorMetricFilter:
Type: AWS::Logs::MetricFilter
Properties:
LogGroupName: !Ref AppLogGroup
FilterPattern: '"ERROR"'
MetricTransformations:
- MetricValue: "1"
MetricNamespace: !Sub "${AWS::StackName}/App"
MetricName: ErrorCount
ErrorAlarm:
Type: AWS::CloudWatch::Alarm
Properties:
AlarmName: !Sub "${AWS::StackName}-errors"
MetricName: ErrorCount
MetricNamespace: !Sub "${AWS::StackName}/App"
Statistic: Sum
Period: 300
EvaluationPeriods: 1
Threshold: 1
ComparisonOperator: GreaterThanOrEqualToThreshold
For detailed implementation guidance, see:
alarms.md - CloudWatch metrics and alarms including base metric alarms, latency alarms, API Gateway errors, EC2 instance alarms, Lambda function alarms, composite alarms, anomaly detection, metric math, alarm actions (SNS, Auto Scaling, EC2), missing data treatment, custom metrics, metric filters, and high-resolution alarms
dashboards.md - CloudWatch dashboards including base template, service-specific dashboards, widget types (metric, log, text, single value, alarm status), multi-region dashboards, stacked metrics, anomaly detection widgets, math expressions, layout patterns (grid, row, column), dynamic variables, cross-account sharing, and dashboard automation
logs.md - CloudWatch logs including log group configurations, metric filters, subscription filters (Lambda, Kinesis Firehose), cross-account log aggregation, log insights queries, resource policies, export and archive tasks, CloudWatch agent configuration, log encryption with KMS, lifecycle management, centralized logging, and search patterns
constraints.md - Resource limits (5000 alarms max, 500 dashboards max), operational constraints (metric resolution, evaluation periods, dashboard widgets, cross-account), security constraints (log data access, encryption, metric filters, alarm actions), cost considerations (detailed monitoring, custom metrics, log retention, dashboard queries), and data constraints (metric age, log ingestion, filter limits)