From azure
Analyzes Azure resource health for VMs, apps, databases; diagnoses issues from logs/telemetry using MCP tools; generates remediation plans.
npx claudepluginhub atc-net/atc-agentic-toolkit --plugin azureThis skill uses the workspace's default tool permissions.
This workflow analyzes a specific Azure resource to assess its health status, diagnose potential issues using logs and telemetry data, and develop a comprehensive remediation plan for any problems discovered.
Debugs and troubleshoots Azure Container Apps and Function Apps production issues using log analysis, health checks, KQL queries, and diagnostic CLI commands. For image pull failures, cold starts, health probes, and errors.
Guides Azure MCP tool usage for listing, getting, creating, and querying resources in Storage, Key Vault, Cosmos DB, AKS clusters, and Log Analytics.
Guides Payload CMS config (payload.config.ts), collections, fields, hooks, access control, APIs. Debugs validation errors, security, relationships, queries, transactions, hook behavior.
Share bugs, ideas, or general feedback.
This workflow analyzes a specific Azure resource to assess its health status, diagnose potential issues using logs and telemetry data, and develop a comprehensive remediation plan for any problems discovered.
azmcp-*) over direct Azure CLI when availableAction: Retrieve diagnostic and troubleshooting best practices Tools: Azure MCP best practices tool Process:
Action: Locate and identify the target Azure resource Tools: Azure MCP tools + Azure CLI fallback Process:
Resource Lookup:
azmcp-subscription-listaz resource list --name <resource-name> to find matching resourcesResource Type Detection:
Action: Evaluate current resource health and availability Tools: Azure MCP monitoring tools + Azure CLI Process:
Basic Health Check:
Service-Specific Health Indicators:
Action: Analyze logs and telemetry to identify issues and patterns Tools: Azure MCP monitoring tools for Log Analytics queries Process:
Find Monitoring Sources:
azmcp-monitor-workspace-list to identify Log Analytics workspacesazmcp-monitor-table-listExecute Diagnostic Queries:
Use azmcp-monitor-log-query with targeted KQL queries based on resource type:
General Error Analysis:
// Recent errors and exceptions
union isfuzzy=true
AzureDiagnostics,
AppServiceHTTPLogs,
AppServiceAppLogs,
AzureActivity
| where TimeGenerated > ago(24h)
| where Level == "Error" or ResultType != "Success"
| summarize ErrorCount=count() by Resource, ResultType, bin(TimeGenerated, 1h)
| order by TimeGenerated desc
Performance Analysis:
// Performance degradation patterns
Perf
| where TimeGenerated > ago(7d)
| where ObjectName == "Processor" and CounterName == "% Processor Time"
| summarize avg(CounterValue) by Computer, bin(TimeGenerated, 1h)
| where avg_CounterValue > 80
Application-Specific Queries:
// Application Insights - Failed requests
requests
| where timestamp > ago(24h)
| where success == false
| summarize FailureCount=count() by resultCode, bin(timestamp, 1h)
| order by timestamp desc
// Database - Connection failures
AzureDiagnostics
| where ResourceProvider == "MICROSOFT.SQL"
| where Category == "SQLSecurityAuditEvents"
| where action_name_s == "CONNECTION_FAILED"
| summarize ConnectionFailures=count() by bin(TimeGenerated, 1h)
Pattern Recognition:
Action: Categorize identified issues and determine root causes Process:
Issue Classification:
Root Cause Analysis:
Impact Assessment:
Action: Create a comprehensive plan to address identified issues Process:
Immediate Actions (Critical issues):
Short-term Fixes (High/Medium issues):
Long-term Improvements (All issues):
Implementation Steps:
Action: Present findings and get approval for remediation actions Process:
Display Health Assessment Summary:
๐ฅ Azure Resource Health Assessment
๐ Resource Overview:
โข Resource: [Name] ([Type])
โข Status: [Healthy/Warning/Critical]
โข Location: [Region]
โข Last Analyzed: [Timestamp]
๐จ Issues Identified:
โข Critical: X issues requiring immediate attention
โข High: Y issues affecting performance/reliability
โข Medium: Z issues for optimization
โข Low: N informational items
๐ Top Issues:
1. [Issue Type]: [Description] - Impact: [High/Medium/Low]
2. [Issue Type]: [Description] - Impact: [High/Medium/Low]
3. [Issue Type]: [Description] - Impact: [High/Medium/Low]
๐ ๏ธ Remediation Plan:
โข Immediate Actions: X items
โข Short-term Fixes: Y items
โข Long-term Improvements: Z items
โข Estimated Resolution Time: [Timeline]
โ Proceed with detailed remediation plan? (y/n)
Generate Detailed Report:
# Azure Resource Health Report: [Resource Name]
**Generated**: [Timestamp]
**Resource**: [Full Resource ID]
**Overall Health**: [Status with color indicator]
## ๐ Executive Summary
[Brief overview of health status and key findings]
## ๐ Health Metrics
- **Availability**: X% over last 24h
- **Performance**: [Average response time/throughput]
- **Error Rate**: X% over last 24h
- **Resource Utilization**: [CPU/Memory/Storage percentages]
## ๐จ Issues Identified
### Critical Issues
- **[Issue 1]**: [Description]
- **Root Cause**: [Analysis]
- **Impact**: [Business impact]
- **Immediate Action**: [Required steps]
### High Priority Issues
- **[Issue 2]**: [Description]
- **Root Cause**: [Analysis]
- **Impact**: [Performance/reliability impact]
- **Recommended Fix**: [Solution steps]
## ๐ ๏ธ Remediation Plan
### Phase 1: Immediate Actions (0-2 hours)
```bash
# Critical fixes to restore service
[Azure CLI commands with explanations]
# Performance and reliability improvements
[Azure CLI commands with explanations]
# Architectural and preventive measures
[Azure CLI commands and configuration changes]