From azure
Expert knowledge for Azure SRE Agent development, including troubleshooting, decision making, limits & quotas, security, configuration, integrations & coding patterns, and deployment. Use when configuring SRE Agent tools/docs, querying KQL logs, wiring DevOps/incident integrations, deploying as a Teams bot, or handling other Azure SRE Agent development tasks. Not for Azure Monitor (use azure-monitor), Azure Reliability (use azure-reliability), Azure Resiliency (use azure-resiliency), or Azure Site Recovery (use azure-site-recovery).
npx claudepluginhub atc-net/atc-agentic-toolkit --plugin azure

This skill uses the workspace's default tool permissions.
This skill provides expert guidance for Azure SRE Agent. Covers troubleshooting, decision making, limits & quotas, security, configuration, integrations & coding patterns, and deployment. It combines local quick-reference content with remote documentation fetching capabilities.
IMPORTANT for Agent: This file may be large. Use the Category Index below to locate relevant sections, then use `read_file` with specific line ranges (e.g., `L136-L144`) to read the sections needed for the user's question.

This skill requires network access to fetch documentation content. Use `mcp_microsoftdocs:microsoft_docs_fetch` to retrieve full articles, or fall back to the WebFetch tool if the Microsoft Learn MCP server is not available.

| Category | Lines | Description |
|---|---|---|
| Troubleshooting | L29-L33 | Diagnosing Azure SRE Agent deployment/operation issues and querying its action logs with KQL to investigate failures, performance, and behavior. |
| Decision Making | L35-L40 | Guidance on SRE Agent pricing and cost drivers, when to trigger deep investigations, and how to assess incident impact, value, and performance metrics. |
| Limits & Quotas | L42-L46 | Monitoring SRE Agent usage and Azure AI Unit quotas, viewing consumption, and checking which Azure regions currently support deploying the SRE Agent. |
| Security | L48-L53 | Data residency, privacy, and security model for Azure SRE Agent, including managed identity permissions setup and configuring user roles/RBAC access. |
| Configuration | L55-L63 | Configuring Azure SRE Agent behavior, code interpreter (Python/shell), network/firewall access, and uploading/managing knowledge documents for grounding. |
| Integrations & Coding Patterns | L65-L83 | Integrating SRE Agent with DevOps, observability, incident tools (Azure DevOps, ADX, ServiceNow, PagerDuty, MCP), plus building/configuring Kusto & Python tools and notifications (Teams/Outlook). |
| Deployment | L85-L88 | How to deploy and configure the Azure SRE Agent as a Microsoft Teams bot, including setup steps, required permissions, and integration details. |
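
As a rough illustration of the lookup described above (find a category in the index, then read only its line range), here is a minimal Python sketch. The helper names `parse_category_index` and `read_section` are hypothetical and not part of the skill or of any agent tool; the sketch only assumes the `| Category | L<start>-L<end> | Description |` row shape shown in the index.

```python
import re

# Matches an index row such as "| Security | L48-L53 | ... |".
# The header and separator rows do not match because they carry no L<n>-L<m> range.
ROW = re.compile(
    r"^\|\s*(?P<name>[^|]+?)\s*\|\s*L(?P<start>\d+)-L(?P<end>\d+)\s*\|"
)


def parse_category_index(index_text):
    """Map each category name to its inclusive (start, end) line range."""
    ranges = {}
    for line in index_text.splitlines():
        match = ROW.match(line.strip())
        if match:
            ranges[match.group("name")] = (
                int(match.group("start")),
                int(match.group("end")),
            )
    return ranges


def read_section(file_lines, line_range):
    """Return the 1-indexed, inclusive slice of lines for one category."""
    start, end = line_range
    return file_lines[start - 1:end]


# Two rows copied from the index, with descriptions shortened for the example.
index = """\
| Category | Lines | Description |
|---|---|---|
| Troubleshooting | L29-L33 | ... |
| Security | L48-L53 | ... |
"""
ranges = parse_category_index(index)

# Stand-in for the real file contents: one placeholder string per line.
fake_file = [f"line {n}" for n in range(1, 101)]
security_lines = read_section(fake_file, ranges["Security"])
```

Reading only the slice for the matching category keeps the amount of text loaded per question small, which is the point of the index.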

Troubleshooting

| Topic | URL |
|---|---|
| Query Azure SRE Agent action logs with KQL | https://learn.microsoft.com/en-us/azure/sre-agent/audit-agent-actions |
| Troubleshoot Azure SRE Agent deployment and operations | https://learn.microsoft.com/en-us/azure/sre-agent/faq-troubleshooting |

Decision Making

| Topic | URL |
|---|---|
| Understand billing and cost model for Azure SRE Agent | https://learn.microsoft.com/en-us/azure/sre-agent/billing |
| Decide when to use deep investigation in SRE Agent | https://learn.microsoft.com/en-us/azure/sre-agent/deep-investigation |
| Evaluate Azure SRE Agent incident value and performance | https://learn.microsoft.com/en-us/azure/sre-agent/track-incident-value |

Limits & Quotas

| Topic | URL |
|---|---|
| Monitor Azure SRE Agent usage and Azure AI Unit limits | https://learn.microsoft.com/en-us/azure/sre-agent/monitor-agent-usage |
| Check supported Azure regions for SRE Agent | https://learn.microsoft.com/en-us/azure/sre-agent/supported-regions |

Security

| Topic | URL |
|---|---|
| Understand data residency and privacy for Azure SRE Agent | https://learn.microsoft.com/en-us/azure/sre-agent/data-privacy |
| Configure Azure SRE Agent permissions with managed identity | https://learn.microsoft.com/en-us/azure/sre-agent/permissions |
| Configure user roles and RBAC for Azure SRE Agent | https://learn.microsoft.com/en-us/azure/sre-agent/user-roles |

Configuration

| Topic | URL |
|---|---|
| Configure agent hooks to control Azure SRE Agent behavior | https://learn.microsoft.com/en-us/azure/sre-agent/agent-hooks |
| Use SRE Agent code interpreter for Python and shell | https://learn.microsoft.com/en-us/azure/sre-agent/code-interpreter |
| Configure network and firewall requirements for SRE Agent | https://learn.microsoft.com/en-us/azure/sre-agent/network-requirements |
| Upload and manage knowledge documents in SRE Agent | https://learn.microsoft.com/en-us/azure/sre-agent/tutorial-upload-knowledge-document |
| Upload and manage knowledge documents in Azure SRE Agent | https://learn.microsoft.com/en-us/azure/sre-agent/upload-knowledge-document |
| Enable and use Code Interpreter in Azure SRE Agent | https://learn.microsoft.com/en-us/azure/sre-agent/use-code-interpreter |

Deployment

| Topic | URL |
|---|---|
| Deploy Azure SRE Agent as a Microsoft Teams bot | https://learn.microsoft.com/en-us/azure/sre-agent/teams-bot |
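
The fetch guidance in this skill (prefer the Microsoft Learn MCP fetch tool, fall back to WebFetch when the MCP server is not available) can be sketched as a small try-then-fallback helper. This is only an illustration: `fetch_article`, `mcp_fetch_stub`, and `web_fetch_stub` are hypothetical names, and the stubs stand in for whatever tool-calling mechanism the agent runtime actually provides.

```python
from typing import Callable, Optional

# A fetcher takes a URL and returns the article text.
Fetcher = Callable[[str], str]


def fetch_article(url: str, primary: Optional[Fetcher], fallback: Fetcher) -> str:
    """Try the preferred fetcher; use the fallback if it is missing or fails."""
    if primary is not None:
        try:
            return primary(url)
        except Exception:
            # Assumption for this sketch: any error means the preferred
            # tool is unavailable, so fall through to the fallback.
            pass
    return fallback(url)


def mcp_fetch_stub(url: str) -> str:
    """Hypothetical stand-in for the MCP docs fetch tool being unreachable."""
    raise ConnectionError("Microsoft Learn MCP server not available")


def web_fetch_stub(url: str) -> str:
    """Hypothetical stand-in for a generic web fetch tool."""
    return f"fetched via fallback: {url}"


url = "https://learn.microsoft.com/en-us/azure/sre-agent/teams-bot"
result = fetch_article(url, mcp_fetch_stub, web_fetch_stub)
```

Passing the fetchers in as plain callables keeps the fallback order explicit and makes the helper easy to exercise without network access.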