azure-reliability
Expert knowledge for Azure Reliability development including best practices, decision making, architecture & design patterns, limits & quotas, and deployment. Use when building, debugging, or optimizing Azure Reliability applications. Not for Azure Resiliency (use azure-resiliency), Azure Monitor (use azure-monitor), Azure Service Health (use azure-service-health), Azure Site Recovery (use azure-site-recovery).
Azure Reliability Skill
This skill provides expert guidance for Azure Reliability. It covers best practices, decision making, architecture & design patterns, limits & quotas, and deployment, and it combines local quick-reference content with remote documentation fetching.
How to Use This Skill
IMPORTANT for Agent: This file may be large. Use the Category Index below to locate relevant sections, then use `read_file` with specific line ranges (e.g., `L136-L144`) to read the sections needed for the user's question.

This skill requires network access to fetch documentation content.
- Use `mcp_microsoftdocs:microsoft_docs_fetch` to retrieve full articles.
- Fallback: Use the built-in `WebFetch` tool if the Microsoft Learn MCP server is not available.
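The fetch order described above (MCP server first, built-in tool second) can be sketched as a small wrapper. This is a minimal illustration only: `fetch_with_fallback` and the two callables passed to it are hypothetical stand-ins for whatever tool bindings the agent runtime actually exposes, not part of any real API.

```python
from typing import Callable, Optional

# A fetcher takes a URL and returns the article text.
Fetcher = Callable[[str], str]


def fetch_with_fallback(url: str, primary: Optional[Fetcher], fallback: Fetcher) -> str:
    """Try the primary fetcher; use the fallback if it is missing or fails.

    `primary` and `fallback` are hypothetical wrappers (for example, around
    the Microsoft Learn MCP fetch tool and the built-in web fetch tool).
    """
    if primary is not None:
        try:
            return primary(url)
        except Exception:
            # Primary tool unavailable or errored; fall through to the fallback.
            pass
    return fallback(url)
```

Passing `None` as `primary` models the case where the MCP server is not configured at all, so the fallback is used directly.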
Category Index
| Category | Lines | Description |
|---|---|---|
| Best Practices | L33-L61 | Patterns and guidance to design, configure, and harden high-availability, resilient, and disaster-ready architectures for key Azure PaaS, data, and integration services. |
| Decision Making | L62-L67 | Guidance on picking Azure regions and services for high availability, including zone support, geographic considerations, and how nonregional (global) services affect reliability. |
| Architecture & Design Patterns | L68-L73 | Designing Azure apps for availability zones, choosing zonal vs zone-redundant resources, and hardening deployments for zone failures and high availability. |
| Limits & Quotas | L74-L78 | Details on Azure Queue Storage message size limits, including max message size, behavior when limits are exceeded, and best practices for handling large payloads. |
| Deployment | L79-L82 | Guidance for migrating Azure Functions hosting plans to zone-redundant configurations to improve availability and resilience. |
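As a rough illustration of how the Category Index might be consumed, the sketch below parses the `Lines` column into numeric ranges and slices a file's text accordingly. The function names are hypothetical, and the code assumes every range cell has the `L<start>-L<end>` form shown in the table.

```python
import re


def parse_line_range(cell: str) -> tuple[int, int]:
    """Parse a cell such as 'L62-L67' into the inclusive pair (62, 67)."""
    match = re.fullmatch(r"L(\d+)-L(\d+)", cell.strip())
    if match is None:
        raise ValueError(f"not a line range: {cell!r}")
    start, end = int(match.group(1)), int(match.group(2))
    if start > end:
        raise ValueError(f"start after end: {cell!r}")
    return start, end


def parse_category_index(table: str) -> dict[str, tuple[int, int]]:
    """Map each category name to its line range, skipping header and separator rows."""
    ranges: dict[str, tuple[int, int]] = {}
    for row in table.splitlines():
        cells = [c.strip() for c in row.strip().strip("|").split("|")]
        if len(cells) < 2:
            continue
        try:
            ranges[cells[0]] = parse_line_range(cells[1])
        except ValueError:
            continue  # header row, separator row, or malformed range
    return ranges


def slice_lines(text: str, start: int, end: int) -> list[str]:
    """Return lines start..end (1-indexed, inclusive) of text."""
    return text.splitlines()[start - 1:end]
```

For example, looking up `"Decision Making"` in the parsed index yields `(62, 67)`, which can then be passed to `slice_lines` together with the file contents.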
Best Practices
Decision Making
| Topic | URL |
|---|---|
| Identify Azure services with availability zone support | https://learn.microsoft.com/en-us/azure/reliability/availability-zones-service-support |
| Select and understand Azure nonregional services | https://learn.microsoft.com/en-us/azure/reliability/regions-nonregional-services |
Architecture & Design Patterns
| Topic | URL |
|---|---|
| Enable and plan zone-resilient Azure workloads | https://learn.microsoft.com/en-us/azure/reliability/availability-zones-enable-zone-resiliency |
| Design and harden zonal Azure resource deployments | https://learn.microsoft.com/en-us/azure/reliability/availability-zones-zonal-resource-resiliency |
Limits & Quotas
| Topic | URL |
|---|---|
| Understand Azure Queue Storage message size limits | https://learn.microsoft.com/en-us/azure/reliability/reliability-storage-queue |
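A pre-send size check is one way to act on the message size limit. The sketch below assumes the commonly documented 64 KiB maximum for a Queue Storage message; confirm the current value in the linked article. The helper names are hypothetical. Because some clients Base64-encode message bodies, which grows the payload by roughly one third, the check can account for that encoding.

```python
import base64

# Assumption: 64 KiB maximum message size; verify against the linked article.
MAX_QUEUE_MESSAGE_BYTES = 64 * 1024


def encoded_size(payload: bytes, base64_encoded: bool = True) -> int:
    """Size in bytes of the payload as it would be sent."""
    if base64_encoded:
        return len(base64.b64encode(payload))
    return len(payload)


def fits_in_queue_message(payload: bytes, base64_encoded: bool = True) -> bool:
    """True if the (optionally Base64-encoded) payload is within the assumed limit."""
    return encoded_size(payload, base64_encoded) <= MAX_QUEUE_MESSAGE_BYTES
```

When a payload does not fit, a common approach is to store it elsewhere (for example, in blob storage) and enqueue only a reference to it.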
Deployment
| Topic | URL |
|---|---|
| Migrate Azure Functions plans to zone redundancy | https://learn.microsoft.com/en-us/azure/reliability/migrate-functions |