From azure
Configures Azure API Management as an AI gateway for models, tools, and agents with semantic caching, token limits, content safety, rate limiting, jailbreak detection, and backend integration.
npx claudepluginhub joshuarweaver/cascade-code-devops-misc-1 --plugin microsoft-azure-skills-10
Slash command
/azure:azure-aigateway
Configure Azure API Management (APIM) as an AI Gateway for governing AI models, MCP tools, and agents.
To deploy APIM, use the azure-prepare skill. See APIM deployment guide.
| Category | Triggers |
|---|---|
| Model Governance | "semantic caching", "token limits", "load balance AI", "track token usage" |
| Tool Governance | "rate limit MCP", "protect my tools", "configure my tool", "convert API to MCP" |
| Agent Governance | "content safety", "jailbreak detection", "filter harmful content" |
| Configuration | "add Azure OpenAI backend", "configure my model", "add AI Foundry model" |
| Testing | "test AI gateway", "call OpenAI through gateway" |
| Policy | Purpose | Details |
|---|---|---|
| `azure-openai-token-limit` | Cost control | Model Policies |
| `azure-openai-semantic-cache-lookup` / `azure-openai-semantic-cache-store` | 60-80% cost savings | Model Policies |
| `azure-openai-emit-token-metric` | Observability | Model Policies |
| `llm-content-safety` | Safety & compliance | Agent Policies |
| `rate-limit-by-key` | MCP/tool protection | Tool Policies |
# Get gateway URL
az apim show --name <apim-name> --resource-group <rg> --query "gatewayUrl" -o tsv
# List backends (AI models)
az apim backend list --service-name <apim-name> --resource-group <rg> \
--query "[].{id:name, url:url}" -o table
# Get subscription key
az apim subscription keys list \
--service-name <apim-name> --resource-group <rg> --subscription-id <sub-id>
GATEWAY_URL=$(az apim show --name <apim-name> --resource-group <rg> --query "gatewayUrl" -o tsv)
curl -X POST "${GATEWAY_URL}/openai/deployments/<deployment>/chat/completions?api-version=2024-02-01" \
-H "Content-Type: application/json" \
-H "Ocp-Apim-Subscription-Key: <key>" \
-d '{"messages": [{"role": "user", "content": "Hello"}], "max_tokens": 100}'
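The test call above can be wrapped in a small helper that prints the HTTP status after the response body, which makes a 401 or 429 easier to spot. This is a minimal sketch, not part of the skill: the function names `chat_url` and `test_gateway` and the `APIM_KEY` environment variable are hypothetical, and the default API version is simply the one used in the curl example.

```shell
#!/bin/sh
# Hypothetical helpers; the names are illustrative and not part of the skill.

# Build the chat-completions URL for a deployment behind the gateway.
# Args: gateway URL, deployment name, optional API version
# (defaults to the version used in the curl example above).
chat_url() {
  gateway=${1%/}   # drop one trailing slash, if present
  printf '%s/openai/deployments/%s/chat/completions?api-version=%s\n' \
    "$gateway" "$2" "${3:-2024-02-01}"
}

# Send a one-message test request and print the HTTP status after the body.
# Assumes APIM_KEY holds a valid subscription key; aborts if it is unset.
test_gateway() {
  curl -sS -X POST "$(chat_url "$1" "$2")" \
    -H "Content-Type: application/json" \
    -H "Ocp-Apim-Subscription-Key: ${APIM_KEY:?set APIM_KEY first}" \
    -d '{"messages": [{"role": "user", "content": "Hello"}], "max_tokens": 100}' \
    -w '\nHTTP %{http_code}\n'
}
```

With the functions loaded, a call would look like `test_gateway "$GATEWAY_URL" <deployment>`.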
See references/patterns.md for full steps.
# Discover AI resources
az cognitiveservices account list --query "[?kind=='OpenAI']" -o table
# Create backend
az apim backend create --service-name <apim> --resource-group <rg> \
--backend-id openai-backend --protocol http --url "https://<aoai>.openai.azure.com/openai"
# Grant access (managed identity)
az role assignment create --assignee <apim-principal-id> \
--role "Cognitive Services User" --scope <aoai-resource-id>
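The `<apim-principal-id>` and `<aoai-resource-id>` placeholders above can be looked up with the CLI rather than copied from the portal. The following is a sketch under two assumptions: the APIM instance has a system-assigned managed identity enabled, and the Azure OpenAI account name and resource group are known. The function names are hypothetical.

```shell
#!/bin/sh
# Hypothetical helpers for filling in the placeholders used above.

# Print the principal ID of the APIM instance's system-assigned identity.
# Args: APIM service name, resource group.
apim_principal_id() {
  az apim show --name "$1" --resource-group "$2" \
    --query "identity.principalId" -o tsv
}

# Print the full resource ID of an Azure OpenAI account.
# Args: account name, resource group.
aoai_resource_id() {
  az cognitiveservices account show --name "$1" --resource-group "$2" \
    --query "id" -o tsv
}

# Build the backend URL passed to --url when creating the backend.
# Args: Azure OpenAI account name.
aoai_backend_url() {
  printf 'https://%s.openai.azure.com/openai\n' "$1"
}
```

The two lookup functions can then feed the role assignment directly, for example `--assignee "$(apim_principal_id <apim> <rg>)"`.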
For the recommended policy order in `<inbound>`, see references/policies.md for a complete example.
| Issue | Solution |
|---|---|
| Token limit 429 | Increase tokens-per-minute or add load balancing |
| No cache hits | Lower score-threshold to 0.7 |
| Content false positives | Increase category thresholds (5-6) |
| Backend auth 401 | Grant APIM "Cognitive Services User" role |
See references/troubleshooting.md for details.
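Two rows of the table surface as HTTP status codes, so a test script can turn the status into a first hint. This is an illustrative sketch only: `gateway_hint` is a hypothetical name, and a 401 or 429 can have other causes (for example, a missing or invalid subscription key also returns 401), so treat the output as a starting point rather than a diagnosis.

```shell
#!/bin/sh
# Hypothetical helper: map a gateway HTTP status to the matching table row.
# Arg: HTTP status code returned by the gateway.
gateway_hint() {
  case "$1" in
    429) echo "Token limit reached: increase tokens-per-minute or add load balancing" ;;
    401) echo "Backend auth failed: grant APIM the Cognitive Services User role" ;;
    2??) echo "OK" ;;
    *)   echo "No entry in the table: see references/troubleshooting.md" ;;
  esac
}
```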