Help us improve
Share bugs, ideas, or general feedback.
From agentops-toolkit
AgentOps Doctor - surface regressions, latency spikes, error rates, and safety hits across AgentOps eval history, Azure Monitor traces, and Foundry control plane.
npx claudepluginhub azure/agentops --plugin agentops-acceleratorHow this skill is triggered — by the user, by Claude, or both
Slash command
/agentops-toolkit:agentops-agentThe summary Claude sees in its skill listing — used to decide when to auto-load this skill
Use this skill when the user asks any of:
Guides technical evaluation of code review feedback: read fully, restate for understanding, verify against codebase, respond with reasoning or pushback before implementing.
Share bugs, ideas, or general feedback.
agentops-agent - Doctor skillUse this skill when the user asks any of:
This skill is the front door to agentops doctor and the
agentops agent serve Copilot Extension. It does not invent
findings - it shells out to the CLI which reads real data from:
.agentops/results/*/results.json (eval history)azure-ai-projects)Look for .agentops/agent.yaml. If absent, copy the template:
mkdir -p .agentops
cp $(python -c "import agentops, os, pathlib;
print(pathlib.Path(agentops.__file__).parent / 'templates' / 'agent.yaml')") .agentops/agent.yaml
Edit app_insights_resource_id and project_endpoint_env if the user
wants the Azure Monitor / Foundry sources to be live. Without those
values the sources skip gracefully.
agentops doctor --severity-fail critical
The command writes .agentops/agent/report.md. Exit codes:
0 - no findings at or above the configured severity floor2 - at least one finding meets the severity floor (use this in CI)1 - runtime / configuration errorOpen .agentops/agent/report.md. The report has:
lookback_runs evalsWhen summarising for the user, lead with the verdict, then the top
3 findings, each with the recommendation. Always cite the finding id
so the user can grep them later.
For each finding the report includes a Recommendation. Follow it
verbatim - for example, if the finding says "compare the latest run
against the baseline runs in .agentops/results/", actually open
those folders.
agentops cockpit)For a workspace-level operations view the user can open a local Cockpit:
pip install "agentops-toolkit[agent] @ git+https://github.com/Azure/agentops.git@develop"
agentops cockpit
# → http://127.0.0.1:8090
Cockpit reads local AgentOps artifacts first: .agentops/results/,
generated reports, .agentops/agent/history.jsonl, and workflow files.
When a Foundry project is configured, it also resolves telemetry
readiness and links to the matching Foundry and Azure Monitor views.
It is read-only and bound to localhost.
When telemetry is enabled the analyzer also emits OpenTelemetry
spans (ANALYZE watchdog) with per-severity / per-category counters,
useful for long-term retention in App Insights or any OTel backend.
Resolution order:
APPLICATIONINSIGHTS_CONNECTION_STRING (or the AgentOps-prefixed
variant) - explicit user configuration always wins.AGENTOPS_OTLP_ENDPOINT - generic OTLP/HTTP exporter.AZURE_AI_FOUNDRY_PROJECT_ENDPOINT is
set but no explicit env var is, AgentOps asks the Foundry project
for the connection string of the Application Insights resource
attached to it. Zero configuration when the user is already on
Foundry.If the user wants the watchdog inside Copilot Chat, they can:
pip install "agentops-toolkit[agent] @ git+https://github.com/Azure/agentops.git@develop"
agentops agent serve --no-verify # local dev
For production, point them at:
src/agentops/templates/agent-server/Dockerfilesrc/agentops/templates/agent-server/main.bicepsrc/agentops/templates/agent-server/README.mdThese are the deploy scaffold for hosting the watchdog as a Copilot Extension on Azure Container Apps.
agentops doctor [--workspace] [--config] [--out] [--lookback-days] [--severity-fail]agentops agent serve [--host] [--port] [--config] [--no-verify] [--workers]agentops cockpit [--host] [--port] [--workspace]skipped or error, surface that as the first
thing in the user-facing summary so they know the analyzer ran with
partial data.