From dotnet-dnceng
Analyze CI build and test status from Azure DevOps and Helix for dotnet repository PRs. Use when checking CI status, investigating failures, determining if a PR is ready to merge, or given URLs containing dev.azure.com or helix.dot.net. Also use when asked "why is CI red", "test failures", "retry CI", "rerun tests", "is CI green", "build failed", "checks failing", or "flaky tests". DO NOT USE FOR: investigating stale codeflow PRs or dependency update health, tracing whether a commit has flowed from one repo to another, reviewing code changes for correctness or style.
```
npx claudepluginhub lewing/agent-plugins --plugin dotnet-dnceng
```

This skill uses the workspace's default tool permissions.
Analyze CI build status and test failures in Azure DevOps and Helix for dotnet repositories (runtime, sdk, aspnetcore, roslyn, and more).
Bundled references and scripts:

- references/analysis-workflow.md
- references/azdo-helix-reference.md
- references/azure-cli.md
- references/build-progression-analysis.md
- references/delegation-patterns.md
- references/failure-interpretation.md
- references/helix-artifacts.md
- references/manual-investigation.md
- references/recommendation-generation.md
- references/script-modes.md
- references/sql-tracking.md
- scripts/Get-CIStatus.ps1
🚨 NEVER use `gh pr review --approve` or `--request-changes`. Only `--comment` is allowed. Approval and blocking are human-only actions.
Workflow: Gather PR context (Step 0) → collect failure data → synthesize recommendations. The agent drives the investigation; tools provide the data.
Accessing services: Start with MCP tools if available. Get repo-specific CI guidance early: it provides the investigation workflow, tool selection, failure patterns, and classification algorithm for that repo. The guidance evolves with the toolset, so it always reflects current capabilities.
If MCP tools aren't loaded, the Helix CLI tool and the helix-cli skill provide the same capabilities via bash with progressive discovery.
For AzDO, multiple tool sets may exist for different organizations; match the org in the build URL to the correct tools (see references/azdo-helix-reference.md). If queries return null, check the org before trying other approaches. For complex investigations, track what you've tried in SQL to avoid repeating failed approaches.
Use for: dev.azure.com, helix.dot.net, or GitHub PR links with failing checks. Not for: GitHub Actions workflows, non-Helix repos, or build performance (use binlog analysis).
💡 Per-repo CI patterns differ significantly. Get repo-specific guidance early: it tells you which tools to use, what log patterns to search for, and what gotchas to expect. This is the fastest path and prevents wasted calls.
If MCP tools aren't available, the Helix CLI tool provides the same capabilities via bash. A legacy PowerShell script is also available for environments that support it.
For full parameter reference and mode details, see references/script-modes.md.
| PR Type | How to detect | Interpretation shift |
|---|---|---|
| Code PR | Human author, code changes | Failures likely relate to the changes |
| Flow/Codeflow PR | Author is dotnet-maestro[bot], "Update dependencies" | Missing packages may be behavioral, not infrastructure |
| Backport | Title mentions "backport", targets release branch | Check if test exists on target branch |
| Merge PR | Merging between branches | Conflicts cause failures, not individual changes |
| Dependency update | Bumps package versions, global.json | Build failures often trace to the dependency |
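The detection column above can be sketched as a small heuristic. This is an illustrative sketch, not part of the skill: the `author`/`title`/`target_branch` inputs and the label strings are assumptions.

```python
def classify_pr_type(author: str, title: str, target_branch: str) -> str:
    """Rough PR-type heuristic mirroring the table above (illustrative only)."""
    t = title.lower()
    if author == "dotnet-maestro[bot]" or "update dependencies" in t:
        return "flow"        # missing packages may be behavioral, not infrastructure
    if "backport" in t or target_branch.startswith("release/"):
        return "backport"    # check whether the test exists on the target branch
    if "merge" in t and "branch" in t:
        return "merge"       # conflicts cause failures, not individual changes
    if "global.json" in t or t.startswith("bump"):
        return "dependency-update"
    return "code"            # failures likely relate to the changes
```

The ordering matters: a dotnet-maestro[bot] PR that also mentions global.json should still be read as a flow PR first.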
🚨 Don't re-fetch data you already have. Only make additional calls for deeper investigation (Helix log searches, binlog analysis, build progression).
Classify each failure. Determine whether it's a build error, test failure, crash, timeout, or infrastructure issue. Exit codes, log patterns, and Helix work item state all contribute; the repo-specific CI guidance includes a classification algorithm with the patterns and recommended next steps for each category. Crashes (exit codes -4, 139, 134) don't always mean tests failed: check for recoverable test results before concluding.
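As a sketch of what that classification step might look like: the crash exit codes come from the guidance above, but the log patterns and category labels here are assumptions, not the repo guidance's real algorithm.

```python
CRASH_EXIT_CODES = {-4, 139, 134}  # crash signals named above; not proof that tests failed

def classify_failure(exit_code: int, log_tail: str, has_test_results: bool) -> str:
    """Illustrative failure-category heuristic; the real algorithm ships with
    the repo-specific CI guidance."""
    if exit_code in CRASH_EXIT_CODES and not has_test_results:
        return "crash"            # look for recoverable test results first
    log = log_tail.lower()
    if "error cs" in log or "error msb" in log:
        return "build-error"      # compiler (CS) or MSBuild (MSB) diagnostics
    if "timed out" in log or "timeout" in log:
        return "timeout"
    if has_test_results:
        return "test-failure"
    return "unclassified"         # needs evidence before calling it infrastructure
```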
Cross-reference with known issues. Check which failures are already matched by Build Analysis: green means all failures are accounted for, red means some are unmatched. For each unmatched failure, search for related known issues by error message, test name, or job type. The user needs a per-failure verdict, not two separate lists.
Correlate with PR changes. If the same files appear in both the PR diff and the failure messages, the failure is likely PR-related. If not, check whether the same test fails on the target branch โ that distinguishes pre-existing flakes from regressions.
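A minimal sketch of that file-overlap check, as a hypothetical helper that assumes you already have the PR's changed-file list and the failure text:

```python
import os

def overlaps_pr_diff(diff_files: list[str], failure_text: str) -> bool:
    """True when a changed file's name appears in the failure output,
    suggesting the failure is PR-related rather than a pre-existing flake."""
    return any(os.path.basename(path) in failure_text for path in diff_files)
```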
Verify before claiming. Don't call it "infrastructure" without a Build Analysis match or target-branch verification. Don't call it "safe to retry" unless ALL failures are accounted for.
🚨 Check build progression on multi-commit PRs. If the PR has multiple commits, query AzDO for builds on `refs/pull/{PR}/merge` (sorted by queue time, top 10-20). Present a progression table showing which builds passed/failed at which SHAs; this narrows failures to the commit that introduced them. See references/build-progression-analysis.md.
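The progression table might be assembled like this. The `id`/`sourceVersion`/`result`/`queueTime` fields mirror AzDO build records but are assumed here for illustration, not taken from the skill's output:

```python
def progression_table(builds: list[dict], limit: int = 20) -> str:
    """Render newest-first passed/failed rows from AzDO-style build records."""
    newest = sorted(builds, key=lambda b: b["queueTime"], reverse=True)[:limit]
    rows = ["| Build | SHA | Result |", "|---|---|---|"]
    for b in newest:
        rows.append(f"| {b['id']} | {b['sourceVersion'][:8]} | {b['result']} |")
    return "\n".join(rows)
```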
For interpreting error categories, crash recovery, and canceled jobs: references/failure-interpretation.md
For generating recommendations from [CI_ANALYSIS_SUMMARY] JSON: references/recommendation-generation.md
🚨 Keep tables narrow: 4 short columns max (# | Job | Verdict | Issue). Put error descriptions, work item lists, and evidence in detail bullets below the table, not in cells. Wide tables wrap and become unreadable in terminals.
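For instance, a summary in that shape might look like this (job names and the issue number are made up for illustration):

```
| # | Job | Verdict | Issue |
|---|---|---|---|
| 1 | Libraries Test Run linux x64 | Known issue | #12345 |
| 2 | CoreCLR Pri0 windows x86 | Unmatched | needs investigation |

- **Failure 1:** HTTP timeout in `System.Net.Http` tests; matched by Build Analysis to known issue #12345.
- **Failure 2:** new assertion failure; not seen on the target branch, so likely PR-related.
```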
🚨 Use markdown links for PRs ([#121195](url)), builds ([Build 1305302](url)), and jobs ([job name](azdo-job-url)). The script output and MCP tools provide URLs; thread them through.
Lead with a 1-2 sentence verdict, then the summary table, then detail bullets (one per failure), then recommended actions. For the full format example: references/recommendation-generation.md.
🚨 Every failure verdict needs evidence: no "Likely flaky" without proof. Each row in your summary table must cite a specific source: known issue number, Build Analysis match, or target-branch verification. If Build Analysis didn't match it and you haven't verified the target branch, the verdict is "Unmatched: needs investigation", not "Likely flaky." A test that looks like it could be flaky is not the same as one you've verified is flaky.
❌ Don't label failures "infrastructure" without evidence. Requires: Build Analysis match, identical failure on target branch, or confirmed outage. Exception: `tests-passed-reporter-failed` is genuinely infrastructure.
❌ Don't dismiss timed-out builds. A build "failed" due to AzDO timeout can have 100% passing Helix work items. Check Helix job status before concluding failure.
❌ Missing packages on flow PRs ≠ infrastructure. Flow PRs request different packages. Check which package and why before assuming feed delay.
❌ Don't present failures and known issues as separate lists. Cross-reference them: for each `failedJobDetails` entry, state whether it matches a `knownIssues` entry or is unmatched. An `unclassified` failure can still match a known issue by error pattern.
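A cross-referencing pass over the summary JSON could be sketched like this. Only the `failedJobDetails` and `knownIssues` names come from the skill; the inner `name`/`message`/`errorPattern`/`issue` keys are assumptions made for the sketch:

```python
def per_failure_verdicts(failed_job_details: list[dict], known_issues: list[dict]) -> list[dict]:
    """Match each failed job to a known issue by error pattern, or flag it unmatched."""
    verdicts = []
    for job in failed_job_details:
        message = job.get("message", "")
        match = next((ki for ki in known_issues
                      if ki.get("errorPattern") and ki["errorPattern"] in message), None)
        verdicts.append({
            "job": job.get("name"),
            "verdict": f"matches known issue {match['issue']}" if match
                       else "Unmatched: needs investigation",
        })
    return verdicts
```

This yields the per-failure verdict list the report needs, instead of two disjoint lists the reader must join by hand.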
❌ Don't say "safe to retry" with Build Analysis red. Map each failing job to a specific known issue first.
❌ Don't use raw REST APIs when higher-level tools are available. Check your available tools for Azure DevOps and Helix operations first. REST API fallback is for when those tools are genuinely unavailable, not a first resort.
[ActiveIssue] attributes for known skipped tests