From copilot-studio
Analyzes CSV exports from Copilot Studio's Evaluate tab, identifies test failures with explanations, and proposes YAML fixes for agent topics and flows.
npx claudepluginhub microsoft/skills-for-copilot-studio --plugin copilot-studio
This skill is limited to using the following tools:
Analyze evaluation results exported from the Copilot Studio UI as CSV.
Ask the user for the CSV file path if not already provided. The file is typically exported from Copilot Studio's Evaluate tab and named `Evaluate <agent name> <date>.csv` in their Downloads folder.
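When the user is unsure of the exact path, the newest matching export can be suggested to them for confirmation. The following is a minimal sketch, not part of the skill itself: the helper name `find_latest_export` is hypothetical, and the `~/Downloads` default is an assumption based on the typical location mentioned above.

```python
from pathlib import Path
from typing import Optional


def find_latest_export(folder: Optional[Path] = None) -> Optional[Path]:
    """Return the most recently modified 'Evaluate *.csv' file in folder, or None."""
    # Assumption: exports land in ~/Downloads unless the caller says otherwise.
    if folder is None:
        folder = Path.home() / "Downloads"
    # The 'Evaluate ' prefix follows the export naming pattern; the agent name
    # and date parts vary, so they are matched with a wildcard.
    candidates = [p for p in folder.glob("Evaluate *.csv") if p.is_file()]
    if not candidates:
        return None
    # Pick the file with the latest modification time.
    return max(candidates, key=lambda p: p.stat().st_mtime)
```

The result is only a suggestion: the user should still confirm the path before the file is read.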
Read the CSV file. The in-product evaluation CSV has these columns:
| Column | Meaning |
|---|---|
| `question` | The test utterance |
| `expectedResponse` | Expected response (may be empty) |
| `actualResponse` | What the agent responded |
| `testMethodType_1` | Eval method (e.g., `GeneralQuality`) |
| `result_1` | `Pass` or `Fail` |
| `passingScore_1` | Score threshold (may be empty) |
| `explanation_1` | Why it passed/failed (e.g., "Seems relevant; Seems incomplete; Knowledge sources not cited") |
The _1 suffix indicates the first eval method. There may be additional methods (_2, _3, etc.) with the same column pattern.
Focus on failed evaluations (result_1 = Fail, or any result_N = Fail).
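The column pattern and the failure filter described above can be sketched in Python. This is an illustrative sketch under stated assumptions rather than the skill's own implementation: the function name `collect_failures` and the record keys are hypothetical, the export is assumed to be UTF-8 (with or without a byte-order mark), and `Fail` is compared case-insensitively as a precaution.

```python
import csv
import re
from pathlib import Path
from typing import Dict, List

# Matches result_1, result_2, ... and captures the numeric suffix.
RESULT_COLUMN = re.compile(r"^result_(\d+)$")


def collect_failures(csv_path: Path) -> List[Dict[str, str]]:
    """Return one record per failed eval method, across all _N column groups."""
    failures: List[Dict[str, str]] = []
    # Assumption: utf-8-sig reads plain UTF-8 and also tolerates a leading BOM.
    with open(csv_path, newline="", encoding="utf-8-sig") as handle:
        reader = csv.DictReader(handle)
        # Discover every eval method present by scanning for result_N headers.
        suffixes = []
        for name in reader.fieldnames or []:
            match = RESULT_COLUMN.match(name)
            if match:
                suffixes.append(match.group(1))
        for row in reader:
            for n in suffixes:
                result = (row.get(f"result_{n}") or "").strip()
                if result.lower() != "fail":
                    continue
                failures.append({
                    "question": row.get("question") or "",
                    "method": row.get(f"testMethodType_{n}") or "",
                    "explanation": row.get(f"explanation_{n}") or "",
                    "expectedResponse": row.get("expectedResponse") or "",
                    "actualResponse": row.get("actualResponse") or "",
                })
    return failures
```

A question that fails under two eval methods produces two records, so each explanation can be reviewed separately.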
For each failure, use the explanation column to understand the issue:
- `SearchAndSummarizeContent` nodes.
- `SendActivity` messages.
- `actualResponse` (e.g., `GenAIToolPlannerRateLimitReached`): These are runtime errors, not authoring issues. Flag them to the user as transient failures to retry.

For each failure, identify the relevant YAML file(s):
Glob: `**/agent.mcs.yml`

Propose specific YAML changes to fix each failure. Present them to the user as a summary:
Wait for user decision. The user can:
Apply accepted changes using the Edit tool. After applying, remind the user to push and publish again before re-running evaluations.
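Because runtime errors in `actualResponse` are transient rather than authoring issues, it can help to separate them from the failures that actually need YAML changes before proposing fixes. The sketch below assumes each failure is a dict with an `actualResponse` key (a hypothetical record shape), and that runtime errors can be recognized by a known code appearing in the response text. Only `GenAIToolPlannerRateLimitReached` is named in this document; the set is meant to be extended with codes you actually observe.

```python
from typing import Dict, FrozenSet, Iterable, List, Tuple

# Only this code is named in the document; add others as they are observed.
KNOWN_RUNTIME_ERRORS: FrozenSet[str] = frozenset({"GenAIToolPlannerRateLimitReached"})


def split_failures(
    failures: Iterable[Dict[str, str]],
    runtime_errors: FrozenSet[str] = KNOWN_RUNTIME_ERRORS,
) -> Tuple[List[Dict[str, str]], List[Dict[str, str]]]:
    """Partition failures into (transient, authoring) lists."""
    transient: List[Dict[str, str]] = []
    authoring: List[Dict[str, str]] = []
    for failure in failures:
        response = failure.get("actualResponse", "")
        # Assumption: a runtime error shows up as its code inside the response text.
        if any(code in response for code in runtime_errors):
            transient.append(failure)  # retry these; no YAML change needed
        else:
            authoring.append(failure)  # candidates for a proposed YAML fix
    return transient, authoring
```

Transient failures would be reported to the user as retry candidates, while only the authoring list feeds into the proposed YAML changes.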