From copilot-studio
Runs evaluations on Copilot Studio draft agents via the Power Platform Evaluation API. Lists test sets, starts runs with optional auth, polls progress, fetches results, and proposes YAML fixes. No publish needed.
```
npx claudepluginhub microsoft/skills-for-copilot-studio --plugin copilot-studio
```
Run evaluations against a Copilot Studio agent's draft — no publish needed.
The caller (test agent) must provide `--client-id` and `--workspace`. If you don't have the client ID, return immediately and tell the caller to run `test-auth` first.
All eval-api commands run in the foreground. NEVER use `run_in_background`.
```
node ${CLAUDE_SKILL_DIR}/../../scripts/eval-api.bundle.js list-testsets --workspace <path> --client-id <id>
```
You MUST ask this question and wait for the user's answer before starting the run.
Ask the user:
Does your agent use authenticated knowledge sources or connector actions (tools) that require user identity? If so, you'll need to provide a connection ID — without it, the eval runs anonymously and tools and knowledge sources will not be used.
How to obtain the connection ID:
- Go to https://make.powerautomate.com
- Open Connections from the side menu
- Select the relevant Microsoft Copilot Studio connection
- Copy the connection ID from the URL (the GUID segment after `/connections/`)

If your agent doesn't use authenticated knowledge or tools, you can skip this.
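The GUID extraction from the URL can be sketched in shell. The URL below is a hypothetical example (real connection URLs may differ); the only assumption the script relies on is that the connection ID is the first GUID-shaped segment after `/connections/`:

```shell
#!/bin/sh
# Hypothetical connection URL; the real path layout may differ.
url="https://make.powerautomate.com/environments/Default-aaaa/connections/shared_microsoftcopilotstudio/11111111-2222-3333-4444-555555555555/details"

# Strip everything up to and including "/connections/", then take the
# first GUID-shaped segment that remains.
after=${url#*/connections/}
conn_id=$(printf '%s\n' "$after" \
  | grep -oiE '[0-9a-f]{8}(-[0-9a-f]{4}){3}-[0-9a-f]{12}' \
  | head -n 1)

echo "$conn_id"
```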
Do not proceed to Step 3 until the user responds.
```
node ${CLAUDE_SKILL_DIR}/../../scripts/eval-api.bundle.js start-run --workspace <path> --client-id <id> --testset-id <id> --run-name "Draft eval <date>"
```
Add --connection-id <id> if the user provided a connection ID in Step 2.
Add --published only if the user explicitly asked for published-bot testing.
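Appending the optional flags can be sketched as below. The variable values (`ws`, `cid`, `tsid`, `conn_id`) are placeholders, and the `set --` idiom builds the argument list portably:

```shell
#!/bin/sh
# Placeholder inputs; in the real flow these come from the caller and Step 2.
ws="/path/to/workspace"; cid="client-guid"; tsid="testset-guid"
conn_id="conn-guid"   # empty string if the user skipped Step 2
published=""          # "yes" only if the user explicitly asked for it

# Build the argument list; append optional flags only when provided.
set -- start-run --workspace "$ws" --client-id "$cid" \
       --testset-id "$tsid" --run-name "Draft eval $(date +%F)"
if [ -n "$conn_id" ]; then set -- "$@" --connection-id "$conn_id"; fi
if [ "$published" = "yes" ]; then set -- "$@" --published; fi

# The real invocation would then be:
#   node ${CLAUDE_SKILL_DIR}/../../scripts/eval-api.bundle.js "$@"
cmdline="$*"
echo "$cmdline"
```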
```
node ${CLAUDE_SKILL_DIR}/../../scripts/eval-api.bundle.js get-run --workspace <path> --client-id <id> --run-id <runId>
```
Poll every 15-30 seconds. Report progress: "Processing: 3/10 test cases..."
Stop when the state is `Completed`, `Failed`, `Abandoned`, or `Cancelled`.
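The polling loop above can be sketched as follows. `get_state` is a hypothetical stand-in for calling `get-run` and extracting the state field (e.g. with `jq`); the terminal states are the ones listed above:

```shell
#!/bin/sh
# poll_run repeatedly runs the given state-fetching command until a
# terminal state appears, sleeping between attempts, then prints it.
poll_run() {
  while :; do
    state=$("$@")   # e.g. eval-api.bundle.js get-run ... | jq -r '.state'
    case "$state" in
      Completed|Failed|Abandoned|Cancelled) break ;;
    esac
    sleep 15        # poll every 15-30 seconds
  done
  printf '%s\n' "$state"
}

# Hypothetical stand-in that reports completion immediately.
get_state() { echo "Completed"; }
poll_run get_state
```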
```
node ${CLAUDE_SKILL_DIR}/../../scripts/eval-api.bundle.js get-results --workspace <path> --client-id <id> --run-id <runId>
```
Present a summary table (total, passed, failed, errors). For failures:
| Metric | What to check |
|---|---|
| GeneralQuality Fail | Which of relevance/completeness/groundedness/abstention failed |
| ExactMatch Fail | Score 0.0–1.0 |
| CapabilityUse Fail | `missingInvocationSteps` |
| Error status | `errorReason`: often a test set config issue, not a YAML issue |
For YAML authoring failures: find the relevant topic, read it, propose specific edits. Wait for user approval before applying.
After applying: offer to push and re-run (go back to Step 3).