From copilot-studio
Creates CSV test sets for Copilot Studio agent evaluation by analyzing YAML files for topics, instructions, knowledge sources, and generating test cases with expected responses.
`npx claudepluginhub microsoft/skills-for-copilot-studio --plugin copilot-studio`

This skill is limited to using the following tools: Glob, Read, Write.
Create a test set CSV file that can be imported into Copilot Studio's **Evaluate** tab for in-product agent evaluation.
Read the agent's YAML files to understand what it does:
- Glob `**/agent.mcs.yml` — find the agent
- `agent.mcs.yml` — get the agent's instructions, description, and capabilities
- `settings.mcs.yml` — check orchestration mode (generative vs classic)
- Glob `**/topics/*.mcs.yml` — list all topics

Create test cases that cover:
| Category | What to test | Example |
|---|---|---|
| Core functionality | Main topics and capabilities | Questions matching trigger phrases |
| Knowledge/generative | Knowledge source responses | Questions the agent should answer from its knowledge |
| System topics | Greeting, Escalation, Goodbye, Thank You, Fallback | "Hi", "I want to speak to a person", "Goodbye" |
| Edge cases | Out-of-scope, ambiguous, off-topic | "Tell me a joke", "Book a flight for me" |
| Boundary testing | Things the agent should NOT do | Actions beyond its capabilities |
Aim for 10–25 test cases with good coverage across categories.
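The YAML discovery step above can be sketched in Python using the same Glob patterns; `scan_agent` and the root path are illustrative names, not part of the skill itself:

```python
from pathlib import Path

def scan_agent(root: str) -> dict:
    """Collect the YAML files the skill reads: agent definition,
    settings, and topics (patterns taken from the skill's Glob step)."""
    root_path = Path(root)
    return {
        "agent": list(root_path.rglob("agent.mcs.yml")),
        "settings": list(root_path.rglob("settings.mcs.yml")),
        "topics": sorted(root_path.rglob("topics/*.mcs.yml")),
    }
```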
The CSV import only supports two columns: `question` and `expectedResponse`. Test methods cannot be set via CSV import — they are configured in the UI after import. The default test method (General quality) is applied to all imported test cases.
Write expected responses with this in mind:
- Leave `expectedResponse` empty for questions that only need General quality (it works without expected responses)

| Test method | What it measures | Requires expected response? |
|---|---|---|
| General quality (default) | AI-graded quality: relevance, completeness, groundedness, abstention | No (but recommended as a rubric) |
| Compare meaning | Semantic similarity — compares meaning/intent | Yes |
| Text similarity | Cosine similarity of text | Yes, configurable pass threshold |
| Exact match | Character-for-character match | Yes |
| Keyword match | Response contains expected keywords/phrases | Yes (keywords added in UI) |
| Capability use | Agent called expected tools/topics | Configured in UI |
| Custom | Custom grader with your own instructions and labels | Configured in UI |
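For intuition on the Text similarity row, here is an illustrative bag-of-words cosine similarity in Python. Copilot Studio's actual scoring is not specified in this document, so treat this only as a sketch of the underlying metric:

```python
from collections import Counter
import math

def cosine_similarity(a: str, b: str) -> float:
    """Bag-of-words cosine similarity between two texts.
    Returns 1.0 for identical word distributions, 0.0 for no overlap."""
    va, vb = Counter(a.lower().split()), Counter(b.lower().split())
    dot = sum(va[w] * vb[w] for w in set(va) & set(vb))
    norm = (math.sqrt(sum(c * c for c in va.values()))
            * math.sqrt(sum(c * c for c in vb.values())))
    return dot / norm if norm else 0.0
```

A configurable pass threshold would then be a simple comparison, e.g. `cosine_similarity(actual, expected) >= 0.8`.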
Write the CSV file using the Write tool. The format must be:

```csv
"question","expectedResponse"
"User question here","Expected agent response or behavioral rubric"
"Question without expected response",
```
| Column | Required | Description |
|---|---|---|
| `question` | Yes | The user message to send to the agent. Max 1,000 characters. |
| `expectedResponse` | No | The expected response or behavioral rubric. Leave empty if not needed. |
Important: The Testing method column is not supported on import — it is ignored. All imported test cases get the default test method (General quality). Configure other test methods in the UI after import.
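If you generate the file outside the Write tool, Python's `csv` module produces a compatible quoted two-column layout. The rows below are hypothetical samples; `test-set.csv` is an illustrative file name:

```python
import csv

# Hypothetical test cases; column names match the import format above.
rows = [
    ("Hi there", "Hello! How can I help you today?"),
    ("Find me a hotel in Paris",
     "The response should include hotel recommendations in Paris."),
    ("Question without expected response", ""),  # empty expectedResponse is allowed
]

with open("test-set.csv", "w", newline="", encoding="utf-8") as f:
    writer = csv.writer(f, quoting=csv.QUOTE_ALL)  # quote every field
    writer.writerow(["question", "expectedResponse"])
    writer.writerows(rows)
```

`QUOTE_ALL` also quotes the empty `expectedResponse` field as `""`; leaving it unquoted, as in the format block above, is equally valid CSV.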
Examples (.csv format):

Behavioral rubric (for General quality):
"Find me a hotel in Paris","The response should include hotel recommendations in Paris with relevant details like names, locations, or prices."
Realistic reply (for Compare meaning — set method in UI after import):
"Hi there","Hello! How can I help you today?"
Exact expected text (for Exact match — set method in UI after import):
"What is 2+2?","4"
After writing the CSV, tell the user:
To import into Copilot Studio:
- Open your agent in Copilot Studio
- Go to the Evaluate tab
- Click New evaluation > Single response
- Drag or browse for the CSV file
- Review the imported test cases and adjust if needed
- Optionally add more test methods (Capability use, Custom) in the UI
- Click Evaluate to run, or Save to run later
After import, some things can only be configured in the UI:
- The test method for each case (everything imports as General quality)
- The pass threshold for Text similarity
- Keywords for Keyword match
- Expected tools/topics for Capability use
- Grader instructions and labels for Custom