Skill

mcp-testing

Load MCP eval testing patterns using @umbraco-cms/mcp-server-sdk/evals. Use when writing LLM-based acceptance tests for MCP tools.

npx claudepluginhub umbraco/umbraco-mcp-base --plugin umbraco-mcp-skills

Tool Access

This skill uses the workspace's default tool permissions.

Stats

Parent Repo Stars: 2

Parent Repo Forks: 1

Last Commit: Feb 19, 2026

MCP Eval Testing Patterns

This skill loads eval testing patterns for MCP tools using `@umbraco-cms/mcp-server-sdk/evals`. Eval tests verify tools work correctly when driven by an LLM agent.

For integration tests, use `/build-tools-tests` instead.

When to Use

Use this skill when:

Creating eval/acceptance tests for MCP tools
Verifying tools work correctly in LLM-driven workflows
Debugging eval test failures

Setup

Eval tests live in `tests/evals/` with a dedicated Jest config. The setup file is loaded automatically through `setupFilesAfterEnv`, so test files do NOT need to import it.

`tests/evals/helpers/e2e-setup.ts`

import path from "path";
import { configureEvals, ClaudeModels } from "@umbraco-cms/mcp-server-sdk/evals";

configureEvals({
  mcpServerPath: path.resolve(process.cwd(), "dist/index.js"),
  mcpServerName: "my-mcp-server",
  serverEnv: { USE_MOCK_API: "true" },
  defaultModel: ClaudeModels.Haiku,
  defaultMaxTurns: 10,
  defaultMaxBudgetUsd: 0.25,
  defaultTimeoutMs: 60000,
});

`tests/evals/jest.config.ts`

import type { Config } from "jest";

const config: Config = {
  preset: "ts-jest/presets/js-with-ts-esm",
  testEnvironment: "node",
  extensionsToTreatAsEsm: [".ts"],
  rootDir: "../..",
  testMatch: ["<rootDir>/tests/evals/**/*.test.ts"],
  setupFilesAfterEnv: ["<rootDir>/tests/evals/helpers/e2e-setup.ts"],
  maxConcurrency: 1,
  maxWorkers: 1,
  testTimeout: 120000,
};

export default config;

Test Pattern

// tests/evals/entity-crud.test.ts
import { describe, it } from "@jest/globals";
import {
  runScenarioTest,
  setupConsoleMock,
  getDefaultTimeoutMs,
} from "@umbraco-cms/mcp-server-sdk/evals";

describe("entity evals", () => {
  setupConsoleMock();

  it(
    "should complete workflow",
    runScenarioTest({
      prompt: `Complete these tasks:
1. Create an item named "Test"
2. Delete the item
3. Say "Workflow completed"`,
      tools: ["create-item", "delete-item"],
      requiredTools: ["create-item", "delete-item"],
      successPattern: "Workflow completed",
    }),
    getDefaultTimeoutMs()
  );
});

Key Concepts

runScenarioTest Options

Option	Purpose
`prompt`	Step-by-step instructions for the LLM
`tools`	Tools available to the LLM agent
`requiredTools`	Tools that must be called for the test to pass
`successPattern`	String the LLM must output to indicate success
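The four options depend on each other: a tool in `requiredTools` can only be called if it is also in `tools`, and the agent can only produce the success phrase if the prompt asks for it. A minimal sketch of a pre-flight check follows. `ScenarioOptions` and `validateScenario` are hypothetical names used for illustration, not part of the SDK, and the sketch assumes `successPattern` is a plain string as the table describes.

```typescript
// Hypothetical local type mirroring the four options in the table above;
// the SDK's real type may differ.
interface ScenarioOptions {
  prompt: string;
  tools: string[];
  requiredTools: string[];
  successPattern: string;
}

// Returns human-readable problems; an empty array means the options look consistent.
function validateScenario(options: ScenarioOptions): string[] {
  const problems: string[] = [];
  const available = new Set(options.tools);

  // A required tool the agent cannot see can never be called.
  for (const tool of options.requiredTools) {
    if (!available.has(tool)) {
      problems.push(`required tool "${tool}" is not listed in tools`);
    }
  }

  // The agent only knows the success phrase if the prompt spells it out.
  if (!options.prompt.includes(options.successPattern)) {
    problems.push(`prompt never mentions the success phrase "${options.successPattern}"`);
  }

  return problems;
}
```

Running a check like this before handing the options to `runScenarioTest` would surface configuration mistakes without spending any LLM budget.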

Writing Good Prompts

Use numbered step-by-step instructions
Be explicit about what to do with results ("get details for the first one")
Use unique identifiers with timestamps to avoid collisions
End with a specific success phrase ("Say 'Workflow completed'")
Search for IDs dynamically — don't hardcode them
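The guidelines above can be applied mechanically. The following sketch assembles a numbered prompt with a timestamped name and a closing success phrase. `uniqueName` and `buildScenarioPrompt` are hypothetical helpers written for this example, not SDK functions.

```typescript
// Hypothetical helper: appends a timestamp so repeated runs do not collide.
function uniqueName(base: string, now: number = Date.now()): string {
  return `${base} ${now}`;
}

// Hypothetical helper: numbers the steps and ends with an explicit success phrase.
function buildScenarioPrompt(steps: string[], successPhrase: string): string {
  const allSteps = [...steps, `Say "${successPhrase}"`];
  const numbered = allSteps.map((step, index) => `${index + 1}. ${step}`);
  return ["Complete these tasks:", ...numbered].join("\n");
}

const formName = uniqueName("Test Form");
const crudPrompt = buildScenarioPrompt(
  [
    `Create a form named "${formName}"`,
    "List all forms and find the one you just created",
    "Delete that form using the ID from the list result",
  ],
  "CRUD workflow completed"
);
```

The resulting string could be passed as `prompt`, with the same phrase reused as `successPattern` so the two cannot drift apart.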

Grouping Related Tools

Group tools that work together in a single eval test to verify the workflow:

it(
  "should create, list, and delete",
  runScenarioTest({
    prompt: `Complete these tasks:
1. Create a form named "Test Form ${Date.now()}"
2. List all forms and confirm the new one appears
3. Delete the form you created
4. Say "CRUD workflow completed"`,
    tools: ["create-form", "list-forms", "delete-form"],
    requiredTools: ["create-form", "list-forms", "delete-form"],
    successPattern: "CRUD workflow completed",
  }),
  getDefaultTimeoutMs()
);
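When several entities share the same create, list, and delete shape, the grouping can be generated rather than retyped. This is a sketch only: `crudScenario` is a hypothetical factory, and the `create-x` / `list-xs` / `delete-x` naming is assumed from the form example above and may not match your server's tool names.

```typescript
// Hypothetical factory that builds the options object for a CRUD workflow.
function crudScenario(singular: string, plural: string, now: number = Date.now()) {
  // Tool names are derived from the entity name (an assumed naming convention).
  const tools = [`create-${singular}`, `list-${plural}`, `delete-${singular}`];
  const successPattern = "CRUD workflow completed";
  const prompt = [
    "Complete these tasks:",
    `1. Create a ${singular} named "Test ${singular} ${now}"`,
    `2. List all ${plural} and confirm the new one appears`,
    `3. Delete the ${singular} you created`,
    `4. Say "${successPattern}"`,
  ].join("\n");

  // Every tool in the workflow is both available and required.
  return { prompt, tools, requiredTools: [...tools], successPattern };
}

const formScenario = crudScenario("form", "forms");
```

Deriving `requiredTools` from `tools` keeps the two lists identical, which is the point of grouping: the test passes only if the whole workflow ran.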

Running Eval Tests

# Build first (evals run against dist/)
npm run build

# Run all evals
npm run test:evals

# Run specific eval file
npm run test:evals -- --testPathPattern="entity"

# Verbose mode shows full LLM conversation
E2E_VERBOSITY=verbose npm run test:evals

Best Practices

Always build before running evals (`npm run build`), because evals run against `dist/`
Use unique identifiers (timestamps) to avoid test data collisions
Clear step-by-step prompts work better than vague instructions
Search for IDs dynamically — don't assume IDs exist
Enable verbose mode during development to see the full conversation
Keep `maxBudgetUsd` low so that inefficient tool usage fails the test instead of passing unnoticed

Debugging

# Verbose mode shows full conversation
E2E_VERBOSITY=verbose npm run test:evals

# Run specific eval file
npm run test:evals -- --testPathPattern="entity"

Common Issues

Issue	Solution
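Eval timeout	Simplify the prompt, or raise `defaultTimeoutMs` (and Jest's `testTimeout`) if the workflow genuinely needs longer; increase `maxTurns` if the agent runs out of turns first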
Wrong tool selected	Improve tool description clarity
Missing parameters	Add examples to tool descriptions
Tool not found	Check tool name matches exactly
Budget exceeded	Simplify workflow or increase `maxBudgetUsd`
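For the "Tool not found" case, a name that differs only by case, underscores, or spacing is easy to miss by eye. The sketch below compares the names used in a test against a list of registered names and suggests the likely intended one. `findUnknownTools` and `normalizeToolName` are hypothetical helpers, and the registered list is assumed to come from wherever your server defines its tools.

```typescript
// Hypothetical helper: folds case, underscores, and whitespace into a canonical form.
function normalizeToolName(name: string): string {
  return name.toLowerCase().replace(/[\s_]+/g, "-");
}

// Hypothetical helper: reports requested tool names that are not registered exactly.
function findUnknownTools(requested: string[], registered: string[]): string[] {
  const exact = new Set(registered);
  const byNormalized = new Map<string, string>();
  for (const tool of registered) {
    byNormalized.set(normalizeToolName(tool), tool);
  }

  const messages: string[] = [];
  for (const tool of requested) {
    if (exact.has(tool)) continue; // exact match, nothing to report
    const suggestion = byNormalized.get(normalizeToolName(tool));
    messages.push(
      suggestion
        ? `"${tool}" is not registered; did you mean "${suggestion}"?`
        : `"${tool}" is not registered`
    );
  }
  return messages;
}

const toolReport = findUnknownTools(
  ["create_item", "delete-item", "purge"],
  ["create-item", "delete-item"]
);
```

Here `toolReport` would flag `create_item` with a suggestion of `create-item` and report `purge` as unknown, while `delete-item` passes silently.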