Guide for creating high-quality MCP (Model Context Protocol) servers that enable LLMs to interact with external services through well-designed tools. Use when building MCP servers to integrate external APIs or services, whether in Python (FastMCP) or Node/TypeScript (MCP SDK).
Install:

```shell
npx claudepluginhub damionrashford/mlx --plugin mlx
```
Create MCP (Model Context Protocol) servers that enable LLMs to interact with external services through well-designed tools. The quality of an MCP server is measured by how well it enables LLMs to accomplish real-world tasks.
Creating a high-quality MCP server involves four main phases:
API Coverage vs. Workflow Tools: Balance comprehensive API endpoint coverage with specialized workflow tools. Workflow tools can be more convenient for specific tasks, while comprehensive coverage gives agents flexibility to compose operations. Performance varies by client: some clients benefit from code execution that combines basic tools, while others work better with higher-level workflows. When uncertain, prioritize comprehensive API coverage.
Tool Naming and Discoverability:
Clear, descriptive tool names help agents find the right tools quickly. Use consistent prefixes (e.g., github_create_issue, github_list_repos) and action-oriented naming.
Context Management: Agents benefit from concise tool descriptions and the ability to filter/paginate results. Design tools that return focused, relevant data. Some clients support code execution which can help agents filter and process data efficiently.
Actionable Error Messages: Error messages should guide agents toward solutions with specific suggestions and next steps.
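As a sketch of this principle (the tool name `github_list_repos` and the repository data here are hypothetical), an actionable error names the failure, the tool call that resolves it, and the retry step:

```python
def get_issue(repo: str, issue_number: int, known_repos: set[str]) -> dict:
    """Illustrative stub: the error text states the problem and the recovery path."""
    if repo not in known_repos:
        # Instead of a bare "not found", tell the agent exactly what to do next.
        raise ValueError(
            f"Repository '{repo}' not found. "
            "Call github_list_repos to see available repositories, "
            "then retry with an exact name."
        )
    return {"repo": repo, "number": issue_number}
```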
Navigate the MCP specification:
Start with the sitemap to find relevant pages: https://modelcontextprotocol.io/sitemap.xml
Then fetch specific pages with .md suffix for markdown format (e.g., https://modelcontextprotocol.io/specification/draft.md).
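The sitemap-then-markdown step can be sketched as a small offline parser (the sample sitemap below is illustrative; the namespace is the standard sitemaps.org schema):

```python
import xml.etree.ElementTree as ET

def spec_markdown_urls(sitemap_xml: str) -> list[str]:
    """Extract page URLs from a sitemap and append the .md suffix."""
    ns = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}
    root = ET.fromstring(sitemap_xml)
    return [loc.text + ".md" for loc in root.iterfind(".//sm:loc", ns)]

sample = """<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://modelcontextprotocol.io/specification/draft</loc></url>
</urlset>"""
print(spec_markdown_urls(sample))
# -> ['https://modelcontextprotocol.io/specification/draft.md']
```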
Key pages to review:
Recommended stack:
Load framework documentation:
For TypeScript (recommended):
https://raw.githubusercontent.com/modelcontextprotocol/typescript-sdk/main/README.md

For Python:
https://raw.githubusercontent.com/modelcontextprotocol/python-sdk/main/README.md

Understand the API: Review the service's API documentation to identify key endpoints, authentication requirements, and data models. Use web search and WebFetch as needed.
Tool Selection: Prioritize comprehensive API coverage. List endpoints to implement, starting with the most common operations.
See language-specific guides for project setup:
Create shared utilities:
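One such shared utility (the status codes and wording here are a hypothetical sketch) is a single helper that turns HTTP failures into agent-actionable messages, so every tool reports errors consistently:

```python
def agent_error(path: str, status: int) -> str:
    """Map an HTTP status code to a message that tells the agent what to do next."""
    hints = {
        401: "token may be missing or expired; re-authenticate",
        404: f"resource at {path} not found; call a list tool first to find valid IDs",
        429: "rate limited; wait briefly and retry",
    }
    # Fall back to a generic hint for unmapped status codes.
    hint = hints.get(status, "check the request parameters")
    return f"Request to {path} failed with HTTP {status}: {hint}"
```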
For each tool:
Input Schema:
Output Schema:
- Define `outputSchema` where possible for structured data
- Return `structuredContent` in tool responses (TypeScript SDK feature)

Tool Description:
Implementation:
Annotations:
- `readOnlyHint`: true/false
- `destructiveHint`: true/false
- `idempotentHint`: true/false
- `openWorldHint`: true/false

Review for:
TypeScript:
- Run `npm run build` to verify compilation
- Inspect interactively with `npx @modelcontextprotocol/inspector`

Python:

- Run `python -m py_compile your_server.py`

See language-specific guides for detailed testing approaches and quality checklists.
After implementing your MCP server, create comprehensive evaluations to test its effectiveness.
Load the Evaluation Guide for complete evaluation guidelines.
Use evaluations to test whether LLMs can effectively use your MCP server to answer realistic, complex questions.
To create effective evaluations, follow the process outlined in the evaluation guide:
Ensure each question is realistic, complex, and has a single verifiable answer.
Create an XML file with this structure:
```xml
<evaluation>
  <qa_pair>
    <question>Find discussions about AI model launches with animal codenames. One model needed a specific safety designation that uses the format ASL-X. What number X was being determined for the model named after a spotted wild cat?</question>
    <answer>3</answer>
  </qa_pair>
  <!-- More qa_pairs... -->
</evaluation>
```
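A minimal harness sketch for an evaluation file in this format (the runner that actually queries the LLM is out of scope here) just loads the question/answer pairs:

```python
import xml.etree.ElementTree as ET

def load_qa_pairs(xml_text: str) -> list[tuple[str, str]]:
    """Parse <evaluation> XML into (question, answer) pairs."""
    root = ET.fromstring(xml_text)
    return [
        (pair.findtext("question"), pair.findtext("answer"))
        for pair in root.iterfind("qa_pair")
    ]

sample = """<evaluation>
  <qa_pair>
    <question>What number X was being determined?</question>
    <answer>3</answer>
  </qa_pair>
</evaluation>"""
```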
When building an MCP server, the tool API is a contract with LLM agents, not with human developers. Agent cognition differs from human cognition in ways that require different design principles.
If a human engineer can't immediately decide which tool to use in a given situation, an agent can't either.
This is the single most important principle. Tool overlap creates decision paralysis in agents. Signs of over-fragmentation:
Practical ceiling: 10-20 tools per server. Beyond 20, agents make statistically more tool selection errors.
Every tool description must answer four questions. If any are missing, agents will misuse the tool:
```python
from mcp.server.fastmcp import FastMCP

mcp = FastMCP("DataStore")

@mcp.tool()
def search_documents(
    query: str,
    collection: str = "default",
    limit: int = 10,
) -> list[dict]:
    """
    [WHAT] Performs semantic similarity search over indexed documents.
    [WHEN] Use when you need to find documents by meaning or concept,
    not exact text match. Use grep_documents for exact string search.
    [RETURNS] List of {id, content, score, metadata} dicts ordered by
    relevance. Empty list if no matches above threshold 0.7.
    [ERRORS] Raises CollectionNotFoundError if collection doesn't exist.
    Use list_collections first if unsure.
    """
```
The most counter-intuitive principle: sometimes fewer, more primitive tools outperform elaborate specialized tools. Reasons:
Ask before adding a specialized tool: does this enable new capabilities, or does it just constrain reasoning the model could handle with bash/filesystem access?
Always use fully-qualified names in MCP: ServerName:tool_name. Without the server prefix, agents fail when multiple MCP servers are active.
```python
# Correct (fully qualified)
server = FastMCP("DataStore")  # tools exposed as DataStore:search, DataStore:write

# Correct tool naming pattern
# Verb + noun: search_documents, write_record, delete_entry, list_collections
# NOT: search, get, fetch (too generic; causes collision with other servers)
```
| Anti-pattern | Problem | Fix |
|---|---|---|
| Overlapping tools | Agent can't decide which to use | Consolidate; make difference explicit in description |
| Missing return format | Agent can't parse output | Specify exact schema in description |
| Vague names (get, fetch, process) | Namespace collision across servers | Use ServerName:get_user_record |
| No error documentation | Agent loops on error | Document all exception types and recovery path |
| Too many tools (>20) | Selection error rate spikes | Group into sub-servers; each ≤15 tools |
Load these resources as needed during development:
- MCP specification sitemap: https://modelcontextprotocol.io/sitemap.xml, then fetch specific pages with the `.md` suffix
- Python SDK README: https://raw.githubusercontent.com/modelcontextprotocol/python-sdk/main/README.md
- TypeScript SDK README: https://raw.githubusercontent.com/modelcontextprotocol/typescript-sdk/main/README.md
- Python Implementation Guide - Complete Python/FastMCP guide with `@mcp.tool` patterns
- TypeScript Implementation Guide - Complete TypeScript guide with `server.registerTool` patterns