Help us improve
Share bugs, ideas, or general feedback.
From workflows-mcp-server
Audits MCP servers for security gaps across eight axes including injection vectors, blast radius, auth shape, input sinks, tenant isolation, and HTTP deployment surface. Use before releases or after handler changes.
npx claudepluginhub cyanheads/cyanheads --plugin workflows-mcp-serverHow this skill is triggered — by the user, by Claude, or both
Slash command
/workflows-mcp-server:security-passThe summary Claude sees in its skill listing — used to decide when to auto-load this skill
An MCP server is a new attack surface with unique properties — tool output feeds back into the LLM's context, scopes gate what the model can do on the user's behalf, and per-request state must stay tenant-scoped. This skill walks a server through eight axes shaped around what the server builder actually controls. Framework-level concerns (transport, JSON-RPC parsing, auto-correlation, error cla...
Audits MCP servers for security gaps across eight axes including injection vectors, blast radius, auth shape, input sinks, tenant isolation, and HTTP deployment surface. Use before releases or after handler changes.
Audits MCP tool handlers for malicious input, hardcoded secrets, and unrestricted file/shell access. Invoke when building or reviewing MCP server definitions and tool schemas.
Provides MCP architecture patterns including client-host-server model, transports, resources, and tools with FastMCP examples in Python and TypeScript. Useful for building MCP servers and implementing tools.
Share bugs, ideas, or general feedback.
An MCP server is a new attack surface with unique properties — tool output feeds back into the LLM's context, scopes gate what the model can do on the user's behalf, and per-request state must stay tenant-scoped. This skill walks a server through eight axes shaped around what the server builder actually controls. Framework-level concerns (transport, JSON-RPC parsing, auto-correlation, error classification) are out of scope — mcp-ts-core handles those.
Read the code. Don't trust patterns from memory.
Gather before starting. Ask if unclear:
Surface what you're auditing before diving in. Paths below assume the mcp-ts-core layout — adjust to your repo.
find src/mcp-server/tools/definitions -name "*.tool.ts" | sort
find src/mcp-server/resources/definitions -name "*.resource.ts" 2>/dev/null | sort
find src/mcp-server/prompts/definitions -name "*.prompt.ts" 2>/dev/null | sort
find src/services -maxdepth 1 -mindepth 1 -type d | sort
Note: tool / resource / prompt counts, auth mode, storage provider, upstream APIs, which tools have destructiveHint, which handlers use ctx.sample or ctx.elicit, which services hold module-scope state, whether the server reads roots.
If transport is streamable HTTP or SSE, also capture:
127.0.0.1 for local, or 0.0.0.0 / public interface?)/healthz, /sse, metadata endpoints) — do they leak tool lists or tenant hints?aud) checked, resource indicators usedIf CANVAS_PROVIDER_TYPE=duckdb is set, also capture:
MCP_AUTH_MODE=none collapses the composite (tenantId, canvasId) scope to ('default', canvasId), where the ID is the only differentiatorCANVAS_MAX_CANVASES_PER_TENANT, CANVAS_TTL_MS, CANVAS_ABSOLUTE_CAP_MS, CANVAS_EXPORT_PATH valuesUse TaskCreate — one task per axis. Mark complete as you go.
Run fuzzTool in parallel. @cyanheads/mcp-ts-core/testing/fuzz catches crashes, memory leaks, and prototype pollution automatically on each tool — start it now so results are ready when you reach Axis 5.
Anything the server sends to the client that reaches the LLM's context is a potential injection surface: tool output, resource content, prompt text, and the metadata the LLM reads to decide what to call. Relayed upstream content (tickets, scraped text, emails, DB rows) can carry adversarial instructions even when your code is honest.
Look in:
*.tool.ts — output schema + format()*.resource.ts — content returned from resources/read*.prompt.ts — templated message contentdescription, title, annotations, and inputSchema field descriptions (templated from untrusted data?)Check:
format() wrap untrusted content in delimiters (blockquote, fenced code, <data> tags)?resources/read) framed the same way tool output is?Smell: return { body: await fetch(url).then(r => r.text()) } rendered directly in format(). Or: description: \Look up ${tenant.customLabel}`wherecustomLabel` is tenant-supplied.
Every auth: [...] entry is a blast-radius dial.
Look in: every *.tool.ts — auth: array.
grep -rn "auth: \[" src/mcp-server/tools/definitions/
Check:
['admin'], ['*'], or []?MCP_AUTH_DISABLE_SCOPE_CHECKS=true set in production? When on, both withRequiredScopes and checkScopes early-return — every authenticated user gets every tool, and runtime tenant patterns like team:${input.teamId}:write no longer guard. Acceptable only when paired with a real server-side ACL (path filter, allowlist, upstream API enforcement).Smell: every tool shares the same scope string. Or: MCP_AUTH_DISABLE_SCOPE_CHECKS=true set without a documented compensating ACL — confirm the deployment relies on a meaningful access control layer below the framework before approving.
ctx.elicit moves consent off the LLM and onto the user. Destructive tools without it trust the LLM not to be tricked.
Look in: handlers with destructiveHint: true or side-effecting verbs in names (delete_*, send_*, pay_*, publish_*, drop_*).
grep -rn "destructiveHint" src/mcp-server/tools/definitions/
grep -rn "ctx.elicit" src/mcp-server/tools/definitions/
Check:
ctx.elicit before the side effect?Smell: destructiveHint: true file with no ctx.elicit?.(...) in it. Or: const { confirmed } = await ctx.elicit(...) without a schema — confirmed could be anything.
What credentials the server holds, and the blast radius if one leaks.
Look in: src/services/*, src/config/server-config.ts.
Check:
aud, or passthrough the caller's?Smell: one global API_KEY used across all tenants + retry loop with no upper bound.
LLM-supplied inputs feel internal but aren't. Classic sinks apply, amplified. Sampling responses and roots-derived paths are MCP-specific sinks that look internal but carry LLM/client trust.
Look in: all handlers.
# URL sinks — SSRF
grep -rn "z.string().url()" src/
# Path sinks — traversal
grep -rn "readFile\|writeFile\|readdirSync\|createReadStream\|statSync" src/
# Shell sinks — command injection
grep -rnE "\b(exec|spawn|execSync|spawnSync)\b" src/
# Merges — prototype pollution
grep -rn "Object.assign\b\|structuredClone" src/
# Sampling — LLM-generated content flowing back into server logic
grep -rn "ctx.sample\|sampling/createMessage" src/
# Roots — client-shared filesystem
grep -rn "roots/list\|ctx.roots" src/
# Schema laxity — fields sneaking past validation
grep -rn "\.passthrough()\|\.catchall(" src/mcp-server/
Check:
file://, ftp://, localhost, DNS rebind?path.resolve + assert startsWith(root + sep))?__proto__, constructor, prototype keys?.strict() — unknown fields rejected, not silently passed to downstream code that destructures with ...rest?.passthrough() / .catchall() — no accidental exfiltration of fields your schema didn't declare?ctx.sample result) treated as untrusted input — schema-validated before reaching any other sink, never concatenated into prompts, shells, or queries?Smell: z.string().url() with no allowlist; readFile(input.path) with no canonicalization; await ctx.sample(...) result interpolated into a shell, SQL, or URL.
ctx.state is tenant-scoped. Module-scope state is not.
Look in: src/services/*.
grep -rnE "^(const|let) .* = new (Map|Set|WeakMap|Array)" src/services/
grep -rn "^let " src/services/
Check:
Map / Set / cache near tenant-handling code?logger while carrying per-tenant data (bypassing auto-correlated ctx.log)?Smell: service file with top-level const cache = new Map().
What accidentally reaches the LLM, user, or observability sinks.
Look in: throw new McpError(...) and ctx.fail(reason, msg, data) sites, error factory calls (notFound, httpErrorFromResponse, …), McpError.data fields (the data arg flows through both paths), output schemas, and every logging / telemetry surface — not just ctx.log.
grep -rnE "new McpError|ctx\.fail\(|httpErrorFromResponse\(" src/
grep -rnE "\b(ctx\.log|console\.(log|info|warn|error|debug)|logger\.)" src/
grep -rnE "(Sentry\.|captureException|setTag|setContext|addBreadcrumb)" src/
grep -rnE "(setAttribute|setAttributes|span\.)" src/ # OpenTelemetry
Check:
data fields (whether passed via ctx.fail(reason, msg, data), new McpError(code, msg, data), or factory calls) carry upstream response bodies, auth headers, stack traces?httpErrorFromResponse body capture sweeping in too much (default 500-byte cap is fine for most APIs but consider captureBody: false when the upstream returns auth-bearing payloads)?format() renders fields that shouldn't leave the server?ctx.log.info(msg, body) where body is the raw request (may contain secrets)?console.* calls near auth / token / request-body handling — bypasses structured redaction?=== or == instead of constant-time (timingSafeEqual / crypto.timingSafeEqual) — leaks length and prefix via timing?Smell: throw new McpError(code, upstream.message, { raw: upstream.body }) or throw ctx.fail('upstream_failed', e.message, { raw: e.response.body }). Or: if (apiKey === expected) on a request-auth path.
Unbounded = DoS of self, upstream, or the LLM's context window (billing-DoS is real).
Look in: handlers with loops, pagination, retries, or inputs that feed JSON.parse / schema validation.
grep -rnE "while\s*\(|for\s*\(.*of" src/mcp-server/tools/definitions/
grep -rn "cursor\|nextPage\|paginate" src/
grep -rn "JSON.parse\b" src/
Check:
0, null)?JSON.parse / Zod .parse() inputs have a size + nesting-depth limit applied before parse?delete_record 10k/sec hits you before it hits upstream)?Smell: while (cursor) { results.push(...); cursor = next; } with no max count. Or: JSON.parse(await req.text()) with no Content-Length check upstream.
CANVAS_PROVIDER_TYPE=duckdb)DataCanvas is opt-in and deliberately trades isolation for cross-agent token-shareable working sets — designed for public-data tabular servers (BrAPI, OpenAlex, etc.) where session-pinning isn't desired. The trade only holds when the deployment matches that assumption. Skip this axis entirely when canvas is disabled (CANVAS_PROVIDER_TYPE=none, the default).
Look in: src/config/server-config.ts, every tool reading ctx.core.canvas?, deployment config (wrangler / Dockerfile / proxy).
Check:
(tenantId, canvasId) scope collapses to ('default', canvasId) in MCP_AUTH_MODE=none — anyone with the canvasId attaches.CANVAS_MAX_CANVASES_PER_TENANT sized for the memory budget — default 100 is the floor; raising it lets a single tenant exhaust memory faster.CANVAS_TTL_MS / CANVAS_ABSOLUTE_CAP_MS not absurdly long. Defaults (24 h sliding / 7 d absolute) are reasonable; longer widens the window an unreferenced canvasId stays guessable.CANVAS_EXPORT_PATH doesn't point into a shared mount, the repo, or a directory another service serves from. The path-sandbox blocks .. traversal but doesn't prevent the configured root from being a bad choice.assertReadOnlyQuery), and Axis 7 (errors from canvas operations don't leak the failed SQL string back through McpError.data) all apply.Smell: MCP_AUTH_MODE=none deployment registering per-user data (recent activity, account state, cart contents) onto a canvas. Or: CANVAS_EXPORT_PATH=/srv/static with a static file server pointing at the same root.
Fast, sometimes high-leverage. Outside the eight axes.
bun audit — any direct high/critical?package.json — postinstall / lifecycle scripts on added deps?npm view <pkg> --json | jq .dist.attestations — missing attestation on a security-critical dep is a yellow flag.env.example — placeholder values only, never real?ConfigSchema — fails loudly on missing required keys (not silent defaults)?process.env.* reads outside the config parser (bypasses validation)?fuzzTool results from Step 1 — triage crashes / leaks as Axis 5 / Axis 8 findings.Three sections. Summary → findings → numbered options.
Definitions reviewed, axes covered, count by severity, the single most important finding.
Group by severity. Each 3–5 lines.
| Severity | Meaning |
|---|---|
| critical | Exploitable now: auth bypass, exfiltration, arbitrary code/file/network access |
| high | Structural gap with clear attacker benefit even without immediate PoC (destructive op without elicit, admin scope on read tool, SSRF-capable URL input) |
| medium | Defense-in-depth gap weakening a boundary (missing per-tenant rate limit, error carries upstream response) |
| low | Hardening / polish (tighter output schema, narrower error data, minor comment) |
Format:
**<file_or_tool> — Axis <N> — <critical|high|medium|low>**
Issue: <one line: what's wrong>
Impact: <one line: what can go wrong>
Fix: <one line: the change>
Numbered, cherry-pickable.
1. Add SSRF guard to `fetch_url.tool.ts` — block private IPs + non-http schemes (critical, #1)
2. Gate `delete_record.tool.ts` behind `ctx.elicit` (high, #3)
3. Split `admin` into `record:read` + `record:write` across 4 tools (high, #4)
4. Move `const tokenCache = new Map()` out of module scope in `auth-service.ts` (medium, #7)
5. Cap pagination loop in `list_all_tickets` at 1000 items (medium, #9)
6. Strip upstream response body from `McpError.data` in `sync-service.ts` (low, #11)
End with:
Pick by number (e.g. "do 1, 3, 5" or "expand on 2").
fuzzTool started in parallelctx.log / console.* / telemetry / constant-time comparisonsCANVAS_PROVIDER_TYPE=duckdb: Axis 9 — public-data assumption holds, external rate limiting in place, max-canvases-per-tenant + TTLs sized for the deployment, CANVAS_EXPORT_PATH doesn't escape into shared / served paths, assertReadOnlyQuery is the only SQL pathbun audit, lifecycle scripts, .env.example, config validation, new-dep provenance