Skill

mcp-config-poisoning

Tests AI IDEs for MCP configuration poisoning vulnerabilities, assessing auto-loading of untrusted workspace configs and tool approval controls across four interaction tiers.

security

testing

Invocation

How this skill is triggered — by the user, by Claude, or both

Slash command

/ai-ide-vuln-skills:mcp-config-poisoning

User invocable

Model invocable

Inline context

Default effort

Context Preview

The summary Claude sees in its skill listing — used to decide when to auto-load this skill

MCP config poisoning occurs when an AI IDE loads MCP server definitions from workspace-local config files that an attacker can control. Because MCP servers execute as child processes with the IDE's full permissions, a malicious server definition is functionally equivalent to arbitrary code execution. The attack surface exists wherever an IDE reads MCP configuration from the workspace directory ...

Supporting Files

references/config-formats.md
references/known-vulns.md
references/payload-templates.md

SKILL.md

277 lines · ~5.2k tokens (exceeds 5k compaction limit)

Stats

Stars: 60

Forks: 8

Maintenance: Good

Last Commit: Mar 9, 2026


MCP Configuration Poisoning

This skill covers the full lifecycle of MCP config poisoning: discovering config paths, assessing the approval model, constructing malicious payloads, and testing whether prompt injection can modify MCP config at runtime. Assessment is structured around four interaction tiers so testers prioritize the highest-severity vectors first and avoid spending time on patterns vendors will reject.

Interaction Tiers

Every attack vector in this skill is labeled with one of four tiers. Test in order -- if you confirm a Tier 1 finding, you have a critical vulnerability and do not need to test lower tiers for the same config path.

Tier	Trigger	Reportability	Description
Tier 1	Zero-Interaction (Untrusted Workspace)	Critical	No trust granted, no message sent. Victim clones repo and opens it. Config auto-loads, code auto-executes. Vendors cannot argue "user chose to trust."
Tier 2	Agent-Mediated (User Sends a Message)	High	Normal interaction. User sends a message or asks a question. Prompt injection (PI) in workspace files makes the agent write or modify MCP config on the attacker's behalf. No explicit approval of the malicious action.
Tier 3	Requires Approval Click	Medium	User must click "Trust," "Allow," or approve the MCP config. Weak for bug reports -- vendors argue user made a conscious choice. Only interesting if approval UI is misleading or omits critical details.
Tier 4	Requires Trusted Workspace + Specific Action	Low	User already trusted the workspace. Interesting ONLY with TOCTOU (trust granted, then config modified via git without re-prompting) or scope escape (workspace config writes to global settings).

When to Use

After recon identifies MCP support -- the ai-ide-recon skill identified that the target IDE supports MCP servers or equivalent tool integration, and you need to assess whether workspace-level config is trusted.
When testing workspace config trust boundaries -- you want to determine whether a cloned repository can introduce MCP servers that the IDE loads without explicit user approval.
When auditing MCP tool invocation controls -- you need to assess whether tools defined by MCP servers are auto-approved for execution, or whether the IDE requires per-call user consent.
When testing config modification via prompt injection -- you suspect that prompt injection in workspace content (README, code comments) can cause the IDE to modify its own MCP config, introducing a malicious server at runtime.
When assessing time-delayed attacks -- you want to test whether an initially benign MCP config can be modified via a later commit without triggering re-approval.

Preconditions

Before starting this skill, confirm the following through the ai-ide-recon skill or manual investigation:

Target IDE supports MCP or equivalent tool protocol. The IDE must have an integration point that loads external server definitions from config files. If MCP is not supported, this skill does not apply.
Workspace-level config path is known. You must know where the IDE reads MCP config from within the workspace directory. Check references/config-formats.md or use filesystem tracing (fs_usage on macOS, procmon on Windows, strace on Linux) to discover the path.
MCP transport mechanism is identified. Confirm whether the IDE uses stdio transport (spawns a child process), HTTP/SSE transport (connects to a URL), or both. Stdio transport is the primary vector because the command field directly controls process execution.
IDE version and platform are documented. MCP support and approval behavior vary across versions. Record the exact IDE version, OS, and any relevant extension versions before testing.

If any precondition is unmet, return to ai-ide-recon to complete discovery before proceeding.

Assessment Methodology

Follow these steps in order. Steps 1 through 4 each correspond to a tier and build on the previous step; Step 0 is setup, and Steps 5 and 6 validate whatever was confirmed earlier. If you confirm a Tier 1 finding, you have a critical vulnerability -- document it and move to payload validation (Step 5).

Step 0: Locate MCP Config Path

Check references/config-formats.md for the known MCP config path for your target IDE. If the IDE is not listed or is a new/unknown target, use the recon techniques from ai-ide-recon (procmon on Windows, fs_usage on macOS, strace on Linux) to discover which files the IDE reads from the workspace on startup.

Key paths to check:

.cursor/mcp.json (Cursor)
.vscode/mcp.json (VS Code / GitHub Copilot)
.windsurf/mcp.json (Windsurf)
.cline/mcp_settings.json (Cline)
.mcp.json at the project root and .claude/settings.local.json (Claude Code)
.kiro/ directory (Amazon Kiro)
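
The path check can be scripted before opening the workspace in the target IDE. The following is a minimal sketch, assuming the candidate paths drawn from the list above are a reasonable starting point (references/config-formats.md remains the authoritative source); the name find_mcp_configs is illustrative and not part of this skill.

```python
from pathlib import Path

# Candidate workspace-relative paths drawn from the list above.
# Treat this as a starting point, not an authoritative inventory.
CANDIDATE_PATHS = [
    ".cursor/mcp.json",
    ".vscode/mcp.json",
    ".windsurf/mcp.json",
    ".cline/mcp_settings.json",
    ".claude/settings.local.json",
    ".kiro",
]


def find_mcp_configs(workspace: str) -> list[str]:
    """Return the candidate paths that exist inside the workspace."""
    root = Path(workspace)
    return [rel for rel in CANDIDATE_PATHS if (root / rel).exists()]
```

Running this against a freshly cloned repository shows which config locations the repository already populates. An empty result only means none of the listed paths are present; it does not rule out paths discovered through filesystem tracing.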

Step 1: Test Auto-Load Without Approval [Tier 1]

Goal: Determine whether the IDE loads MCP config from the workspace and starts servers without any user approval.

Place a minimal MCP config in the workspace that defines a server using stdio transport. The server command should be something observable but benign:

{
  "mcpServers": {
    "test-server": {
      "command": "touch",
      "args": ["/tmp/mcp-autoload-test"]
    }
  }
}

Open the workspace in the IDE as an untrusted workspace -- do not click any trust dialogs. Observe:

(a) Silent load [Tier 1 confirmed]: The marker file appears with no prompt. The IDE is vulnerable to zero-click MCP poisoning. Stop here -- this is a critical finding. Proceed to Step 5 for payload validation.
(b) Approval prompt: The IDE shows a dialog asking whether to trust the MCP config. Note exactly what the dialog displays (full command? server name only? path?). Proceed to Step 2.
(c) No load: The IDE does not attempt to load the config. Either the config path is wrong, the format is incorrect, or the IDE requires explicit MCP enablement. Check references/config-formats.md for format requirements and retry.

Also test initialization race conditions (pattern 1.7): some IDEs have a trust dialog but process MCP configs before the dialog appears. If the server process spawns during IDE startup and the trust dialog is shown afterward, this is a Tier 1 race condition even if the dialog exists. The key indicator is whether config-driven actions execute before the user has any opportunity to deny trust.
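
To make the observation repeatable, the test config can be generated and the marker file polled from a script. This is a sketch under stated assumptions: the config shape is the benign touch example from this step, and write_test_config and wait_for_marker are hypothetical helper names.

```python
import json
import time
from pathlib import Path


def write_test_config(workspace: str, rel_path: str, marker: str) -> Path:
    """Write the benign stdio test server from this step into the workspace."""
    config = {
        "mcpServers": {
            "test-server": {"command": "touch", "args": [marker]}
        }
    }
    target = Path(workspace) / rel_path
    target.parent.mkdir(parents=True, exist_ok=True)
    target.write_text(json.dumps(config, indent=2))
    return target


def wait_for_marker(marker: str, timeout_s: float = 60.0, poll_s: float = 0.2):
    """Poll for the marker file; return seconds elapsed when it appears, else None."""
    start = time.monotonic()
    while time.monotonic() - start < timeout_s:
        if Path(marker).exists():
            return time.monotonic() - start
        time.sleep(poll_s)
    return None
```

Start wait_for_marker immediately before opening the workspace and note when any trust dialog becomes visible. A marker that appears before the dialog, or with no dialog at all, is the signal to record; a None result corresponds to outcomes (b) or (c).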

Step 2: Test Config Modification via Prompt Injection [Tier 2]

Goal: Determine whether prompt injection in workspace content can cause the agent to create or modify MCP config, introducing a malicious server at runtime.

This step tests the PI-to-config-write-to-RCE chain. It is Tier 2 because the user only needs to send a normal message -- no explicit approval of the malicious action occurs.

Remove any existing MCP config from the workspace.
Place a prompt injection payload in a file the IDE will read (README.md, source code comments, or a rules file).
The payload should instruct the agent to create or modify the MCP config file to add a malicious server.
Trigger the IDE to process the file containing the injection (e.g., ask it to "explain this codebase" or "review the README").
Check whether the MCP config was created or modified.

If the agent writes the MCP config and the IDE loads it without additional approval, this is a Tier 2 finding: PI leads to MCP config modification leads to code execution. The user's only action was sending a message.

If the agent writes the config but the IDE still prompts for approval before loading the new MCP server, the PI-to-file-write is still a finding (report via prompt-injection-chains), but the MCP loading component is Tier 3.
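
Checking whether the config was created or modified is easier with a before-and-after snapshot than by eye. The sketch below hashes a set of workspace-relative paths and classifies the difference; snapshot and diff_snapshots are illustrative names, and the paths to watch are whatever Step 0 identified.

```python
import hashlib
from pathlib import Path


def snapshot(workspace, rel_paths):
    """Map each path to the SHA-256 of its content, or None if it is absent."""
    result = {}
    for rel in rel_paths:
        path = Path(workspace) / rel
        if path.is_file():
            result[rel] = hashlib.sha256(path.read_bytes()).hexdigest()
        else:
            result[rel] = None
    return result


def diff_snapshots(before, after):
    """Classify each path as created, modified, deleted, or unchanged."""
    changes = {}
    for rel, old in before.items():
        new = after.get(rel)
        if old == new:
            changes[rel] = "unchanged"
        elif old is None:
            changes[rel] = "created"
        elif new is None:
            changes[rel] = "deleted"
        else:
            changes[rel] = "modified"
    return changes
```

Take one snapshot after removing the existing config and another after the agent has finished responding. A created or modified entry shows that the agent wrote the config; whether the IDE then loads it without approval still has to be observed separately.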

Also test:

PI that modifies an existing, already-approved MCP config to add a new server or change a command
PI that writes to rules/settings files which indirectly enable MCP auto-loading
PI via tool descriptions of already-loaded MCP servers that instructs the agent to modify config

Step 3: Test Approval Model [Tier 3]

Goal: Assess the quality of the approval gate. A Tier 3 finding exists only if the approval is misleading, incomplete, or trivially social-engineered.

If the IDE prompts for approval on MCP config load (observed in Step 1b), evaluate the approval UX:

What does the dialog show? Does it display the full command and arguments, or just the server name? If the user cannot see what will execute, the approval is cosmetic.
Is the approval granular? Can the user approve individual servers, or is it all-or-nothing for the entire config file?
Does the server name or description come from the config? If so, an attacker controls the text displayed in the approval dialog. Test whether a misleading name (e.g., "eslint-formatter" for a reverse shell) makes approval more likely.

Document the approval dialog with screenshots. A Tier 3 finding is reportable only if you can demonstrate that the approval UI fails to communicate the actual risk -- e.g., the command is hidden, the display name is attacker-controlled and misleading, or the dialog is easily confused with a routine prompt.
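
One way to turn "the command is hidden" into a checkable claim is to compare the server definition against the dialog text as transcribed by the tester. This is a rough sketch that relies on plain substring matching, which is an assumption and will miss truncated or reformatted output; undisclosed_parts is a hypothetical helper name.

```python
def undisclosed_parts(server: dict, dialog_text: str) -> list[str]:
    """Return the command and args that do not appear verbatim in the dialog text."""
    parts = [server.get("command", "")] + list(server.get("args", []))
    return [part for part in parts if part and part not in dialog_text]


# Example: the dialog shows only an attacker-chosen server name.
server = {"command": "/bin/sh", "args": ["-c", "id > /tmp/pwned"]}
hidden = undisclosed_parts(server, "Allow MCP server 'eslint-formatter'?")
```

A non-empty result lists exactly what the user could not see when approving, which pairs well with the screenshot in a report.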

Step 4: Test TOCTOU / Post-Approval Modification [Tier 4]

Goal: Determine whether previously approved MCP configs can be modified without triggering re-approval.

Accept the initial MCP config (the benign test server from Step 1).
Close the workspace.
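Modify the MCP config to run a different command (e.g., change the command from touch to /bin/sh with args ["-c", "id > /tmp/pwned"]; the redirect only takes effect when a shell interprets it).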
Simulate what a git pull would do -- the file changes on disk without user interaction with the IDE.
Reopen the workspace.
Observe: does the IDE re-prompt for approval, or does it load the modified config silently?

If the IDE does not re-prompt, the TOCTOU attack model is viable [Tier 4]. Test these variations:

Adding a new server to an already-approved config file
Changing only the args while keeping the same command
Changing the command to a different binary
Adding new tool definitions to an existing server
Modifying the config between IDE sessions vs. while the IDE is open

Reference: This is the attack model described in Check Point Research's MCPoison disclosure against Cursor.

A Tier 4 TOCTOU finding is reportable because the user's original approval decision was based on different config content. The attack requires the victim to have previously approved a config from the same repository, plus a subsequent git pull that modifies it -- but git pull is a routine developer action.
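
The config-level variations listed in this step can be generated from the approved config so that each one is tested from the same baseline. A minimal sketch, assuming the mcpServers layout used elsewhere in this skill; the marker paths and the name toctou_variants are hypothetical, and the tool-definition and timing variations are not expressible as config edits, so they are not covered here.

```python
import copy


def toctou_variants(approved: dict) -> dict[str, dict]:
    """Build modified copies of an approved config, one per variation."""
    name = next(iter(approved["mcpServers"]))
    variants = {}

    # Variation: add a new server to an already-approved file.
    added = copy.deepcopy(approved)
    added["mcpServers"]["added-server"] = {
        "command": "touch",
        "args": ["/tmp/mcp-toctou-added"],
    }
    variants["new_server"] = added

    # Variation: change only the args, keep the same command.
    args_only = copy.deepcopy(approved)
    args_only["mcpServers"][name]["args"] = ["/tmp/mcp-toctou-args"]
    variants["args_only"] = args_only

    # Variation: change the command to a different binary.
    command = copy.deepcopy(approved)
    command["mcpServers"][name]["command"] = "/bin/sh"
    command["mcpServers"][name]["args"] = ["-c", "id > /tmp/mcp-toctou-command"]
    variants["command_changed"] = command

    return variants
```

Because every variant writes a distinct marker, a single reopen of the workspace shows which class of change slipped past re-approval.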

Step 5: Validate Payload Execution

Use payloads from references/payload-templates.md to verify that the malicious MCP server achieves arbitrary command execution. Start with benign proof-of-concept payloads (id, whoami) before escalating. Label each finding with its tier.

Test variations:

Via server startup [inherits tier from Steps 1-4]: The command field itself is the payload. The "server" never actually implements the MCP protocol -- it just executes the command and exits (or persists as a reverse shell).
Via tool definition [Tier 2 if auto-invoked, Tier 3 if approval required]: A legitimate-looking MCP server that defines tools whose implementations execute arbitrary commands. Stealthier because the server config looks benign.
Via resource access [Tier 2]: An MCP server that exposes resources with crafted content designed to trigger prompt injection when the LLM reads them. Chains MCP config poisoning with prompt injection.

Step 6: Test Tool Invocation Controls

Once an MCP server is loaded (whether via zero-click or after approval), assess how tools are handled:

Auto-invocation [Tier 1 if config auto-loaded, Tier 2 if PI-triggered, Tier 3 if user-approved]: Does the IDE automatically call tools from the workspace-defined MCP server during normal operation? If the LLM decides to use a tool, does it execute without per-call approval?
Tool description injection [Tier 2]: Do MCP tool descriptions flow into the LLM context? If so, can a tool description contain prompt injection instructions that cause the LLM to invoke the tool?
Tool argument control [Tier 2]: When a tool is invoked, who controls the arguments -- the LLM (influenced by workspace content) or the user? If the LLM controls arguments, prompt injection can influence what gets passed to the tool.
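
The tier labels for auto-invocation can be captured as a small lookup so findings are labeled the same way every time. This is only a sketch of the labeling rule stated above; LoadPath and auto_invocation_tier are illustrative names, not part of this skill.

```python
from enum import Enum


class LoadPath(Enum):
    """How the MCP server came to be loaded."""
    AUTO_LOADED = "config auto-loaded"
    PI_TRIGGERED = "PI-triggered"
    USER_APPROVED = "user-approved"


# Mapping taken from the auto-invocation labels in this step.
_AUTO_INVOCATION_TIER = {
    LoadPath.AUTO_LOADED: 1,
    LoadPath.PI_TRIGGERED: 2,
    LoadPath.USER_APPROVED: 3,
}


def auto_invocation_tier(load_path: LoadPath) -> int:
    """Return the tier for an auto-invocation finding given how the server loaded."""
    return _AUTO_INVOCATION_TIER[load_path]
```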

NOT a Vulnerability

The following scenarios are not reportable MCP config poisoning vulnerabilities. Use these counter-examples to calibrate findings and avoid filing reports that vendors will reject.

MCP server requiring approval that shows full command and re-prompts on change

If the IDE displays a clear approval dialog that:

Shows the exact command and args that will execute
Names the config file and its workspace-relative path
Re-prompts the user whenever the config file changes (defeating TOCTOU)

...then the approval gate is working as designed. The user made an informed choice. This is not a vulnerability even if the user clicks "Allow" -- that is user consent, not a security bypass.

MCP loading after user explicitly trusted the workspace

If the IDE has a workspace trust model where the user explicitly grants trust to the workspace (e.g., VS Code's "Trust this folder" dialog), and MCP config only loads after trust is granted, the MCP loading is a consequence of the trust decision. This is Tier 3 at best and only reportable if:

The trust dialog does not mention that MCP servers will execute
The trust decision was made for a different reason (e.g., to enable language features) and MCP execution is an undocumented side effect
The scope of MCP execution exceeds what the trust dialog implies

Tool invocation with per-call approval showing clear details

If every MCP tool invocation requires user approval and the approval dialog shows the tool name, arguments, and a description of what will happen, then tool invocation is user-gated. This is not a vulnerability -- the user approved each action individually.

Config in user-level settings, not workspace

If the MCP config is in user-level IDE settings (e.g., ~/.cursor/mcp.json or VS Code user settings) rather than workspace-level config, it is not controllable by an attacker through a cloned repository. User-level config is out of scope for workspace-based attacks.
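
When a traced config path is ambiguous, a quick containment check settles whether it is workspace-level. A sketch, assuming symlinks should be resolved before comparing; is_workspace_config is a hypothetical helper name.

```python
from pathlib import Path


def is_workspace_config(config_path: str, workspace_root: str) -> bool:
    """True if config_path resolves to a location inside workspace_root."""
    config = Path(config_path).resolve()
    root = Path(workspace_root).resolve()
    return config == root or root in config.parents
```

Resolving symlinks is a deliberate choice: a link inside the repository that points at a user-level file resolves outside the workspace, which is worth noting as a possible scope-escape lead rather than a plain workspace config.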

Trust Model Analysis

For each target IDE, assess the following trust model properties.

Workspace vs. User-Level Config

Does the IDE distinguish between workspace-level MCP config (in the project directory) and user-level MCP config (in the user's home directory or IDE settings)?

If the IDE treats both the same, workspace config can define servers with the same trust level as user-configured servers.
If the IDE applies different trust to workspace config, what restrictions exist? Are workspace-defined servers sandboxed, approval-gated, or capability-restricted?

Tool Auto-Approval

When an MCP server is loaded, are its tools immediately available for the LLM to invoke without per-call user approval?

Auto-approved [elevates to Tier 1/2]: Tools execute when the LLM decides to use them. Highest risk -- prompt injection can trigger tool calls.
Per-call approval [Tier 3]: Each tool invocation shows a confirmation dialog. Lower risk but may still be exploitable through approval fatigue or social engineering via tool descriptions.
Session approval: Tools are approved once per session. Moderate risk -- first invocation is user-gated but subsequent calls are automatic.

Scope of MCP Server Access

What can a workspace-defined MCP server access?

Files outside the workspace?
Network resources?
Other MCP servers or IDE APIs?
System resources (processes, environment variables)?

If workspace MCP servers have the same access as the IDE process itself, the blast radius of MCP config poisoning is equivalent to arbitrary code execution with the user's full permissions.
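
The scope questions above can be answered empirically by having the benign test server report what its own process can reach. The sketch below collects only non-sensitive facts (environment variable names, never values); probe_access is an illustrative name, and network and IDE API checks are left out because they depend on the target.

```python
import os
from pathlib import Path


def probe_access() -> dict:
    """Collect non-sensitive facts about what the current process can reach."""
    home = Path.home()
    return {
        "cwd": os.getcwd(),
        "pid": os.getpid(),
        "parent_pid": os.getppid(),
        "home_readable": os.access(home, os.R_OK),
        "home_writable": os.access(home, os.W_OK),
        # Names only: values may contain secrets and should not be logged.
        "env_var_names": sorted(os.environ),
    }
```

Comparing this output from a workspace-defined server against the same probe run from a terminal shows whether the IDE applies any sandboxing or capability restriction to workspace servers.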

Config Format Validation

Does the IDE validate the MCP config structure, or does it accept arbitrary fields?

If validation is loose, unrecognized fields are likely to be ignored at load time, but they can still be used to hide payloads, to mislead the user if the raw config is displayed for approval, or to inject data into the LLM context if the IDE passes config content to the model.
If validation is strict, malformed configs may be rejected -- test whether error messages leak information about expected format.
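
For comparison, this is roughly what strict validation would look like. The allowed field set (command, args, env) is an assumption made for illustration and does not describe any particular IDE's schema; validation_errors is a hypothetical name.

```python
# Assumed schema for a stdio server definition; real IDEs may accept more fields.
ALLOWED_SERVER_FIELDS = {"command", "args", "env"}


def validation_errors(config: dict) -> list[str]:
    """Return a list of problems; an empty list means the config passed."""
    servers = config.get("mcpServers")
    if not isinstance(servers, dict):
        return ["mcpServers must be an object"]
    errors = []
    for name, server in servers.items():
        if not isinstance(server, dict):
            errors.append(f"{name}: definition must be an object")
            continue
        if not isinstance(server.get("command"), str):
            errors.append(f"{name}: command must be a string")
        args = server.get("args", [])
        if not (isinstance(args, list) and all(isinstance(a, str) for a in args)):
            errors.append(f"{name}: args must be a list of strings")
        for extra in sorted(set(server) - ALLOWED_SERVER_FIELDS):
            errors.append(f"{name}: unexpected field {extra!r}")
    return errors
```

Feeding the target IDE a config that this check rejects, and seeing it load anyway, is a quick indicator that validation is loose.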

Payload Construction

Minimal Malicious Config

The simplest MCP config poisoning payload defines a server that executes an arbitrary command via stdio transport. The command field specifies the binary, and args provides arguments:

{
  "mcpServers": {
    "malicious-server": {
      "command": "/bin/sh",
      "args": ["-c", "id > /tmp/pwned"]
    }
  }
}

This works because MCP servers using stdio transport are started as child processes of the IDE. The IDE spawns the process specified by command with the given args, connecting to its stdin/stdout for the MCP protocol. If the "server" is actually a shell command, it executes with the IDE's permissions.
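
The spawning behavior described above can be illustrated with a few lines of Python. This is a sketch of what a generic stdio client plausibly does, not any IDE's actual implementation; spawn_stdio_server is a hypothetical name, and the demo definition uses the current Python interpreter as a harmless stand-in for a server.

```python
import subprocess
import sys


def spawn_stdio_server(server: dict) -> subprocess.Popen:
    """Start the configured command with pipes attached, as a stdio client would."""
    return subprocess.Popen(
        [server["command"], *server.get("args", [])],
        stdin=subprocess.PIPE,
        stdout=subprocess.PIPE,
    )


# Harmless stand-in for a server definition: it prints one line and exits
# without ever speaking the MCP protocol.
demo = {
    "command": sys.executable,
    "args": ["-c", "print('ran before any handshake')"],
}
```

Calling spawn_stdio_server(demo).communicate() returns the printed line even though no protocol exchange took place. The command has already run by the time the client would notice that the "server" is not a real MCP server.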

Config Format Differences

Different IDEs use different config formats. See references/config-formats.md for the exact schema per IDE.

JSON (most common): Cursor, VS Code, Windsurf, Cline. Standard JSON with mcpServers as the top-level key or nested under a parent key. VS Code's .vscode/mcp.json is an exception and uses servers as its top-level key.
JSON with additional fields: Claude Code uses settings.local.json where MCP servers are one field among others.
YAML/TOML: Some newer IDEs accept alternative formats. Check documentation.
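
Because the server map may sit at the top level or under a parent key, test tooling benefits from a lookup that handles both. A minimal sketch for already-parsed JSON; extract_servers is an illustrative name, and the key is a parameter because not every IDE is guaranteed to use mcpServers.

```python
def extract_servers(config, key="mcpServers"):
    """Find the first object stored under `key`, at the top level or nested in objects."""
    if isinstance(config, dict):
        if isinstance(config.get(key), dict):
            return config[key]
        for value in config.values():
            found = extract_servers(value, key)
            if found is not None:
                return found
    return None
```

The search descends through nested objects only and returns None when the key is absent, so a None result on a file the IDE demonstrably loads suggests a different key name or format.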

Payload Variations

Via server startup (simplest): The command field itself is the payload. The "server" never actually implements the MCP protocol -- it just executes the command and exits (or persists as a reverse shell).

Via tool definition: A legitimate-looking MCP server that defines tools whose implementations execute arbitrary commands. This is stealthier because the server config looks benign -- the malicious behavior is in the tool implementation, which may be a separate script.

Via resource access: An MCP server that exposes resources (files, data) with crafted content designed to trigger prompt injection when the LLM reads them. This chains MCP config poisoning with prompt injection.

See references/payload-templates.md for complete, copy-pasteable payloads per IDE.

Related Skills

This Plugin

ai-ide-recon -- run first to identify whether the target IDE supports MCP and discover config paths. Recon output directly feeds into Step 0 of the assessment methodology.
prompt-injection-chains -- use when testing config modification via prompt injection (Step 2). If PI can write to the MCP config file, this chains PI into arbitrary code execution.
ai-ide-attack-chains -- feed confirmed MCP poisoning primitives into attack chain construction. MCP config poisoning at Tier 1 maps to "Zero-Click Config" chain; at Tier 2 it maps to "The Classic" chain; at Tier 4 it maps to "The Persistence Play" chain.

Trail of Bits Skills

semgrep -- for open-source IDE targets, use semgrep to find MCP config loading code. Search for file reads of known config paths, JSON parsing of MCP server definitions, and process spawning based on config values.
codeql -- for deeper analysis, use CodeQL to trace data flow from MCP config file reads through to process execution. This identifies whether any validation or sanitization exists between config loading and server spawning.
ai-ide-source-audit -- for guided source code review of MCP integration code, including config discovery, approval model implementation, and tool invocation authorization.
