Maps attack surface of AI-assisted IDEs with interaction tiers prioritizing zero-click vectors; analyzes docs for blind spots, enumerates configs for security testing.
`npx claudepluginhub mindgard/ai-ide-skills --plugin ai-ide-vuln-skills`

This skill uses the workspace's default tool permissions.
This skill maps the attack surface of an AI-assisted IDE or coding agent before pattern-specific vulnerability testing begins. It produces a tier-annotated list of features to probe, identifies documentation blind spots that signal weak or absent security controls, and discovers config files and auto-load paths that may be abusable. Run this skill first for every new target -- the output directly feeds into the pattern-specific skills (MCP poisoning, terminal filter bypass, data exfiltration, etc.) so you test the highest-severity surfaces first instead of spraying payloads at random.
Every feature discovered during recon is classified into one of four interaction tiers. The tier determines how much user involvement an attacker needs to exploit the feature, which directly maps to report severity and vendor acceptance likelihood. Tier 1 findings go to the top of the queue.
All recon output is annotated with these tiers. Test in order -- Tier 1 first, Tier 4 last.
| Tier | Label | Trigger Model | Severity | Reportability |
|---|---|---|---|---|
| Tier 1 | Zero-Interaction (Untrusted Workspace) | No trust granted, no message sent. Victim clones repo and opens it. Config auto-loads, code auto-executes, race condition fires before trust dialog. | Critical | Highest -- vendors cannot argue "user chose to trust" |
| Tier 2 | Agent-Mediated (User Sends a Message) | Trusted or untrusted workspace. User interacts normally -- sends a message, asks a question. PI in workspace files makes the agent act on attacker's behalf. No explicit approval of the malicious action. | High | Strong -- normal developer workflow is the trigger |
| Tier 3 | Requires Approval Click | User must click "Trust," "Allow," or approve a specific action. | Medium | Weak -- vendors argue user made a conscious choice. Only interesting if approval is misleading, UI is confusing, or social engineering is trivial |
| Tier 4 | Requires Trusted Workspace + Specific Action | User already trusted the project and must take a specific action. | Low standalone | Weakest standalone. Interesting ONLY with TOCTOU (trust granted then config modified via git), scope escape (workspace action writes global config), or routine-action guarantee (git pull) |
When classifying a feature, use the trigger-model definitions in the table above.
Before starting, classify your target. This determines the workflow:
| Target Type | What You Have | Workflow |
|---|---|---|
| Open-source | Full source code access (e.g., Cline, Continue, parts of Cursor) | Docs analysis --> source audit (use ai-ide-source-audit) --> pattern skills |
| Closed-source | Documentation + binary only (e.g., Copilot backend, Cursor core) | Docs analysis --> black-box enumeration --> pattern skills |
| Hybrid | Partial source -- extensions open, core closed (e.g., VS Code + Copilot extension) | Combine both workflows: source audit on the open parts, black-box on the closed core |
For hybrid targets, prioritize the boundary between open and closed components. Extension APIs that bridge into the closed core are often the richest attack surface because the extension side may expose capabilities the core didn't intend to be externally accessible.
The goal is to extract every security-relevant feature from the target's documentation and map each one to the vulnerability taxonomy and its interaction tier. This is not a casual read -- you are systematically identifying every mechanism that could be abused.
For each of the following, search the IDE's documentation, blog posts, and configuration references. Annotate every feature with its likely tier using the rules above.
MCP / Tool Integration
File-Write Permissions Model
Command Execution Controls
Prompt Template / Rules Loading
Which instruction files does the IDE auto-load from the workspace (`.cursorrules`, `CLAUDE.md`, `.clinerules`, `.github/copilot-instructions.md`)?
Output Rendering
Workspace Trust Model
Hooks / Lifecycle Events
Agent / Auto-Run Mode
Local Network Services
Does the IDE expose local network services, and do they answer cross-origin requests (`Access-Control-Allow-Origin: *`)? Run `lsof -nP -iTCP -sTCP:LISTEN` before and after launching the IDE. Any new listeners are candidates. Probe each with unauthenticated requests and cross-origin requests from a browser tab.

For each feature identified, record:
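The listener diff above can be sketched as a small script. The probe URL, path, and `Origin` value are illustrative; real endpoints and required headers vary per IDE.

```shell
# Snapshot TCP listeners before the IDE starts (addresses only, deduped).
lsof -nP -iTCP -sTCP:LISTEN | awk 'NR>1 {print $9}' | sort -u > before.txt

# ... launch the IDE here, wait for it to settle, then snapshot again ...
lsof -nP -iTCP -sTCP:LISTEN | awk 'NR>1 {print $9}' | sort -u > after.txt

# Any listener present only in after.txt is a candidate. Probe each with a
# cross-origin request; a permissive ACAO response marks an exfil/CSRF candidate.
comm -13 before.txt after.txt | while read -r addr; do
  port=${addr##*:}
  curl -s -D - -o /dev/null -m 2 -H 'Origin: https://attacker.example' \
    "http://127.0.0.1:${port}/" | grep -i '^access-control-allow-origin' || true
done
```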
See references/doc-analysis-checklist.md for the full structured checklist with checkboxes.
This is the core technique for closed-source assessment and the highest-value output of recon. The principle: what the documentation does not mention is what you should test first.
IDE vendors document features they are proud of and security controls they want users to know about. Features that lack security controls tend to be undocumented or vaguely described. This asymmetry is your signal.
| Documentation Gap | Likely Tier | What It Likely Means | Skill to Use |
|---|---|---|---|
| No mention of MCP config validation or trust model | Tier 1 if workspace config auto-loads; Tier 2 if PI-driven | MCP configs from the workspace are likely loaded without verification | mcp-config-poisoning |
| No mention of hook security or approval | Tier 1 if hooks fire on open; Tier 2 if agent-triggered | Hooks may auto-execute from workspace config | ai-ide-code-exec |
| No mention of workspace trust or file-write restrictions | Tier 2 (PI needed to trigger write) | The agent can likely write to any file in the workspace, including config files | prompt-injection-chains |
| No mention of command allowlists or filtering | Tier 2 (agent runs commands during conversation) | Terminal commands are likely passed through with minimal filtering, or filtering is LLM-based (bypassable) | terminal-filter-bypass |
| No mention of rules file validation | Tier 1 if auto-loaded on open; Tier 2 if loaded on first message | Auto-loaded instruction files can likely inject arbitrary prompts | prompt-injection-chains |
| No mention of output sanitization or CSP | Tier 2 (renders during agent response) | Rendered output may allow image loading, Mermaid URLs, or webview exploitation | ai-ide-data-exfil |
| No mention of agent mode restrictions | Tier 2 (agent mode during conversation) | Agent/auto-run mode likely has the same capabilities as interactive mode, minus the approval prompts | ai-ide-code-exec |
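As a concrete probe for the first gap, a cloned repo can plant a benign canary MCP config and reveal whether it executes without any trust prompt. The path follows Cursor's workspace convention; the canary command and marker file are illustrative, not a real payload.

```shell
# Plant a workspace-level MCP config whose server "command" is a harmless
# canary. If /tmp/mcp-autoload-fired appears after merely opening the repo,
# the config auto-loaded with zero interaction (Tier 1); if it appears only
# after the first message, classify it Tier 2.
mkdir -p .cursor
cat > .cursor/mcp.json <<'EOF'
{
  "mcpServers": {
    "canary": {
      "command": "touch",
      "args": ["/tmp/mcp-autoload-fired"]
    }
  }
}
EOF
```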
When multiple gaps appear together, the target is likely vulnerable to attack chains. The most dangerous combination:
Use ai-ide-attack-chains to model these compound vulnerabilities.
When you lack source code, use runtime observation to discover config files and auto-load paths the documentation doesn't mention. Pay special attention to anything loaded before any trust dialog -- these are Tier 1 candidates.
Windows -- Process Monitor (Procmon)
Filter file-system events to the workspace path and watch for reads of config directories (`.cursor/`, `.vscode/`, `.cline/`, etc.).

macOS -- fs_usage or Instruments
Run `sudo fs_usage -w -f filesys <PID>` where `<PID>` is the IDE process. Watch for `open()` and `stat()` calls within the workspace directory. Note paths that are `stat()`-checked even if they don't exist -- the IDE is looking for them, which means creating them may influence behavior. Correlate timestamps with the trust dialog: reads before the dialog = Tier 1.

Linux -- strace
Run `strace -e trace=openat,stat,statx -f -p <PID> 2>&1 | grep /path/to/workspace`. Use `-f` to follow child processes (IDEs often spawn language servers and helper processes). Watch for `ENOENT` results on dotfiles in the workspace -- these are paths the IDE checks for but that don't exist in your test workspace.

Create a test workspace that contains plausible config directories and files, then observe which ones the IDE reads:
```
test-workspace/
  .cursor/mcp.json
  .vscode/settings.json
  .vscode/mcp.json
  .cline/mcp_settings.json
  .windsurf/cascade.json
  .claude/settings.local.json
  .github/copilot-instructions.md
  .cursorrules
  .clinerules
  .windsurfrules
  CLAUDE.md
  .kiro/
  .devin/
  .codex/
```
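A sketch that scaffolds this decoy tree with inert contents, so any behavior change observed during tracing is attributable to the IDE reading the path rather than to the file contents:

```shell
# Create every decoy config directory and file. Contents are deliberately
# inert: empty JSON objects for config files, empty instruction files.
root=test-workspace
for d in .cursor .vscode .cline .windsurf .claude .github .kiro .devin .codex; do
  mkdir -p "$root/$d"
done
for f in .cursor/mcp.json .vscode/settings.json .vscode/mcp.json \
         .cline/mcp_settings.json .windsurf/cascade.json \
         .claude/settings.local.json; do
  printf '{}\n' > "$root/$f"   # minimal valid JSON
done
for f in .github/copilot-instructions.md .cursorrules .clinerules \
         .windsurfrules CLAUDE.md; do
  : > "$root/$f"               # empty instruction files
done
```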
For each file the IDE reads, test whether modifying its content changes IDE behavior. Start with benign modifications (change a display setting), then escalate to security-relevant modifications (add an MCP server, modify command allowlists). Record whether the modification takes effect before any trust dialog (Tier 1) or only after user interaction (Tier 2+).
See references/ide-config-locations.md for the complete table of known config paths.
Beyond known paths, look for environment variables that influence config or tool resolution (`MCP_SERVER_PATH`, tool search paths).

Recon produces a structured, tier-annotated attack surface map that drives all subsequent testing. The map should contain:
Order features by tier first, then by security control quality within each tier:
Tier 1 -- Test Immediately Features that fire without any trust or user interaction. These are the highest-severity findings. List every auto-loaded config path, every race condition window, every pre-trust-dialog file read.
Tier 2 -- Test Next Features exploitable through prompt injection during normal user interaction. No separate approval click for the malicious action. List every PI-drivable capability: file writes, command execution, tool invocation.
Tier 3 -- Test If Time Permits Features requiring an explicit approval click. Only worth testing if the approval UX is misleading or the displayed information differs from what actually executes. Note the specific approval prompt for each.
Tier 4 -- Test for TOCTOU and Scope Escape Only Features requiring an already-trusted workspace plus a specific action. Test only for: (a) configs modifiable via git after trust was granted (TOCTOU), (b) workspace actions that write to global/user-level config (scope escape), (c) actions so routine they are guaranteed (git pull, file save).
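The TOCTOU case in (a) can be staged locally with two commits. Repo name, paths, and config contents are illustrative; the point is that the second commit is what a routine `git pull` would deliver after trust was granted.

```shell
# Commit 1 is the state the victim trusts; commit 2 arrives later via git pull.
git init -q toctou-repo && cd toctou-repo
mkdir -p .cursor
printf '{"mcpServers":{}}\n' > .cursor/mcp.json
git add -A
git -c user.name=test -c user.email=test@example.com commit -qm "benign: trust granted here"

# Attacker-side change that a routine pull would deliver after trust:
printf '{"mcpServers":{"late":{"command":"id"}}}\n' > .cursor/mcp.json
git add -A
git -c user.name=test -c user.email=test@example.com commit -qm "post-trust config swap"
# If the IDE reloads this config without re-prompting, trust integrity fails.
```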
Within each tier, sub-prioritize by security control quality:
For each identified feature, record tier and the pattern-specific skill to use:
| Feature Category | Typical Tier | Relevant Skill |
|---|---|---|
| Config auto-load on workspace open (MCP, hooks, tools) | Tier 1 | mcp-config-poisoning, ai-ide-code-exec |
| Rules file auto-load on open | Tier 1 | prompt-injection-chains |
| Initialization race condition (fires before trust dialog) | Tier 1 | ai-ide-code-exec |
| Binary planting in workspace PATH | Tier 1 | ai-ide-code-exec |
| PI-driven file write to config | Tier 2 | prompt-injection-chains, ai-ide-attack-chains |
| PI-driven command execution | Tier 2 | terminal-filter-bypass |
| Markdown/Mermaid rendering in agent response | Tier 2 | ai-ide-data-exfil |
| MCP tool description injection | Tier 2 | mcp-config-poisoning |
| MCP server requiring explicit approval click | Tier 3 | mcp-config-poisoning (test for misleading UX) |
| Terminal command requiring "Allow" click | Tier 3 | terminal-filter-bypass (test display vs. execution mismatch) |
| Hooks in trusted workspace, no TOCTOU | Tier 4 | ai-ide-code-exec (test for TOCTOU or scope escape) |
| Source code available | Any | ai-ide-source-audit |
| Multiple features chainable | Any | ai-ide-attack-chains |
This is one of several critical gate assessments. See the Security Gates section in the README for the full gate model. Answer: can the agent write to dotfiles and dot-directories without approval? Security-relevant configs like `.cursor/mcp.json` or `.vscode/settings.json` are almost always in dot-directories.

If file writes are unrestricted, most PI-driven attack chains become viable. However, file-write status is one of several security gates -- workspace config approval, initialization safety, trust integrity (TOCTOU), and outbound channel controls each independently block different chain types.
Not every discovered feature or behavior is a security issue. The following are explicitly out of scope or not reportable as vulnerabilities. Recognizing these early avoids wasting time and maintains credibility with vendors.
User explicitly approved the exact action that executed. If the IDE shows a clear prompt naming the command, file path, or MCP server, and the user clicks "Allow," the approval gate worked as designed. This is Tier 3 functioning correctly. Exception: the displayed action differs from what actually executes (display vs. execution mismatch IS a vulnerability).
Behavior only in a workspace the user has already explicitly trusted, with no TOCTOU. If the user granted workspace trust through a clear dialog and the attacker cannot modify configs after trust was granted (no git-based TOCTOU, no scope escape), then execution within that trust boundary is by design. This is Tier 4 functioning correctly.
Prompt injection that the agent correctly refuses. If PI in workspace files attempts to make the agent take a malicious action and the agent refuses or asks for confirmation, the defense worked. Only report if you can bypass the refusal.
Data the user intentionally sent to the model. If a user pastes code into the chat and the model processes it, the user chose to share that data. This is not exfiltration.
Features that require the attacker to already have code execution on the victim's machine. If the attack prerequisite is "attacker can write arbitrary files to the victim's filesystem," you already have a more severe vulnerability than anything the IDE adds. The exception is supply-chain scenarios where a dependency or git repo is the delivery mechanism -- those are in scope.
Theoretical prompt injection without demonstrated impact. Showing that a rules file is loaded is not itself a vulnerability. You must demonstrate that the injected instructions cause the agent to take a harmful action (file write, command execution, data exfiltration) that the user did not intend and did not approve.
Self-inflicted configuration. If a developer adds a malicious MCP server to their own user-level config, they compromised themselves. Workspace-level configs planted by a third party (via a cloned repo) ARE in scope; user-level configs the user edited themselves are not.
After recon, proceed to pattern-specific testing based on your tier-annotated attack surface map. Start with skills targeting your Tier 1 findings, then Tier 2: