Maps attack surface of AI-assisted IDEs with interaction tiers prioritizing zero-click vectors; analyzes docs for blind spots, enumerates configs for security testing.
`npx claudepluginhub mindgard/ai-ide-skills --plugin ai-ide-vuln-skills`

This skill uses the workspace's default tool permissions.
This skill maps the attack surface of an AI-assisted IDE or coding agent before pattern-specific vulnerability testing begins. It produces a tier-annotated list of features to probe, identifies documentation blind spots that signal weak or absent security controls, and discovers config files and auto-load paths that may be abusable. Run this skill first for every new target -- the output directly feeds into the pattern-specific skills (MCP poisoning, terminal filter bypass, data exfiltration, etc.) so you test the highest-severity surfaces first instead of spraying payloads at random.
Every feature discovered during recon is classified into one of four interaction tiers. The tier determines how much user involvement an attacker needs to exploit the feature, which directly maps to report severity and vendor acceptance likelihood. Tier 1 findings go to the top of the queue.
All recon output is annotated with these tiers. Test in order -- Tier 1 first, Tier 4 last.
| Tier | Label | Trigger Model | Severity | Reportability |
|---|---|---|---|---|
| Tier 1 | Zero-Interaction (Untrusted Workspace) | No trust granted, no message sent. Victim clones repo and opens it. Config auto-loads, code auto-executes, race condition fires before trust dialog. | Critical | Highest -- vendors cannot argue "user chose to trust" |
| Tier 2 | Agent-Mediated (User Sends a Message) | Trusted or untrusted workspace. User interacts normally -- sends a message, asks a question. PI in workspace files makes the agent act on attacker's behalf. No explicit approval of the malicious action. | High | Strong -- normal developer workflow is the trigger |
| Tier 3 | Requires Approval Click | User must click "Trust," "Allow," or approve a specific action. | Medium | Weak -- vendors argue user made a conscious choice. Only interesting if approval is misleading, UI is confusing, or social engineering is trivial |
| Tier 4 | Requires Trusted Workspace + Specific Action | User already trusted the project and must take a specific action. | Low standalone | Weakest standalone. Interesting ONLY with TOCTOU (trust granted then config modified via git), scope escape (workspace action writes global config), or routine-action guarantee (git pull) |
When classifying a feature, use the trigger-model definitions in the table above.
Before starting, classify your target. This determines the workflow:
| Target Type | What You Have | Workflow |
|---|---|---|
| Open-source | Full source code access (e.g., Cline, Continue, parts of Cursor) | Docs analysis --> source audit (use ai-ide-source-audit) --> pattern skills |
| Closed-source | Documentation + binary only (e.g., Copilot backend, Cursor core) | Docs analysis --> black-box enumeration --> pattern skills |
| Hybrid | Partial source -- extensions open, core closed (e.g., VS Code + Copilot extension) | Combine both workflows: source audit on the open parts, black-box on the closed core |
For hybrid targets, prioritize the boundary between open and closed components. Extension APIs that bridge into the closed core are often the richest attack surface because the extension side may expose capabilities the core didn't intend to be externally accessible.
The goal is to extract every security-relevant feature from the target's documentation and map each one to the vulnerability taxonomy and its interaction tier. This is not a casual read -- you are systematically identifying every mechanism that could be abused.
For each of the following, search the IDE's documentation, blog posts, and configuration references. Annotate every feature with its likely tier using the rules above.
MCP / Tool Integration
File-Write Permissions Model
Command Execution Controls
Prompt Template / Rules Loading
Which instruction files does the IDE auto-load from the workspace (`.cursorrules`, `CLAUDE.md`, `.clinerules`, `.github/copilot-instructions.md`)?
Output Rendering
Workspace Trust Model
Hooks / Lifecycle Events
Agent / Auto-Run Mode
Local Network Services
Does the IDE expose local network services, and do they answer cross-origin requests (`Access-Control-Allow-Origin: *`)? Run `lsof -nP -iTCP -sTCP:LISTEN` before and after launching the IDE. Any new listeners are candidates. Probe each with unauthenticated requests and cross-origin requests from a browser tab.

For each feature identified, record:
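The listener diff above can be sketched as a small script. The probe URL, path, and `Origin` value are illustrative; real endpoints and required headers vary per IDE.

```shell
# Snapshot TCP listeners before the IDE starts (addresses only, deduped).
lsof -nP -iTCP -sTCP:LISTEN | awk 'NR>1 {print $9}' | sort -u > before.txt

# ... launch the IDE here, wait for it to settle, then snapshot again ...
lsof -nP -iTCP -sTCP:LISTEN | awk 'NR>1 {print $9}' | sort -u > after.txt

# Any listener present only in after.txt is a candidate. Probe each with a
# cross-origin request; a permissive ACAO response marks an exfil/CSRF candidate.
comm -13 before.txt after.txt | while read -r addr; do
  port=${addr##*:}
  curl -s -D - -o /dev/null -m 2 -H 'Origin: https://attacker.example' \
    "http://127.0.0.1:${port}/" | grep -i '^access-control-allow-origin' || true
done
```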
See references/doc-analysis-checklist.md for the full structured checklist with checkboxes.
This is the core technique for closed-source assessment and the highest-value output of recon. The principle: what the documentation does not mention is what you should test first.
IDE vendors document features they are proud of and security controls they want users to know about. Features that lack security controls tend to be undocumented or vaguely described. This asymmetry is your signal.
| Documentation Gap | Likely Tier | What It Likely Means | Skill to Use |
|---|---|---|---|
| No mention of MCP config validation or trust model | Tier 1 if workspace config auto-loads; Tier 2 if PI-driven | MCP configs from the workspace are likely loaded without verification | mcp-config-poisoning |
| No mention of hook security or approval | Tier 1 if hooks fire on open; Tier 2 if agent-triggered | Hooks may auto-execute from workspace config | ai-ide-code-exec |
| No mention of workspace trust or file-write restrictions | Tier 2 (PI needed to trigger write) | The agent can likely write to any file in the workspace, including config files | prompt-injection-chains |
| No mention of command allowlists or filtering | Tier 2 (agent runs commands during conversation) | Terminal commands are likely passed through with minimal filtering, or filtering is LLM-based (bypassable) | terminal-filter-bypass |
| No mention of rules file validation | Tier 1 if auto-loaded on open; Tier 2 if loaded on first message | Auto-loaded instruction files can likely inject arbitrary prompts | prompt-injection-chains |
| No mention of output sanitization or CSP | Tier 2 (renders during agent response) | Rendered output may allow image loading, Mermaid URLs, or webview exploitation | ai-ide-data-exfil |
| No mention of agent mode restrictions | Tier 2 (agent mode during conversation) | Agent/auto-run mode likely has the same capabilities as interactive mode, minus the approval prompts | ai-ide-code-exec |
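As a concrete probe for the first gap, a cloned repo can plant a benign canary MCP config and reveal whether it executes without any trust prompt. The path follows Cursor's workspace convention; the canary command and marker file are illustrative, not a real payload.

```shell
# Plant a workspace-level MCP config whose server "command" is a harmless
# canary. If /tmp/mcp-autoload-fired appears after merely opening the repo,
# the config auto-loaded with zero interaction (Tier 1); if it appears only
# after the first message, classify it Tier 2.
mkdir -p .cursor
cat > .cursor/mcp.json <<'EOF'
{
  "mcpServers": {
    "canary": {
      "command": "touch",
      "args": ["/tmp/mcp-autoload-fired"]
    }
  }
}
EOF
```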
When multiple gaps appear together, the target is likely vulnerable to attack chains. The most dangerous combination:
Use ai-ide-attack-chains to model these compound vulnerabilities.
When you lack source code, use runtime observation to discover config files and auto-load paths the documentation doesn't mention. Pay special attention to anything loaded before any trust dialog -- these are Tier 1 candidates.
Windows -- Process Monitor (Procmon)
Filter file-system events to the workspace path and watch for reads of config directories (`.cursor/`, `.vscode/`, `.cline/`, etc.).

macOS -- fs_usage or Instruments
Run `sudo fs_usage -w -f filesys <PID>` where `<PID>` is the IDE process. Watch for `open()` and `stat()` calls within the workspace directory. Note paths that are `stat()`-checked even if they don't exist -- the IDE is looking for them, which means creating them may influence behavior. Correlate timestamps with the trust dialog: reads before the dialog = Tier 1.

Linux -- strace
Run `strace -e trace=openat,stat,statx -f -p <PID> 2>&1 | grep /path/to/workspace`. Use `-f` to follow child processes (IDEs often spawn language servers and helper processes). Watch for `ENOENT` results on dotfiles in the workspace -- these are paths the IDE checks for but that don't exist in your test workspace.

Create a test workspace that contains plausible config directories and files, then observe which ones the IDE reads:
```
test-workspace/
  .cursor/mcp.json
  .vscode/settings.json
  .vscode/mcp.json
  .cline/mcp_settings.json
  .windsurf/cascade.json
  .claude/settings.local.json
  .github/copilot-instructions.md
  .cursorrules
  .clinerules
  .windsurfrules
  CLAUDE.md
  .kiro/
  .devin/
  .codex/
```
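A sketch that scaffolds this decoy tree with inert contents, so any behavior change observed during tracing is attributable to the IDE reading the path rather than to the file contents:

```shell
# Create every decoy config directory and file. Contents are deliberately
# inert: empty JSON objects for config files, empty instruction files.
root=test-workspace
for d in .cursor .vscode .cline .windsurf .claude .github .kiro .devin .codex; do
  mkdir -p "$root/$d"
done
for f in .cursor/mcp.json .vscode/settings.json .vscode/mcp.json \
         .cline/mcp_settings.json .windsurf/cascade.json \
         .claude/settings.local.json; do
  printf '{}\n' > "$root/$f"   # minimal valid JSON
done
for f in .github/copilot-instructions.md .cursorrules .clinerules \
         .windsurfrules CLAUDE.md; do
  : > "$root/$f"               # empty instruction files
done
```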
For each file the IDE reads, test whether modifying its content changes IDE behavior. Start with benign modifications (change a display setting), then escalate to security-relevant modifications (add an MCP server, modify command allowlists). Record whether the modification takes effect before any trust dialog (Tier 1) or only after user interaction (Tier 2+).
See references/ide-config-locations.md for the complete table of known config paths.
Beyond known paths, look for environment variables that influence config or tool resolution (`MCP_SERVER_PATH`, tool search paths).

Recon produces a structured, tier-annotated attack surface map that drives all subsequent testing. The map should contain:
Order features by tier first, then by security control quality within each tier:
Tier 1 -- Test Immediately Features that fire without any trust or user interaction. These are the highest-severity findings. List every auto-loaded config path, every race condition window, every pre-trust-dialog file read.
Tier 2 -- Test Next Features exploitable through prompt injection during normal user interaction. No separate approval click for the malicious action. List every PI-drivable capability: file writes, command execution, tool invocation.
Tier 3 -- Test If Time Permits Features requiring an explicit approval click. Only worth testing if the approval UX is misleading or the displayed information differs from what actually executes. Note the specific approval prompt for each.
Tier 4 -- Test for TOCTOU and Scope Escape Only Features requiring an already-trusted workspace plus a specific action. Test only for: (a) configs modifiable via git after trust was granted (TOCTOU), (b) workspace actions that write to global/user-level config (scope escape), (c) actions so routine they are guaranteed (git pull, file save).
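The TOCTOU case in (a) can be staged locally with two commits. Repo name, paths, and config contents are illustrative; the point is that the second commit is what a routine `git pull` would deliver after trust was granted.

```shell
# Commit 1 is the state the victim trusts; commit 2 arrives later via git pull.
git init -q toctou-repo && cd toctou-repo
mkdir -p .cursor
printf '{"mcpServers":{}}\n' > .cursor/mcp.json
git add -A
git -c user.name=test -c user.email=test@example.com commit -qm "benign: trust granted here"

# Attacker-side change that a routine pull would deliver after trust:
printf '{"mcpServers":{"late":{"command":"id"}}}\n' > .cursor/mcp.json
git add -A
git -c user.name=test -c user.email=test@example.com commit -qm "post-trust config swap"
# If the IDE reloads this config without re-prompting, trust integrity fails.
```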
Within each tier, sub-prioritize by security control quality:
For each identified feature, record tier and the pattern-specific skill to use:
| Feature Category | Typical Tier | Relevant Skill |
|---|---|---|
| Config auto-load on workspace open (MCP, hooks, tools) | Tier 1 | mcp-config-poisoning, ai-ide-code-exec |
| Rules file auto-load on open | Tier 1 | prompt-injection-chains |
| Initialization race condition (fires before trust dialog) | Tier 1 | ai-ide-code-exec |
| Binary planting in workspace PATH | Tier 1 | ai-ide-code-exec |
| PI-driven file write to config | Tier 2 | prompt-injection-chains, ai-ide-attack-chains |
| PI-driven command execution | Tier 2 | terminal-filter-bypass |
| Markdown/Mermaid rendering in agent response | Tier 2 | ai-ide-data-exfil |
| MCP tool description injection | Tier 2 | mcp-config-poisoning |
| MCP server requiring explicit approval click | Tier 3 | mcp-config-poisoning (test for misleading UX) |
| Terminal command requiring "Allow" click | Tier 3 | terminal-filter-bypass (test display vs. execution mismatch) |
| Hooks in trusted workspace, no TOCTOU | Tier 4 | ai-ide-code-exec (test for TOCTOU or scope escape) |
| Source code available | Any | ai-ide-source-audit |
| Multiple features chainable | Any | ai-ide-attack-chains |
This is one of several critical gate assessments. See the Security Gates section in the README for the full gate model. Answer: can the agent write to dotfiles and dot-directories without approval? Security-relevant configs like `.cursor/mcp.json` or `.vscode/settings.json` are almost always in dot-directories.

If file writes are unrestricted, most PI-driven attack chains become viable. However, file-write status is one of several security gates -- workspace config approval, initialization safety, trust integrity (TOCTOU), and outbound channel controls each independently block different chain types.
Not every discovered feature or behavior is a security issue. The following are explicitly out of scope or not reportable as vulnerabilities. Recognizing these early avoids wasting time and maintains credibility with vendors.
User explicitly approved the exact action that executed. If the IDE shows a clear prompt naming the command, file path, or MCP server, and the user clicks "Allow," the approval gate worked as designed. This is Tier 3 functioning correctly. Exception: the displayed action differs from what actually executes (display vs. execution mismatch IS a vulnerability).
Behavior only in a workspace the user has already explicitly trusted, with no TOCTOU. If the user granted workspace trust through a clear dialog and the attacker cannot modify configs after trust was granted (no git-based TOCTOU, no scope escape), then execution within that trust boundary is by design. This is Tier 4 functioning correctly.
Prompt injection that the agent correctly refuses. If PI in workspace files attempts to make the agent take a malicious action and the agent refuses or asks for confirmation, the defense worked. Only report if you can bypass the refusal.
Data the user intentionally sent to the model. If a user pastes code into the chat and the model processes it, the user chose to share that data. This is not exfiltration.
Features that require the attacker to already have code execution on the victim's machine. If the attack prerequisite is "attacker can write arbitrary files to the victim's filesystem," you already have a more severe vulnerability than anything the IDE adds. The exception is supply-chain scenarios where a dependency or git repo is the delivery mechanism -- those are in scope.
Theoretical prompt injection without demonstrated impact. Showing that a rules file is loaded is not itself a vulnerability. You must demonstrate that the injected instructions cause the agent to take a harmful action (file write, command execution, data exfiltration) that the user did not intend and did not approve.
Self-inflicted configuration. If a developer adds a malicious MCP server to their own user-level config, they compromised themselves. Workspace-level configs planted by a third party (via a cloned repo) ARE in scope; user-level configs the user edited themselves are not.
After recon, proceed to pattern-specific testing based on your tier-annotated attack surface map. Start with skills targeting your Tier 1 findings, then Tier 2: