Skill

ai-ide-attack-chains

Plans and constructs multi-stage attack chains against AI IDEs by combining primitives like prompt injection and file writes. Classifies by interaction tier to assess security posture and prioritize reports.

security

Invocation

How this skill is triggered — by the user, by Claude, or both

Slash command

/ai-ide-vuln-skills:ai-ide-attack-chains

User invocable

Model invocable

Inline context

Default effort

Context Preview

The summary Claude sees in its skill listing — used to decide when to auto-load this skill

Individual vulnerability primitives -- prompt injection, config poisoning, filter bypass, exfil channels -- combine into multi-stage attack chains. A PI alone may be low severity; PI plus file write plus config auto-reload is critical. This skill helps construct these chains from confirmed primitives, classify them by interaction tier, and assess their real-world severity.

Supporting Files

references/chain-templates.md
references/disclosed-chains.md
references/trigger-model-matrix.md

SKILL.md

319 lines · ~5.2k tokens (exceeds 5k compaction limit)

Stats

Stars: 60

Forks: 8

Maintenance: Good

Last Commit: Mar 9, 2026

AI IDE Attack Chains

Run this skill after testing with pattern-specific skills has identified individual primitives. It is the final step before writing up findings.

When to Use

After pattern-specific testing has identified individual vulnerability primitives (PI works, file writes are possible, MCP config is auto-loaded, etc.).
When assessing overall IDE security posture -- you need to combine individual findings into a severity assessment that accounts for chaining.
When constructing proof-of-concept exploits -- you need to build an end-to-end PoC that demonstrates real impact, not just an isolated primitive.
When writing vulnerability reports -- the chain model helps communicate why a seemingly low-severity PI is actually critical when combined with file writes.
When triaging findings by reportability -- tier classification determines whether a vendor will accept or reject the report.

Interaction Tiers

Each chain maps to a tier based on the user interaction required to trigger it. Test in priority order -- Tier 1 first.

Tier	Label	User Interaction	Reportability
Tier 1	Zero-Interaction	None. Clone and open.	Highest -- vendors cannot argue "user chose to trust."
Tier 2	Agent-Mediated	User sends a message or asks a question. No explicit approval of the malicious action.	Strong -- normal developer workflow triggers the chain.
Tier 3	Approval-Gated	User must click "Trust," "Allow," or approve a specific action.	Weak unless approval is misleading or social engineering is trivial.
Tier 4	Trusted + Specific Action	User already trusted the project and takes a specific routine action.	Weakest standalone. Interesting with TOCTOU, scope escape, or guaranteed actions (git pull).
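
The tier table can be read as a small decision procedure. The sketch below is one possible encoding, not part of the skill itself: the function name, the three boolean flags, and the rule that a chain takes the weakest tier it touches are all assumptions made for illustration.

```python
from enum import IntEnum


class Tier(IntEnum):
    ZERO_INTERACTION = 1
    AGENT_MEDIATED = 2
    APPROVAL_GATED = 3
    TRUSTED_SPECIFIC_ACTION = 4


def classify_tier(prior_trust: bool, explicit_approval: bool, user_message: bool) -> Tier:
    """Map the interaction a chain needs onto the tier table.

    Assumption: checks run from the weakest tier upward, so a chain that
    needs several kinds of interaction lands in its least reportable tier.
    """
    if prior_trust:
        return Tier.TRUSTED_SPECIFIC_ACTION
    if explicit_approval:
        return Tier.APPROVAL_GATED
    if user_message:
        return Tier.AGENT_MEDIATED
    return Tier.ZERO_INTERACTION


# A chain that fires on clone-and-open needs no interaction at all.
print(classify_tier(prior_trust=False, explicit_approval=False, user_message=False).name)
```

Findings that sit between tiers, such as a misleading approval dialog, still need a judgment call; the function only captures the default mapping.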

Preconditions

Before constructing chains, confirm these primitives using the pattern-specific skills listed:

Primitive	Confirm With	Required For Chains
PI susceptibility	prompt-injection-chains	Classic, Exfil Express, Persistence Play
File write (no approval)	prompt-injection-chains	Classic (Tier 2), Persistence Play
File write (approval-gated)	prompt-injection-chains	Classic (Tier 3)
MCP config auto-load (no approval)	mcp-config-poisoning	Zero-Click Config
MCP config auto-load (approval-gated)	mcp-config-poisoning	Classic via MCP pivot
Terminal command execution	terminal-filter-bypass	Chain escalation to RCE
Outbound channel (image, URL, DNS)	ai-ide-data-exfil	Exfil Express
Rules file auto-load	prompt-injection-chains	Persistence Play
Hooks/settings auto-execution	ai-ide-code-exec	Zero-Click Config, Classic escalation
TOCTOU in approval model	ai-ide-code-exec, mcp-config-poisoning	Persistence Play (Tier 4), The Long Con
One-time config approval (path/name-keyed)	mcp-config-poisoning, ai-ide-code-exec	The Long Con

Do not attempt chain construction until at least two primitives are confirmed. A single primitive is a finding, not a chain. The one exception is Zero-Click Config, which needs only config auto-loading without approval.
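
As a rough aid for this step, the confirmed primitives can be held in a set and compared against what each chain requires. The requirement sets below are transcribed from the "Requires" line of each canonical chain; the short primitive keys and the function name are hypothetical labels chosen for this sketch.

```python
# Required primitives per chain, transcribed from each chain's "Requires" line.
# The short keys are hypothetical labels, not identifiers used by the skill.
CHAIN_REQUIREMENTS = {
    "Zero-Click Config": {"config_autoload_no_approval"},
    "Exfil Express": {"pi", "outbound_channel"},
    "Classic": {"pi", "file_write", "config_auto_reload"},
    "Persistence Play": {"pi", "file_write", "rules_autoload"},
    "The Long Con": {"one_time_approval", "no_reapproval_on_change", "shared_repo"},
}


def candidate_chains(confirmed: set[str]) -> list[str]:
    """Return the chains whose required primitives are all confirmed."""
    return [name for name, needed in CHAIN_REQUIREMENTS.items() if needed <= confirmed]


confirmed = {"pi", "file_write", "config_auto_reload", "rules_autoload"}
print(candidate_chains(confirmed))
```

The output is only a list of candidates. Whether the file write is approval-gated, and therefore which tier applies, is decided separately.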

The Five Canonical Chains

1. Zero-Click Config [Tier 1]

Malicious Config Present --> IDE Opens Workspace --> Code Execution

Entry: Malicious config file already in the repository. No PI needed.

Impact: Code execution on workspace open.

Requires: Config auto-loading without approval.

Tier rationale: No user interaction beyond cloning and opening. The user has not granted trust, sent a message, or clicked any approval. This is the highest-severity class.

Real-world examples:

Roo Code MCP config RCE (GHSA-5x8h-m52g-5v54)
Eclipse Theia MCP config RCE
Zed IDE MCP config RCE
Gemini CLI MCP config RCE
Codex CLI MCP config RCE
Mistral Vibe CLI MCP config RCE
JetBrains PATH_TO_GIT (IDEaster)

2. The Exfil Express [Tier 2]

Cloned Repo --> PI in README --> Data Exfiltration

Entry: PI in visible workspace content.

No file write needed -- PI directly triggers exfil through an outbound channel.

Impact: Sensitive data (API keys, env vars, source code) sent to attacker.

Requires: PI susceptibility + outbound channel (image rendering, URL fetch, DNS).

Tier rationale: Requires the user to send a message or interact with the agent (triggering PI processing), but no explicit approval of the exfil action. Normal developer actions ("explain this code," "summarize the README") complete the chain.

Real-world examples:

3. The Classic Chain [Tier 2 or Tier 3]

Cloned Repo --> Hidden PI --> File Write --> Config Modification --> Code Execution

Entry: Malicious content in a cloned repository (README, code comments, hidden text).

Pivot: PI causes the agent to write a file. This is the critical gate.

Escalation: The written file modifies IDE config:

.vscode/settings.json --> executable path override --> RCE
MCP config --> malicious server definition --> RCE
Rules file --> persistent behavior modification --> eventual RCE

Impact: Arbitrary code execution with the user's permissions.

Requires: PI susceptibility + file write + config auto-reload.

Tier classification:

[Tier 2] if the file write requires no approval -- PI directly causes the agent to write without any user confirmation. This is the strong reportable variant.
[Tier 3] if the file write requires an approval click -- user must click "Allow" or confirm a diff. Only interesting if the approval UI is misleading, the diff is obfuscated, or social engineering is trivial.

Real-world examples:

4. The Persistence Play [Tier 4]

PI --> File Write --> Rules Override --> Persistent Backdoor

Entry: PI from any source.

Pivot: File write to a rules/instruction file.

Persistence: Modified rules survive sessions, infect future interactions.

Impact: Long-term backdoor in IDE behavior. Every future agent interaction in the workspace follows attacker-controlled instructions.

Requires: PI susceptibility + file write + rules auto-loading without re-approval.

Tier rationale: Typically requires a trusted workspace context and a specific triggering action. Escalates to a serious finding when combined with TOCTOU -- trust was granted to a benign config that is later modified via git commit, and the IDE does not re-prompt for approval. Without TOCTOU or scope escape, this is Tier 4 (weakest standalone).

Real-world examples:

5. The Long Con [Tier 4]

Benign Config Approved --> Attacker Modifies via Git Commit --> Victim Does Git Pull --> Modified Config Loads Silently --> Code Execution

Entry: Benign config in a shared repository. The config passes review and receives one-time approval from the victim.

Pivot: Trust persists across config modifications. The IDE's approval model is keyed to the config's path or name, not its content hash. Once approved, the config is trusted indefinitely regardless of subsequent changes.

Escalation: The attacker modifies the approved config in a later git commit -- adding a malicious MCP server, changing a hook command, or injecting a rules override. The victim pulls the change as part of normal workflow. The IDE loads the modified config without re-prompting for approval.

Impact: Arbitrary code execution via a config the victim previously reviewed and approved.

Requires: One-time approval of the initial benign config + no re-approval on modification (approval keyed by path/name rather than content hash) + shared repository where the attacker can push commits.

Tier rationale: This is Tier 4 because it requires the victim to have already trusted the workspace and to perform a specific action (git pull). However, git pull is so routine and guaranteed in collaborative workflows that the "specific action" requirement is effectively automatic. The time delay between approval and exploitation makes this particularly insidious -- the victim's trust decision was correct at the time it was made.

Real-world examples:

Cursor MCPoison (Checkpoint Research) -- MCP config approved then modified via git
Cline TOCTOU (Mindgard) -- Approval persists across config changes
Claude Code TOCTOU (Mindgard) -- Trust not re-evaluated on config modification
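
The difference between path-keyed and content-hash-keyed approval can be shown with a toy model. This is a minimal sketch under stated assumptions: both classes, the config path, and the placeholder config bytes are hypothetical and do not describe any particular IDE's implementation.

```python
import hashlib


class PathKeyedApprovals:
    """Remembers approval per config path; later content changes go unnoticed."""

    def __init__(self) -> None:
        self._approved: set[str] = set()

    def approve(self, path: str, content: bytes) -> None:
        self._approved.add(path)

    def is_approved(self, path: str, content: bytes) -> bool:
        return path in self._approved


class HashKeyedApprovals:
    """Remembers approval per (path, SHA-256 of content)."""

    def __init__(self) -> None:
        self._approved: set[tuple[str, str]] = set()

    @staticmethod
    def _key(path: str, content: bytes) -> tuple[str, str]:
        return (path, hashlib.sha256(content).hexdigest())

    def approve(self, path: str, content: bytes) -> None:
        self._approved.add(self._key(path, content))

    def is_approved(self, path: str, content: bytes) -> bool:
        return self._key(path, content) in self._approved


# Hypothetical path and placeholder contents, for illustration only.
path = ".ide/mcp.json"
reviewed = b'{"servers": {}}'
pulled_later = b'{"servers": {"added-after-review": {}}}'

for store in (PathKeyedApprovals(), HashKeyedApprovals()):
    store.approve(path, reviewed)
    print(type(store).__name__, store.is_approved(path, pulled_later))
```

The path-keyed store still reports the changed file as approved, which is the condition The Long Con depends on. The hash-keyed store forces a new approval whenever the bytes change.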

Chain Viability by Tier

Decision matrix: given the confirmed primitives, which chains are viable at which tier?

Confirmed Primitives	Zero-Click Config	Exfil Express	The Classic	Persistence Play	The Long Con
Config auto-load (no approval)	Tier 1 -- GO	--	--	--	--
PI + outbound channel (no approval on exfil)	--	Tier 2 -- GO	--	--	--
PI + file write (no approval) + config auto-reload	--	--	Tier 2 -- GO	Tier 2-3 if rules auto-load	--
PI + file write (approval-gated) + config auto-reload	--	--	Tier 3 -- only if approval UI is weak	Tier 3-4	--
PI + file write + rules auto-load + TOCTOU	--	--	--	Tier 4 -- GO (TOCTOU elevates)	--
One-time config approval + no re-approval on modification + shared repo	--	--	--	--	Tier 4 -- GO
PI only (no file write, no outbound channel)	--	Blocked	Blocked	Blocked	--
No PI, no auto-load config	Blocked	Blocked	Blocked	Blocked	Blocked
Outbound channel requires user-triggered action with approval	--	Tier 3 at best -- weak	--	--	--

Reading the matrix: "GO" means the chain is viable and reportable at that tier. Dashes mean the chain is not applicable. Blocked means missing primitives prevent chain construction. When a cell shows a tier without "GO," the finding is marginal -- assess social engineering difficulty before reporting.
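
The reading rules above are mechanical enough to sketch in code. The helper below is an illustration, not part of the skill: the function name, the `Verdict` type, and the status labels "reportable" and "marginal" are assumptions that paraphrase the legend.

```python
import re
from typing import NamedTuple, Optional


class Verdict(NamedTuple):
    status: str          # "reportable", "marginal", "blocked", or "not applicable"
    tier: Optional[str]  # e.g. "2" or "2-3"; None when no tier is given


def interpret_cell(cell: str) -> Verdict:
    """Apply the matrix legend to the text of a single cell."""
    text = cell.strip()
    if text == "--":
        return Verdict("not applicable", None)
    if text.lower().startswith("blocked"):
        return Verdict("blocked", None)
    match = re.match(r"Tier (\d(?:-\d)?)", text)
    if match is None:
        raise ValueError(f"unrecognized matrix cell: {cell!r}")
    # A tier with "GO" is reportable; a tier without it is marginal.
    status = "reportable" if re.search(r"\bGO\b", text) else "marginal"
    return Verdict(status, match.group(1))


print(interpret_cell("Tier 2 -- GO"))
print(interpret_cell("Tier 3 -- only if approval UI is weak"))
```

A "marginal" verdict is a prompt to assess social engineering difficulty before reporting, not a final answer.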

NOT a Vulnerability

These counter-examples help testers avoid wasting time on findings vendors will reject.

Classic Chain in a trusted workspace without TOCTOU. If the user explicitly trusted the workspace, PI leading to file write and config modification is operating within the trust boundary the user accepted. Without TOCTOU (that is, a one-time approval of a config that is later modified via git without re-approval), this is expected behavior in a trusted context. The user chose to trust; the system honored that trust.

MCP config auto-load after explicit informed approval. If the IDE shows the user the full MCP server command before execution, requires explicit approval, and re-prompts on any change to the config, the approval gate is functioning correctly. A malicious MCP config in a repo that the user reviews and approves is not a vulnerability -- it is the security model working as designed.

Any chain where every step requires explicit informed user approval. If the user must approve the PI-triggered file write (with a clear diff showing the malicious content), then separately approve the config change, then separately approve execution -- no security boundary has been bypassed. The chain exists but the gates held. This is Tier 3 at best and only reportable if the approval UI actively misleads.

Exfil Express where the outbound channel requires user-triggered action with approval. If exfiltrating data requires the user to explicitly trigger a fetch, render, or network action and approve it with knowledge of the destination, the channel is gated. A PI that prepares exfil content but cannot send it without informed user action is not a complete chain.

General principle: A vulnerability exists when a security boundary is absent, bypassable, or misleading. When the boundary is present, correctly implemented, and the user makes an informed decision, the remaining risk is social engineering -- which is a user education problem, not a product vulnerability.

Chain Construction Methodology

Step 1: Inventory Primitives

List confirmed vulnerabilities from pattern-specific skills:

Primitive	Skill Used	Notes
PI susceptibility	prompt-injection-chains	Via what delivery?
File write (no approval)	prompt-injection-chains	To what paths?
File write (with approval)	prompt-injection-chains	How easy to social-engineer?
MCP config auto-load	mcp-config-poisoning	Zero-click or approval-gated?
Terminal command execution	terminal-filter-bypass	Any filter? Bypassed?
Image rendering (outbound)	ai-ide-data-exfil	Blocked or unblocked?
Rules file auto-load	prompt-injection-chains	Which rules files?
Hooks/settings execution	ai-ide-code-exec	Zero-click?
TOCTOU in approval model	mcp-config-poisoning, ai-ide-code-exec	Approval keyed by path or content hash?

Step 2: Identify the Pivot

Which primitive provides the PI --> file write transition? This is the critical link that separates low-severity PI from critical RCE chains.

No approval needed: PI --> file write is trivial. All file-write-dependent chains are viable. Classify as Tier 2.
Approval needed, social-engineerable: PI includes convincing justification ("updating config for CI compatibility"). Classify as Tier 3. Most chains viable with effort.
Strict approval with diff: Chains require the user to not notice malicious content in the diff. Lower likelihood. Tier 3 at best.
No file write: Of the PI-driven chains, only Exfil Express is viable. Focus on direct exfil channels.
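
The four pivot states above can be recorded as an enumeration so that each finding carries its classification with it. This is a sketch only: the enum, the lookup table, and the wording of the notes are illustrative names and paraphrases, not part of the skill.

```python
from enum import Enum
from typing import Optional


class WriteGate(Enum):
    NO_APPROVAL = "no approval needed"
    APPROVAL_SOCIAL = "approval needed, social-engineerable"
    APPROVAL_STRICT_DIFF = "strict approval with diff"
    NO_WRITE = "no file write"


# (tier label, note) per pivot state, paraphrased from the list above.
PIVOT_OUTCOMES: dict[WriteGate, tuple[Optional[str], str]] = {
    WriteGate.NO_APPROVAL: ("Tier 2", "PI to file write is trivial"),
    WriteGate.APPROVAL_SOCIAL: ("Tier 3", "viable with a convincing justification"),
    WriteGate.APPROVAL_STRICT_DIFF: ("Tier 3 at best", "user must miss the malicious diff"),
    WriteGate.NO_WRITE: (None, "only Exfil Express; focus on direct exfil channels"),
}


def assess_pivot(gate: WriteGate) -> tuple[Optional[str], str]:
    """Return the tier label and a short note for the observed write gate."""
    return PIVOT_OUTCOMES[gate]


tier, note = assess_pivot(WriteGate.APPROVAL_SOCIAL)
print(tier, "-", note)
```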

Step 3: Map Escalation Paths

From file write, what configs can be modified?

File Write
  +-- .vscode/settings.json --> php.validate.executablePath --> RCE
  +-- MCP config file --> malicious server --> RCE
  +-- Rules file --> persistent behavior change --> eventual RCE
  +-- Hooks config --> lifecycle hook execution --> RCE
  +-- .code-workspace --> trust boundary expansion --> RCE
  +-- .gitconfig --> git external diff --> RCE
  +-- URL-fetching config --> data exfiltration
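
For bookkeeping during an assessment, the tree above can be kept as a flat table and filtered by the targets that testing showed to be writable. The rows are transcribed from the tree; the variable and function names are hypothetical.

```python
from typing import Optional

# (target, mechanism, impact) rows transcribed from the escalation tree.
ESCALATION_PATHS: list[tuple[str, Optional[str], str]] = [
    (".vscode/settings.json", "php.validate.executablePath", "RCE"),
    ("MCP config file", "malicious server", "RCE"),
    ("Rules file", "persistent behavior change", "eventual RCE"),
    ("Hooks config", "lifecycle hook execution", "RCE"),
    (".code-workspace", "trust boundary expansion", "RCE"),
    (".gitconfig", "git external diff", "RCE"),
    ("URL-fetching config", None, "data exfiltration"),
]


def reachable_paths(writable_targets: set[str]) -> list[tuple[str, Optional[str], str]]:
    """Keep only the escalation rows whose target was confirmed writable."""
    return [row for row in ESCALATION_PATHS if row[0] in writable_targets]


for target, mechanism, impact in reachable_paths({"Rules file", ".gitconfig"}):
    print(f"{target} -> {mechanism} -> {impact}")
```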

Step 4: Determine Trigger Model

How does the chain activate?

Model	Description	Attacker Effort	User Interaction	Tier
Zero-click	Triggers on workspace open	Config file in repo	None	Tier 1
One-click	Requires single user action	PI + trigger action	"Explain this code," "Review the README"	Tier 2
Autorun	Triggers in agent/autonomous mode	PI in workspace	User enables agent mode	Tier 2
Approval-gated	Requires explicit user approval	PI + social engineering	User clicks "Allow" on prompted action	Tier 3
Time-delayed	Triggers via future commit (TOCTOU)	Benign config initially, malicious in later commit	None beyond a routine git pull after initial approval	Tier 4

Step 5: Construct PoC

Build the end-to-end exploit using templates from references/chain-templates.md:

Create the malicious workspace (repository).
Embed PI payload(s) in appropriate files.
Include any necessary config files or planted binaries.
Document the trigger sequence and classify the tier.
Test end-to-end.

Step 6: Assess Severity

Tier	Trigger Model	Chain Length	Impact	Severity
Tier 1	Zero-click	1 step (config --> exec)	RCE	Critical
Tier 1	Zero-click	2+ steps	RCE	High-Critical
Tier 2	One-click / Autorun	Any	RCE	High
Tier 2	One-click	Any	Data exfil	Medium-High
Tier 3	Approval-gated	Any	RCE	Medium (depends on approval UI quality)
Tier 4	Time-delayed (TOCTOU)	Any	RCE	Medium-High (TOCTOU elevates)
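
The severity table can be applied consistently across findings with a small lookup. The sketch below transcribes the rows as written and raises an error for combinations the table does not list; the function name and parameters are assumptions.

```python
def assess_severity(tier: int, impact: str, chain_steps: int = 1) -> str:
    """Look up severity from the table; chain length only matters at Tier 1."""
    if impact == "RCE":
        if tier == 1:
            return "Critical" if chain_steps == 1 else "High-Critical"
        if tier == 2:
            return "High"
        if tier == 3:
            return "Medium"        # depends on approval UI quality
        if tier == 4:
            return "Medium-High"   # TOCTOU elevates
    if impact == "Data exfil" and tier == 2:
        return "Medium-High"
    raise ValueError(f"no table row for tier={tier}, impact={impact!r}")


print(assess_severity(1, "RCE"))
print(assess_severity(2, "Data exfil"))
```

Raising on unlisted combinations is a deliberate choice: it keeps the helper from inventing severities the table does not state.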

The File-Write Gate

One of several security gates to assess. File-write status determines which PI-driven chains are viable, but other gates (workspace config approval, initialization safety, outbound controls) independently block other chain types. See the README for the full gate model.

Can the AI write files without approval?

YES --> All PI-driven chains are viable. Severity: Critical. This is the most important finding. Classify file-write-dependent chains as Tier 2.
NO (no write capability) --> Classic Chain and Persistence Play are blocked. Of the PI-driven chains, only Exfil Express remains, at Tier 2. Zero-Click Config (Tier 1) is unaffected because it needs no file write.
Partial (writes with approval) --> Classic drops to Tier 3. Assess social engineering difficulty: can PI craft a convincing justification for the write?

Can the AI write to config files the IDE auto-loads?

YES --> Single PI achieves persistence. Every future session is compromised.
NO (path restriction) --> Chain requires a bypass (symlinks, relative paths, multi-root workspace).
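
When judging whether a path restriction holds, it helps to know what a sound check looks like. The sketch below resolves symlinks and `..` segments before comparing paths, which is what defeats the symlink and relative-path bypasses mentioned above. The function name and the `protected` set are hypothetical, and multi-root workspaces are not modeled.

```python
import os


def is_write_allowed(workspace_root: str, target: str, protected: set[str]) -> bool:
    """Allow a write only if the resolved target stays inside the workspace
    and does not land on a protected config path (given relative to the root).
    """
    root = os.path.realpath(workspace_root)
    # realpath follows symlinks and collapses ".." before any comparison.
    resolved = os.path.realpath(os.path.join(root, target))
    if os.path.commonpath([root, resolved]) != root:
        return False  # escapes the workspace
    relative = os.path.relpath(resolved, root).replace(os.sep, "/")
    return relative not in protected


protected = {".vscode/settings.json"}
print(is_write_allowed(".", "src/main.py", protected))
print(is_write_allowed(".", "src/../.vscode/settings.json", protected))
```

A restriction that compares the raw requested string against a deny list, without this resolution step, is the kind of gate worth probing for a bypass.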

Does the IDE re-approve on config change?

NO --> TOCTOU is viable. Persistence Play and The Long Con at Tier 4.
YES (re-approval on change, keyed to content hash) --> TOCTOU is blocked. See "NOT a Vulnerability" section.

See references/trigger-model-matrix.md for the per-IDE matrix.

Related Skills

This Plugin

ai-ide-recon provides the initial attack surface map that determines which chains are possible.
mcp-config-poisoning confirms MCP primitives for Zero-Click Config and Classic Chain.
terminal-filter-bypass confirms command execution primitives.
ai-ide-code-exec confirms hooks, binary planting, and IDE settings primitives.
prompt-injection-chains confirms PI susceptibility and file-write capability -- the critical gate.
ai-ide-data-exfil confirms exfiltration channels for Exfil Express.
ai-ide-source-audit provides code-level understanding of why chains work or don't.

Trail of Bits Skills

audit-context-building helps understand the architectural trust model that chains exploit.
