From ai-security-skills
Tests LLM-integrated apps for prompt injection vulnerabilities using Arcanum PI Taxonomy's 13 intents, 18 techniques, and 20 evasions. Use for red-teaming AI apps, guardrail validation, and OWASP LLM01 assessments.
npx claudepluginhub cmaenner/agent-security-playbook
This skill uses the workspace's default tool permissions.
Systematically test an LLM application's prompt injection defenses by following the full procedure in `plays/tier4-ai-security/prompt-injection-testing.md`.
Based on the Arcanum PI Taxonomy by Jason Haddix (Arcanum Information Security). CC BY 4.0.
Scope and Input Surface Mapping — Identify all paths where attacker-controlled content reaches the LLM: direct (chat, API params) and indirect (file uploads, web fetches, RAG docs, tool outputs, MCP resources).
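A surface inventory like the one described above can be kept as a small data structure so that later steps iterate over it consistently. The following is a minimal sketch, not part of the playbook: the `InputSurface` and `Channel` types and the `in_scope` flag are hypothetical names introduced for illustration, and the surface names simply repeat the examples from the step.

```python
from dataclasses import dataclass
from enum import Enum


class Channel(Enum):
    DIRECT = "direct"      # attacker sends the content to the LLM themselves
    INDIRECT = "indirect"  # content reaches the LLM through another system


@dataclass(frozen=True)
class InputSurface:
    name: str
    channel: Channel
    in_scope: bool = True  # hypothetical flag: is testing this surface authorized?


# Names repeat the examples in the step above; the scope flags are made up.
SURFACES = [
    InputSurface("chat", Channel.DIRECT),
    InputSurface("API params", Channel.DIRECT),
    InputSurface("file uploads", Channel.INDIRECT),
    InputSurface("web fetches", Channel.INDIRECT),
    InputSurface("RAG docs", Channel.INDIRECT),
    InputSurface("tool outputs", Channel.INDIRECT),
    InputSurface("MCP resources", Channel.INDIRECT, in_scope=False),
]


def surfaces_by_channel(surfaces):
    """Group the names of in-scope surfaces by channel."""
    grouped = {channel: [] for channel in Channel}
    for surface in surfaces:
        if surface.in_scope:
            grouped[surface.channel].append(surface.name)
    return grouped
```

Calling `surfaces_by_channel(SURFACES)` gives one list of direct and one list of indirect surfaces, with out-of-scope entries left out.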
Test by Attack Intent (13 intents) — For each authorized intent, attempt to achieve the attacker's goal:
Test by Attack Technique (18 techniques) — Apply known payload construction methods:
Apply Evasion Layers (20 evasions) — When techniques are blocked, retry with obfuscation:
Execute Test Matrix — Combine intents × techniques × evasions. Prioritize: high-impact intents first, indirect surfaces second, evasion sweeps against defenses that blocked direct attempts.
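One way to read the prioritization above is as a sort key over the combined cases. The sketch below is an assumption-laden illustration: the intent, technique, and evasion labels are placeholders rather than entries from the Arcanum PI Taxonomy, the impact scores are invented, and `build_matrix` is a hypothetical helper. It generates evasion variants only for combinations that a defense already blocked without evasion.

```python
from itertools import product

# Placeholder labels; the real entries come from the Arcanum PI Taxonomy.
INTENTS = {"intent-A": 3, "intent-B": 1}  # label -> assumed impact score
TECHNIQUES = ["technique-1", "technique-2"]
EVASIONS = ["evasion-x"]
SURFACES = {"chat": "direct", "RAG docs": "indirect"}


def build_matrix(blocked=frozenset()):
    """Return (intent, technique, evasion, surface) cases in priority order.

    `blocked` holds (intent, technique, surface) triples that a defense
    stopped without evasion; evasion variants are added only for those.
    """
    cases = []
    for intent, technique, surface in product(INTENTS, TECHNIQUES, SURFACES):
        cases.append((intent, technique, None, surface))
        if (intent, technique, surface) in blocked:
            cases.extend((intent, technique, ev, surface) for ev in EVASIONS)

    def priority(case):
        intent, _technique, evasion, surface = case
        return (
            -INTENTS[intent],                 # high-impact intents first
            SURFACES[surface] != "indirect",  # indirect surfaces before direct
            evasion is not None,              # evasion sweeps after plain attempts
        )

    # sorted() is stable, so ties keep their generation order.
    return sorted(cases, key=priority)
```

A first pass would call `build_matrix()` with no arguments, record which triples were blocked, and then call it again with that set to schedule the evasion sweep.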
Assess Results — For each successful injection, document: severity, attack path (intent + technique + evasion + surface), exact payload, detection gap, and remediation.
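The fields listed above map naturally onto a record type, which makes it easy to spot incomplete findings before they go into a report. This is a minimal sketch; the `Finding` class and its method names are hypothetical and are not taken from `templates/finding.md`, whose actual layout is not shown here.

```python
from dataclasses import dataclass, fields


@dataclass
class Finding:
    severity: str
    intent: str
    technique: str
    evasion: str
    surface: str
    payload: str        # the exact payload as sent
    detection_gap: str  # why existing controls missed it
    remediation: str

    def attack_path(self):
        """Join intent, technique, evasion, and surface into one label."""
        return " + ".join([self.intent, self.technique, self.evasion, self.surface])

    def missing_fields(self):
        """Return the names of fields that are still empty."""
        return [f.name for f in fields(self) if not getattr(self, f.name).strip()]


# Placeholder values only; none of these come from a real assessment.
example = Finding(
    severity="high",
    intent="intent-A",
    technique="technique-1",
    evasion="evasion-x",
    surface="RAG docs",
    payload="<exact payload as sent>",
    detection_gap="<what the defense failed to catch>",
    remediation="",
)
```

Here `example.missing_fields()` reports `remediation` as the only field left to fill in.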
Defense Validation — Check the 5-layer defense checklist: ecosystem hardening, model guardrails, prompt-layer defenses, data-layer controls, application-layer validation.
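The five layers named above can be checked mechanically once each has been marked as validated or not. The sketch below assumes a plain dict of layer name to boolean as input; the `coverage_gaps` function is a hypothetical helper, and what counts as "validated" for a layer is left to the assessor.

```python
# Layer names are copied from the checklist above.
DEFENSE_LAYERS = (
    "ecosystem hardening",
    "model guardrails",
    "prompt-layer defenses",
    "data-layer controls",
    "application-layer validation",
)


def coverage_gaps(validated):
    """Return the layers with no validated control, in checklist order.

    `validated` maps a layer name to True or False; layers that are
    absent from the mapping count as gaps.
    """
    unknown = set(validated) - set(DEFENSE_LAYERS)
    if unknown:
        raise ValueError(f"unknown layers: {sorted(unknown)}")
    return [layer for layer in DEFENSE_LAYERS if not validated.get(layer, False)]
```

Treating a missing entry as a gap is a deliberate choice in this sketch: a layer nobody looked at should show up in the report rather than pass silently.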
The assessment produces a test results summary table (intent / technique / evasion / surface / result / severity), detailed findings using `templates/finding.md`, a defense coverage checklist with gaps highlighted, and prioritized recommendations.
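The summary table can be generated from the recorded test cases. The following sketch renders a pipe-delimited table with the six columns listed above; the `render_summary` function and the row values are hypothetical, and the pipe-table layout is an assumption about the report format.

```python
# Column names follow the summary table described above.
COLUMNS = ("intent", "technique", "evasion", "surface", "result", "severity")


def render_summary(rows):
    """Render a list of dicts as a pipe-delimited table, one row per test case."""
    header = "| " + " | ".join(COLUMNS) + " |"
    divider = "|" + "|".join(" --- " for _ in COLUMNS) + "|"
    body = [
        "| " + " | ".join(str(row[col]) for col in COLUMNS) + " |"
        for row in rows
    ]
    return "\n".join([header, divider, *body])


# Placeholder row; the values are illustrative, not real results.
rows = [
    {
        "intent": "intent-A",
        "technique": "technique-1",
        "evasion": "none",
        "surface": "chat",
        "result": "blocked",
        "severity": "n/a",
    }
]
```

Printing `render_summary(rows)` yields a header line, a divider line, and one line per test case.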