From secure-sdlc-agents
Analyzes security risks in AI/LLM features including prompt injection, excessive agency, RAG systems, agents, and output handling per OWASP Top 10 for LLMs.
`npx claudepluginhub kaademos/secure-sdlc-agents --plugin secure-sdlc-agents`

This skill uses the workspace's default tool permissions.
This skill applies structured security analysis to AI and LLM-powered features. The threat categories here — prompt injection, excessive agency, output misuse, supply chain — did not exist before 2023 and are still misunderstood by most developers shipping AI features today.
Working assumption: every model is a trust boundary, not a trusted component. Model outputs must be treated as untrusted user input to every downstream system.
Reference framework: OWASP Top 10 for LLMs 2025 (LLM01–LLM10).
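The "untrusted input" assumption can be sketched as a hard validation gate between the model and everything downstream. The schema and allowed actions below are hypothetical, for illustration only:

```python
import json

# Sketch: treat the model's reply as untrusted input and validate it
# against a strict schema before any downstream system acts on it.
ALLOWED_ACTIONS = {"summarize", "translate", "classify"}

def parse_model_output(raw: str) -> dict:
    """Parse and validate a model reply; reject anything unexpected."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as exc:
        raise ValueError("model output is not valid JSON") from exc
    if not isinstance(data, dict):
        raise ValueError("model output must be a JSON object")
    action = data.get("action")
    if action not in ALLOWED_ACTIONS:
        raise ValueError(f"disallowed action: {action!r}")
    return data

safe = parse_model_output('{"action": "classify", "label": "spam"}')
print(safe["action"])  # classify
```

Anything that fails validation is rejected outright rather than "cleaned up" — a fail-closed default is the point of the trust boundary.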
Before hunting for vulnerabilities, enumerate the attack surface:
| Question | Why it matters |
|---|---|
| Who sends input to the model? | Determines direct injection risk |
| What external sources feed the prompt context? | Determines indirect injection risk |
| What tools / functions can the model invoke? | Determines excessive agency blast radius |
| What happens to the model's output? | Determines output handling risk |
| Is user PII sent to a third-party API? | Determines data leakage and legal risk |
| Where does the model or its weights come from? | Determines supply chain risk |
Input trust classification:
| Input Source | Trust Level | Injection Risk |
|---|---|---|
| Authenticated user (UI) | LOW | Direct prompt injection |
| Public / unauthenticated user | UNTRUSTED | Direct + jailbreak attempts |
| Retrieved document (RAG) | UNTRUSTED | Indirect prompt injection |
| Tool / function call result | MEDIUM | Injection via external API response |
| Database query result | MEDIUM | Injection via poisoned records |
| Web scraping / search | UNTRUSTED | Indirect injection |
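One way to make the trust levels in this table operational is to delimit every context segment by source, so the system prompt can instruct the model that delimited content is data, never instructions. The source names and tag format below are illustrative assumptions:

```python
from enum import Enum

class Trust(Enum):
    LOW = "low"
    MEDIUM = "medium"
    UNTRUSTED = "untrusted"

# Hypothetical trust map mirroring the classification table.
SOURCE_TRUST = {
    "authenticated_user": Trust.LOW,
    "public_user": Trust.UNTRUSTED,
    "rag_document": Trust.UNTRUSTED,
    "tool_result": Trust.MEDIUM,
    "db_record": Trust.MEDIUM,
    "web_search": Trust.UNTRUSTED,
}

def wrap_context(source: str, text: str) -> str:
    """Delimit a context segment with its source and trust level.
    Unknown sources default to UNTRUSTED (fail closed)."""
    trust = SOURCE_TRUST.get(source, Trust.UNTRUSTED)
    return f"<context source={source} trust={trust.value}>\n{text}\n</context>"

print(wrap_context("rag_document", "Q3 revenue grew 12%."))
```

Delimiting is a mitigation, not a guarantee — a determined indirect injection can still try to break out of the tags, so it must be combined with output validation and tool gating.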
Mitigations to verify:
Excessive agency is the most dangerous risk for agentic systems. A model tricked via prompt injection into misusing its tool access can exfiltrate data, delete records, or send external requests — all without the user's knowledge.
Review checklist:
Key principle: model outputs are untrusted input. Validate before acting. Require explicit human confirmation for destructive or high-value operations.
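The confirmation requirement can be enforced at the tool-dispatch layer rather than in the prompt. The tool names, registry shape, and `confirm` callback below are hypothetical, not from any particular agent framework:

```python
# Sketch: gate destructive tools behind an explicit human confirmation
# callback, so a prompt-injected model cannot invoke them unilaterally.
DESTRUCTIVE_TOOLS = {"delete_record", "send_email", "transfer_funds"}

def execute_tool(name, args, confirm, registry):
    """Dispatch a model-requested tool call, failing closed."""
    if name not in registry:
        raise PermissionError(f"unknown tool: {name}")
    if name in DESTRUCTIVE_TOOLS and not confirm(name, args):
        raise PermissionError(f"user declined destructive tool: {name}")
    return registry[name](**args)

registry = {"delete_record": lambda record_id: f"deleted {record_id}"}

# With no human present, the confirm callback declines and the call fails:
try:
    execute_tool("delete_record", {"record_id": 7}, lambda n, a: False, registry)
except PermissionError as e:
    print(e)
```

Enforcing this outside the prompt matters: instructions inside the prompt can be overridden by injection, but a dispatch-layer check cannot.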
| Model output used as… | Risk | Required mitigation |
|---|---|---|
| Rendered in HTML / DOM | Stored XSS | DOMPurify, output encoding |
| Executed as code | Remote code execution | Never execute model output directly |
| Inserted into SQL queries | SQL injection | Parameterise all queries; validate schema |
| Used in HTTP requests | SSRF | Validate and allowlist URLs from model output |
| Passed to shell commands | Command injection | Never pass model output to shell |
| Used as a file path | Path traversal | Validate against allowlist of permitted paths |
| Used for access control decisions | Privilege escalation | Never use model output for authorisation alone |
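Two rows of this table — SQL and SSRF — can be sketched concretely. The table name, allowlist, and helper names below are illustrative assumptions:

```python
import sqlite3
from urllib.parse import urlparse

ALLOWED_HOSTS = {"api.internal.example"}  # hypothetical SSRF allowlist

def fetch_user(conn, model_supplied_id: str):
    # Parameterised query: the model-supplied value is bound as data,
    # never interpolated into the SQL string.
    return conn.execute(
        "SELECT name FROM users WHERE id = ?", (model_supplied_id,)
    ).fetchone()

def check_url(model_supplied_url: str) -> str:
    # SSRF guard: only https to explicitly allowlisted hosts.
    parts = urlparse(model_supplied_url)
    if parts.scheme != "https" or parts.hostname not in ALLOWED_HOSTS:
        raise ValueError(f"blocked URL: {model_supplied_url}")
    return model_supplied_url

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id TEXT, name TEXT)")
conn.execute("INSERT INTO users VALUES ('1', 'Ada')")
print(fetch_user(conn, "1' OR '1'='1"))  # None: the payload is treated as a literal id
```

The same pattern generalises: the model output is always a *value* passed to a safe API, never a fragment spliced into a query, command, or URL.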
Supply chain:
Data leakage:
Report template:

## AI Security Review: [Feature Name]
### Attack Surface Summary
[Inputs, model access, tools available, output usage]
### Threat Findings
| ID | OWASP LLM Category | Severity | Description | Mitigation |
|----|--------------------|----------|-------------|------------|
| AI-001 | LLM01: Prompt Injection | HIGH | [Description] | [Concrete fix] |
### Mitigations Required Before Release
[Priority list with owners and references]
### Accepted Risks
[Any risks accepted with justification and approver]
| Excuse | Counter |
|---|---|
| "The model won't do harmful things — it's aligned" | Alignment is not a security boundary. Prompt injection bypasses alignment systematically. |
| "Our users are trusted — no injection risk" | Indirect injection comes from retrieved documents, not users. Malicious content in your RAG source is an injection vector. |
| "We validate the model output in the UI" | XSS prevention in the UI is correct but insufficient. Validate at every trust boundary, not just display. |
| "It's a read-only agent — no write tools" | Is it truly read-only? Check every tool definition. HTTP GET requests can trigger side effects in external systems. |
| "We use a well-known model — supply chain is fine" | Supply chain risk includes fine-tunes, LoRA adapters, embedding models, and model API intermediaries — not just the base model. |
| "We'll add rate limiting later" | LLM cost exhaustion attacks (LLM10) are cheaper than traditional DoS. Rate limit before you ship. |
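On the last point, a per-caller limiter for LLM endpoints is small enough to ship on day one. This is a minimal token-bucket sketch; capacity and refill rate are illustrative, and in production the cost argument would be the request's token count rather than 1:

```python
import time

class TokenBucket:
    """Minimal token-bucket limiter. `cost` is per-request here, but
    passing the request's LLM token count works the same way."""
    def __init__(self, capacity: float, refill_per_sec: float):
        self.capacity = capacity
        self.refill = refill_per_sec
        self.tokens = capacity
        self.last = time.monotonic()

    def allow(self, cost: float = 1.0) -> bool:
        now = time.monotonic()
        # Refill proportionally to elapsed time, capped at capacity.
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.refill)
        self.last = now
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False

bucket = TokenBucket(capacity=3, refill_per_sec=0.5)
print([bucket.allow() for _ in range(5)])  # [True, True, True, False, False]
```

In practice you would keep one bucket per API key or user and reject (or queue) requests when `allow()` returns False.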
Do not close this review until:
- Findings are recorded in the threat model (docs/threat-model.md)
- GDPR/compliance implications are handed off to grc-analyst