npx claudepluginhub transilienceai/communitytools
This skill uses the workspace's default tool permissions.
Test LLM applications for OWASP LLM Top 10 vulnerabilities using 10 specialized agents. Use for authorized AI security assessments.
reference/llm01-prompt-injection.md
reference/llm02-insecure-output.md
reference/llm03-training-poisoning.md
reference/llm04-resource-exhaustion.md
reference/llm05-supply-chain.md
reference/llm06-excessive-agency.md
reference/llm07-model-extraction.md
reference/llm08-vector-poisoning.md
reference/llm09-overreliance.md
reference/llm10-logging-bypass.md
1. Specify target (LLM app URL, API endpoint, or local model)
2. Select scope: Full OWASP Top 10 | Specific vulnerability | Supply chain
3. Agents deploy, test, capture evidence
4. Professional report with PoCs generated
Each agent targets one OWASP LLM vulnerability, LLM01 through LLM10.
See reference/llm0X-*.md for attack playbooks.
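The scope choice and the one-agent-per-vulnerability layout can be sketched as a lookup table. This is a minimal illustration, not the skill's actual implementation: the playbook file names come from the reference/ listing, while `select_agents`, `playbook_path`, the scope strings, and the mapping of the supply chain scope to LLM05 alone are assumptions.

```python
from pathlib import PurePosixPath

# Playbook file names as listed under reference/ for this skill.
PLAYBOOKS = {
    "LLM01": "llm01-prompt-injection.md",
    "LLM02": "llm02-insecure-output.md",
    "LLM03": "llm03-training-poisoning.md",
    "LLM04": "llm04-resource-exhaustion.md",
    "LLM05": "llm05-supply-chain.md",
    "LLM06": "llm06-excessive-agency.md",
    "LLM07": "llm07-model-extraction.md",
    "LLM08": "llm08-vector-poisoning.md",
    "LLM09": "llm09-overreliance.md",
    "LLM10": "llm10-logging-bypass.md",
}


def select_agents(scope: str) -> list[str]:
    """Return the agent IDs to deploy for a scope (hypothetical helper)."""
    key = scope.strip().upper()
    if key == "FULL":
        return sorted(PLAYBOOKS)  # zero-padded IDs sort in numeric order
    if key == "SUPPLY-CHAIN":
        # Assumption: the supply chain scope maps to the LLM05 agent only.
        return ["LLM05"]
    if key in PLAYBOOKS:
        return [key]
    raise ValueError(f"unknown scope: {scope!r}")


def playbook_path(agent_id: str) -> str:
    """Return the relative path of the playbook for one agent."""
    return str(PurePosixPath("reference") / PLAYBOOKS[agent_id])
```

For example, `select_agents("llm07")` yields a single-agent run, and `playbook_path("LLM07")` points at the matching playbook file.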
Full Assessment (4-8 hours):
- [ ] Reconnaissance
- [ ] Deploy all 10 agents
- [ ] Execute exploits
- [ ] Capture evidence
- [ ] Generate report
Focused Testing (1-3 hours):
- [ ] Select vulnerability (LLM01-10)
- [ ] Deploy agent
- [ ] Execute techniques
- [ ] Document findings
Supply Chain Audit (2-4 hours):
- [ ] Inventory dependencies
- [ ] Scan CVEs
- [ ] Test plugins/APIs
- [ ] Verify model provenance
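The first checklist item, inventorying dependencies, might look like the following for a Python-based target. This is a sketch under the assumption that the application runs in the same Python environment as the script; `inventory_dependencies` is a hypothetical helper name, and CVE matching is left to whichever scanner the assessment uses.

```python
from importlib.metadata import distributions


def inventory_dependencies() -> dict[str, str]:
    """Map each installed distribution name (lowercased) to its version."""
    inventory = {}
    for dist in distributions():
        name = dist.metadata.get("Name")
        if not name:
            continue  # skip distributions with missing or broken metadata
        version = dist.metadata.get("Version") or "unknown"
        inventory[name.lower()] = version
    # Sort by name so the inventory is stable between runs and easy to diff.
    return dict(sorted(inventory.items()))


if __name__ == "__main__":
    for name, version in inventory_dependencies().items():
        print(f"{name}=={version}")
```

The sorted name and version pairs can then be passed to a CVE scanner or diffed against a previous audit.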
Enhances /pentest with AI-specific testing:
Prompt Injection: Instruction override, system prompt extraction, filter evasion
Model Extraction: Query sampling, token analysis, membership inference
Data Poisoning: Behavioral anomalies, backdoor triggers, bias analysis
DoS: Token flooding, recursive expansion, context exhaustion
Supply Chain: CVE scanning, plugin audit, model verification
MCP Tool Abuse: MCP server inspectors and debuggers often expose /api/mcp/connect or similar endpoints that accept a serverConfig with arbitrary command parameters, which amounts to unauthenticated RCE. Check for MCP Inspector, MCP Playground, or any MCP debugging UI on non-standard ports (6274, 3000, etc.).
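As a first reconnaissance step for the MCP check, an authorized tester could confirm which of the candidate ports accept connections at all. This is a minimal sketch: `open_ports` is a hypothetical helper, the port list is just the two examples named in the text, and an open port only shows that something is listening, not that it is an MCP debugging UI.

```python
import socket

# Ports named in the text; extend this for the target that is in scope.
CANDIDATE_PORTS = (6274, 3000)


def open_ports(host: str, ports=CANDIDATE_PORTS, timeout: float = 1.0) -> list[int]:
    """Return the ports on host that accept a TCP connection."""
    found = []
    for port in ports:
        try:
            # A completed TCP handshake is enough; no data is sent.
            with socket.create_connection((host, port), timeout=timeout):
                found.append(port)
        except OSError:
            pass  # closed, filtered, or unreachable
    return found


if __name__ == "__main__":
    # Only run against hosts covered by the assessment's authorization.
    print(open_ports("127.0.0.1"))
```

Any port that responds still needs manual inspection to determine what service is behind it.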
All agents collect: screenshots, network logs, API responses, errors, console output, execution metrics.
Automated reports include: executive summary, detailed findings (CVSS scores), PoC scripts, evidence, remediation guidance.
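To show how CVSS scores could feed an executive summary, the sketch below maps base scores to the CVSS v3.x qualitative ratings and counts findings per severity. The `Finding` fields and the `severity_counts` helper are hypothetical; the skill's real report schema is not documented here.

```python
from collections import Counter
from dataclasses import dataclass


def cvss_severity(score: float) -> str:
    """Map a CVSS v3.x base score to its qualitative severity rating."""
    if not 0.0 <= score <= 10.0:
        raise ValueError(f"CVSS score out of range: {score}")
    if score == 0.0:
        return "None"
    if score < 4.0:
        return "Low"
    if score < 7.0:
        return "Medium"
    if score < 9.0:
        return "High"
    return "Critical"


@dataclass
class Finding:
    # Hypothetical fields chosen for illustration only.
    vuln_id: str  # e.g. "LLM01"
    title: str
    cvss: float


def severity_counts(findings: list[Finding]) -> dict[str, int]:
    """Count findings per severity rating for an executive summary."""
    return dict(Counter(cvss_severity(f.cvss) for f in findings))
```

A report generator could print these counts at the top and list the individual findings, with their PoCs and evidence, underneath.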
/pentest skill for comprehensive security testing
/AGENTS.md
reference/llm0X-*.md