Help us improve
Share bugs, ideas, or general feedback.
From hyrex-aidefence
Scan inputs for prompt injection, unsafe content, and adversarial attacks using AIDefence
npx claudepluginhub akhilyad/deployy --plugin hyrex-aidefenceHow this skill is triggered — by the user, by Claude, or both
Slash command
/hyrex-aidefence:safety-scan <input-text><input-text>This skill is limited to the following tools:
The summary Claude sees in its skill listing — used to decide when to auto-load this skill
Scan content for prompt injection, jailbreak attempts, and unsafe patterns.
Measures whether skills, rules, and agent definitions are actually followed by auto-generating test scenarios at 3 strictness levels and reporting compliance rates with full tool call timelines.
Share bugs, ideas, or general feedback.
Scan content for prompt injection, jailbreak attempts, and unsafe patterns.
Before processing untrusted input (user submissions, API payloads, webhook data), scan it to detect prompt injection, adversarial content, or policy violations.
mcp__hyrex__aidefence_is_safe with the input text for a boolean safe/unsafe resultmcp__hyrex__aidefence_analyze for detailed threat classification and confidence scoresmcp__hyrex__aidefence_scan for comprehensive multi-layer scanningmcp__hyrex__aidefence_learn with confirmed threats to improve detectionmcp__hyrex__aidefence_stats for detection rates and false positive metrics