Use when creating new superhackers security skills, editing existing security skills, reviewing skill quality before deployment, or when a gap is discovered during a security engagement and needs to be captured as a reusable skill.
```
npx claudepluginhub narlyseorg/superhackers --plugin superhackers
```

This skill uses the workspace's default tool permissions.
This is a meta-skill for creating new skills. It requires no external security tools — only file system access to create and edit skill files.
| Tool | Required | Purpose | Fallback |
|---|---|---|---|
| File editor | ✅ Yes | Creating and editing SKILL.md files | Manual editing |
| ripgrep (rg) | ⚡ Optional | Checking existing skill patterns for consistency | grep → find |
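When `rg` is unavailable, the fallback chain in the table is straightforward to script. A minimal sketch (the `skills/*/SKILL.md` layout is the one described later in this skill):

```bash
# Use ripgrep if present; otherwise fall back to grep (per the table above)
if command -v rg >/dev/null 2>&1; then
  rg -l "^name:" skills/*/SKILL.md
else
  grep -rl --include="SKILL.md" "^name:" skills/
fi
```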
When creating or editing skill files, follow this protocol:

**Verify file operations succeed:**
```bash
# Create new skill directory with validation
SKILL_DIR="skills/my-new-skill"
SKILL_FILE="$SKILL_DIR/SKILL.md"

if ! mkdir -p "$SKILL_DIR" 2>/dev/null; then
  echo "TOOL_FAILURE: Cannot create directory: $SKILL_DIR"
  exit 1
fi

# Write skill file
cat > "$SKILL_FILE" << 'EOF'
---
name: my-new-skill
description: "Use when..."
---

# Skill content here
EOF

if [ $? -ne 0 ]; then
  echo "TOOL_FAILURE: Cannot write to file: $SKILL_FILE"
  exit 1
fi

echo "SUCCESS: Skill file created"
```
**Pattern validation with error handling:**
```bash
# Check for conflicting skill names (anchor the regex to avoid prefix matches)
NEW_SKILL_NAME="my-new-skill"
EXISTING_SKILL=$(rg -l "^name: ${NEW_SKILL_NAME}$" skills/*/SKILL.md 2>/dev/null)
if [ -n "$EXISTING_SKILL" ]; then
  echo "WARNING: Skill name '$NEW_SKILL_NAME' already exists"
  echo "Existing: $EXISTING_SKILL"
  echo "Use a different name or update the existing skill"
fi

# Check for leftover work markers
if rg -q "TODO|FIXME|XXX" "$SKILL_FILE" 2>/dev/null; then
  echo "WARNING: Found TODO/FIXME markers in skill file"
  echo "Address these before finalizing"
fi
```
Writing security skills IS scenario-driven testing applied to security documentation.
You write attack scenarios (pressure tests with subagents), watch them fail (baseline behavior without the skill), write the skill (operational methodology), watch scenarios pass (agents follow the skill), and refactor (close gaps where agents deviate or miss steps).
Core principle: If you didn't watch an agent fail without the skill, you don't know if the skill teaches the right thing.
Every security skill must answer: "Given this target/scenario, what exact commands do I run, in what order, and what do I do with the results?"
A security skill is an operational reference for proven attack techniques, assessment methodologies, security tools, or engagement workflows. Skills help future agents find and apply effective security approaches without re-deriving them.
Security skills are: Reusable techniques, attack methodologies, tool references, engagement workflows
Security skills are NOT: engagement-specific write-ups, duplicated tool manuals, or raw payload dumps with no decision logic.
| Security Scenario Testing | Skill Creation |
|---|---|
| Attack scenario | Pressure test with subagent |
| Operational guide | Skill document (SKILL.md) |
| Scenario fails (RED) | Agent misses attack vectors without skill (baseline) |
| Scenario passes (GREEN) | Agent executes methodology correctly with skill |
| Refactor | Close gaps while maintaining coverage |
| Write scenario first | Run baseline scenario BEFORE writing skill |
| Watch it fail | Document exact attack vectors agent misses |
| Minimal skill | Write skill addressing those specific gaps |
| Watch it pass | Verify agent now covers all vectors |
| Refactor cycle | Find new gaps → plug → re-verify |
The entire skill creation process follows RED-GREEN-REFACTOR adapted for security operations.
Create when: a technique, methodology, tool pattern, or workflow proved effective, will recur across engagements, and would otherwise be re-derived by future agents.

Don't create for: one-off engagement specifics, content a tool manual already covers, or payload lists with no methodology attached.
**Technique:** Concrete attack method with exact commands and decision points. Examples: SQL injection via error-based extraction, JWT algorithm confusion attacks, SSRF cloud metadata harvesting.

**Methodology:** Structured assessment approach for a target type or security domain. Examples: webapp-pentesting (OWASP Top 10 assessment), api-pentesting (API security testing), android-pentesting (mobile app analysis).

**Reference:** Tool documentation, payload collections, protocol specifications. Examples: msfvenom payload quick reference, hash identification guide, HTTP security headers reference.

**Workflow:** End-to-end engagement process spanning multiple phases. Examples: using-superhackers (skill router), security-assessment (engagement orchestrator), writing-security-reports (deliverable production).
```
skills/
  skill-name/
    SKILL.md            # Main reference (required)
    supporting-file.*   # Only if needed
```
Flat namespace — all skills in one searchable namespace.
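Because the namespace is flat, one search enumerates every skill. A quick sketch, assuming the layout above:

```bash
# List all skill names across the flat namespace
rg --no-filename '^name:' skills/*/SKILL.md | sort
```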
Separate files for: payload collections longer than ~50 lines, reusable nuclei templates, wordlists, and other supporting files that are tools in their own right.

Keep inline: methodology, command patterns, decision points, and short payload lists that fit comfortably in one SKILL.md.
Frontmatter (YAML):
- Required fields: `name` and `description`
- `name`: use letters, numbers, and hyphens only (no parentheses, special chars)
- `description`: third-person, describes ONLY when to use (NOT what it does)
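A quick sanity check for the `name` rule (a minimal sketch; the regex simply mirrors the letters/numbers/hyphens constraint above):

```bash
# Validate a skill name: letters, numbers, and hyphens only
NAME="my-new-skill"
if [[ "$NAME" =~ ^[A-Za-z0-9-]+$ ]]; then
  echo "OK: $NAME"
else
  echo "INVALID: $NAME (letters, numbers, and hyphens only)"
fi
```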
Template:

```markdown
---
name: Skill-Name-With-Hyphens
description: "Use when [specific triggering conditions, target types, attack scenarios]"
---

# Skill Name

## Overview
What is this? Core operational principle in 1-2 sentences.
Authorization assumption statement.

## When to Use
[Small inline flowchart IF routing decision non-obvious]
Bullet list with TARGET TYPES, ATTACK SCENARIOS, and use cases
When NOT to use

## Core Pattern
Attack phase sequence or methodology flow

## Quick Reference
Table with tools, commands, and phases for quick scanning

## Implementation
Detailed methodology with exact commands per phase
Organized by attack category or assessment phase

## Common Mistakes
What goes wrong in real engagements + fixes
```
Critical for discovery: Future agents need to FIND your skill when facing a security task.
Purpose: Agents read the description to decide which skills to load. Make it answer: "Should I load this skill right now?"
CRITICAL: Description = When to Use, NOT What the Skill Does
The description should ONLY describe triggering conditions. Do NOT summarize the skill's methodology or attack sequence.
Why this matters: When a description summarizes methodology, agents may follow the description shortcut instead of reading the full skill. A description saying "tests OWASP Top 10 with nuclei scanning then manual BurpSuite verification" causes agents to run nuclei + BurpSuite and skip the skill's full phase-by-phase methodology.
The trap: Descriptions that summarize attack methodology create a shortcut agents will take. The skill body becomes documentation agents skip.
```yaml
# ❌ BAD: Summarizes methodology - agents may follow this instead
description: "Use when pentesting webapps - runs nmap, then nuclei scan, then manual BurpSuite testing of OWASP Top 10"

# ❌ BAD: Too abstract
description: "For security testing"

# ✅ GOOD: Just triggering conditions, no methodology summary
description: "Use when testing web applications for security vulnerabilities, performing webapp penetration tests, assessing OWASP Top 10 risks, testing for XSS/SQLi/CSRF/SSRF"

# ✅ GOOD: Target types and attack scenarios only
description: "Use when needing to exploit a confirmed vulnerability, generate payloads, craft reverse shells, use Metasploit modules, write custom exploit scripts"
```
Content guidelines: use words agents would search for, such as tool names, target types, vulnerability classes (XSS, SQLi, SSRF, CSRF), and attack scenarios.
Use attack-action verbs or target descriptors:

- `webapp-pentesting` not `web-security`
- `exploit-development` not `exploitation-techniques`
- `recon-and-enumeration` not `information-gathering`

Gerunds (-ing) work well for ongoing processes:
`writing-security-reports`, `webapp-pentesting`

**Problem:** Router skills and frequently-loaded skills consume context in every engagement. Every token counts.
Target word counts: keep frequently-loaded skills lean; every line must earn its context cost.

Techniques:
Reference tool documentation, don't duplicate it:
```bash
# ❌ BAD: Document all sqlmap flags in SKILL.md
#   sqlmap supports --batch, --dbs, --tables, --dump, --level, --risk, --tamper...

# ✅ GOOD: Show the command patterns agents actually need
sqlmap -u "URL" --batch --level 3 --risk 2 -o   # Standard scan
sqlmap -u "URL" --batch --dbs                   # Enumerate databases
```
Use cross-references instead of repeating content:
```markdown
# ❌ BAD: Repeat methodology from another skill
When exploiting confirmed findings, set up Metasploit handler...
[30 lines of repeated exploit-development content]

# ✅ GOOD: Reference the other skill
**REQUIRED SUB-SKILL:** Use superhackers:exploit-development for exploitation phase.
```
Compress examples — show the command, not the narrative:
```bash
# ❌ BAD: Verbose narrative
#   First, we need to scan the target for open ports using nmap.
#   We'll use the -sV flag for version detection and -sC for default scripts.

# ✅ GOOD: Just the command with an inline comment
nmap -sV -sC -p- -oN scan.txt TARGET   # Full port scan + version detection
```
Eliminate redundancy: say each thing once, and cross-reference instead of restating.

Use skill name only, with explicit requirement markers:

- ✅ `**REQUIRED SUB-SKILL:** Use superhackers:exploit-development`
- ✅ `**REQUIRED SUB-SKILL:** Use superhackers:vulnerability-verification for confirming findings.`
- ❌ `See skills/exploit-development/SKILL.md` (unclear if required)
- ❌ `@skills/exploit-development/SKILL.md` (force-loads, burns context)

**Why no @ links:** @ syntax force-loads files immediately, consuming 200k+ context before you need them.
```dot
digraph when_flowchart {
    "Need to show information?" [shape=diamond];
    "Decision where agent might choose wrong?" [shape=diamond];
    "Use markdown" [shape=box];
    "Small inline flowchart" [shape=box];

    "Need to show information?" -> "Decision where agent might choose wrong?" [label="yes"];
    "Decision where agent might choose wrong?" -> "Small inline flowchart" [label="yes"];
    "Decision where agent might choose wrong?" -> "Use markdown" [label="no"];
}
```
Use flowcharts ONLY for: decision points where an agent might plausibly choose the wrong branch.

Never use flowcharts for: linear command sequences or reference material; a list or table costs less context and reads faster.
Graphviz conventions: Use dot format. Decision nodes: [shape=diamond]. Action nodes: [shape=box]. Semantic labels on edges (never "step1", "step2").
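To preview a flowchart while drafting, Graphviz's `dot` CLI renders it directly (assuming Graphviz is installed; the filename is illustrative):

```bash
# Render the flowchart to PNG for review (requires graphviz)
dot -Tpng when_flowchart.dot -o when_flowchart.png
```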
The Triad: Vulnerable Code → Exploit → Fix
Every security skill with code examples MUST show three components:
### Example: SQL Injection in Login Form

**❌ VULNERABLE:**

```python
# Direct string concatenation — SQL injection
query = f"SELECT * FROM users WHERE username='{username}' AND password='{password}'"
cursor.execute(query)
```

**💀 EXPLOIT:**

```bash
# Authentication bypass via SQL injection
sqlmap -u "https://target.com/login" --data="username=admin&password=test" --batch --level 3
# Manual: username = admin' OR '1'='1' --
```

**✅ FIX:**

```python
# Parameterized query — prevents SQL injection
query = "SELECT * FROM users WHERE username=%s AND password=%s"
cursor.execute(query, (username, password))
```
**Why the triad matters:**
- Vulnerable code shows WHAT to look for during assessment
- Exploit shows HOW to confirm the vulnerability
- Fix shows WHAT to recommend in the report
**Guidelines:**
1. **One excellent triad beats many mediocre ones** — pick the most representative attack
2. **Use the language matching the target** — PHP for web apps, Python for scripts, bash for tools
3. **Exploits must be realistic** — real payloads, real tool commands
4. **Fixes must be production-ready** — parameterized queries, not "sanitize input"
5. **Never show destructive exploits without context** — always include scope/authorization reminders
## Tool Integration Documentation
When documenting security tools in a skill:
````markdown
### Tool: sqlmap

**Purpose:** Automated SQL injection detection and exploitation

**Common Invocations:**

```bash
sqlmap -u "URL?param=value" --batch --level 3 --risk 2 -o      # Detection
sqlmap -u "URL?param=value" --batch --dbs                      # Enumerate DBs
sqlmap -u "URL?param=value" --tamper=space2comment --batch     # WAF bypass
```

**Integration:** Proxy via BurpSuite (`--proxy`), use saved requests (`-r file.txt`), combine with ffuf/recon output.
````
**Rules:**
- Show 2-3 most common invocations, not every flag
- Always include `--batch` or equivalent for non-interactive usage
- Reference `--help` for comprehensive flag documentation
- Show how the tool integrates with other tools in the methodology
## CVSS-Aware Severity Sections
When a skill documents vulnerabilities, include CVSS context:
| Severity | CVSS Range | Example Findings |
|----------|-----------|------------------|
| Critical | 9.0-10.0 | RCE, auth bypass to admin, SQLi with data dump |
| High | 7.0-8.9 | Stored XSS, SSRF to internal services, privilege escalation |
| Medium | 4.0-6.9 | Reflected XSS, CSRF on non-critical functions, info disclosure |
| Low | 0.1-3.9 | Missing headers, verbose errors, clickjacking |
| Info | 0.0 | Observations, best practice recommendations |
**CVSS Vector String Format:** `CVSS:4.0/AV:N/AC:L/AT:N/PR:N/UI:N/VC:H/VI:H/VA:H/SC:H/SI:H/SA:H`
**Always include:** CVSS v4.0 score with vector string, business impact context (not just technical severity), and remediation priority based on exploitability.
**When to include severity guidance:**
- Methodology skills covering vulnerability classes (always)
- Technique skills (include for the demonstrated vulnerability)
- Reference skills (only if covering vulnerability categorization)
- Workflow skills (never — severity belongs in methodology skills)
## OWASP/CWE/CVE Cross-Referencing
**Standard references make skills authoritative and searchable.**
```markdown
**For vulnerability classes:**
- **OWASP Top 10:** A03:2021 — Injection
- **CWE:** CWE-89 (SQL Injection), CWE-79 (XSS), CWE-918 (SSRF)

**For specific vulnerabilities:**
- **CVE:** CVE-2021-44228 (Log4Shell)
- **CVSS:** 10.0 (CVSS:4.0/AV:N/AC:L/AT:N/PR:N/UI:N/VC:H/VI:H/VA:H/SC:H/SI:H/SA:H)

**For methodology alignment:**
- **OWASP Testing Guide:** WSTG-INPV-05 (SQL Injection)
- **PTES:** Vulnerability Analysis phase
```

**Rules:**
Verify scores programmatically with the Python `cvss` library (`from cvss import CVSS4`), as sketched below.
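A minimal sketch, assuming the `cvss` package (`pip install cvss`), its `CVSS4` class, and the vector format shown above:

```python
# Minimal sketch: verify a CVSS v4.0 score from its vector string
from cvss import CVSS4

vector = "CVSS:4.0/AV:N/AC:L/AT:N/PR:N/UI:N/VC:H/VI:H/VA:H/SC:H/SI:H/SA:H"
score = CVSS4(vector)
print(score.base_score, score.severity)  # expected: 10.0 Critical
```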
Self-Contained: jwt-attacks/SKILL.md — everything inline. Use when all methodology and payloads fit in one file.
With Payload Collection: xss-filter-bypass/SKILL.md + payloads.txt — separate file when payload list exceeds 50 lines.
With Heavy Reference: webapp-pentesting/SKILL.md + nuclei-templates/ + wordlists/ — separate supporting files when they're reusable tools, not just narrative.
NO SKILL WITHOUT A FAILING SCENARIO FIRST
This applies to NEW skills AND EDITS to existing skills.
Write skill before testing? Delete it. Start over. Edit skill without testing? Same violation.
No exceptions: not for "obviously correct" commands, not for "just a reference", not under time pressure.
**Technique skills.** Examples: SQL injection methodology, JWT attacks, SSRF exploitation. Test with: target application scenarios, WAF bypass variations, methodology gap detection, tool failure fallbacks. Success criteria: agent identifies and exploits the vulnerability following the skill's methodology.

**Methodology skills.** Examples: webapp-pentesting, api-pentesting, infra-pentesting. Test with: coverage scenarios (all categories tested?), phase transition tests, completeness checks, time pressure scenarios. Success criteria: agent completes the full methodology without skipping phases.

**Reference skills.** Examples: msfvenom payload reference, hash identification guide. Test with: retrieval scenarios, correct application in context, gap testing, accuracy verification. Success criteria: agent finds and correctly applies reference information.

**Workflow skills.** Examples: using-superhackers, security-assessment, writing-security-reports. Test with: routing accuracy, phase sequencing, ambiguous request handling, context carry-forward. Success criteria: agent routes to correct skills and follows engagement flow.
| Excuse | Reality |
|---|---|
| "The commands are obviously correct" | Commands with wrong flags or outdated syntax break engagements. Test them. |
| "It's just a tool reference" | References with gaps cause agents to miss attack vectors. Test retrieval. |
| "Testing security skills is dangerous" | Test against intentionally vulnerable targets (DVWA, HackTheBox). No excuse. |
| "I'll test if problems emerge" | Problems = missed vulns in real engagements. Test BEFORE deploying. |
| "Too tedious to test" | Less tedious than missing a critical finding because the skill had a gap. |
| "I'm a security expert, it's fine" | Overconfidence causes missed attack vectors. Test anyway. |
| "The OWASP guide covers this" | Your skill adds operational specifics beyond OWASP. Those need testing. |
| "No time to test" | Deploying untested skills wastes more time when agents miss findings. |
All of these mean: Test before deploying. No exceptions.
Skills that enforce engagement methodology need to resist shortcuts. Agents are efficient and will skip phases when under pressure.
Don't just state the methodology — forbid specific shortcuts:
```markdown
## ❌ BAD
Run reconnaissance before testing.

## ✅ GOOD
Run reconnaissance before testing.

**No exceptions:**
- Don't skip recon because "the user already gave me the URL"
- Don't skip recon because "I can see it's a WordPress site"
- Don't skip recon because "the user wants a quick check"
- A URL is not reconnaissance. Enumerate the full attack surface.
```
Capture rationalizations from baseline testing. Every excuse agents make goes in a table:
| Excuse | Reality |
|--------|---------|
| "User said just check for SQLi" | Check for SQLi AND everything else in scope. |
| "Scanner found nothing" | Scanners miss business logic, auth bypass, chained vulns. |
| "It's just a quick check" | Quick checks still follow methodology. Skip phases = miss findings. |
Add a red-flags section so agents can self-check mid-engagement:

```markdown
## Red Flags — STOP and Reassess
- Skipping reconnaissance
- Reporting scanner output without verification
- Not testing authenticated functionality
- Testing only one vulnerability class
- "The scanner didn't find anything, so it's secure"

**All of these mean: Go back and follow the full methodology.**
```
**RED:** Run the attack scenario with a subagent WITHOUT the skill. Document exact behavior: which attack vectors the agent misses, which phases it skips, which rationalizations it offers.

This is "watch the scenario fail": you must see what agents naturally miss before writing the skill.

**GREEN:** Write a skill that addresses those specific gaps. Don't add content for hypothetical attack scenarios not seen in baseline.

Run the same scenarios WITH the skill. The agent should now cover all vectors.

**REFACTOR:** Agent found a new way to shortcut the methodology? Add an explicit counter. Re-test until comprehensive.
"During the 2025 assessment of client X, we discovered..." Why bad: Too specific, not reusable, potentially violates confidentiality
Document every flag for nmap, sqlmap, nuclei, ffuf... Why bad: Tool manuals exist. Document the combinations agents need, not the reference.
Just a list of XSS payloads without methodology for when/how to use them. Why bad: Agents need decision logic, not just ammunition
step1 [label="scan"];
step2 [label="test"];
Why bad: Labels must have semantic meaning — "Scan with nuclei for known CVEs"
Exploit examples without scope/authorization reminders. Why bad: Security skills must always remind agents to verify authorization
After writing ANY skill, you MUST STOP and complete the validation process.
Do NOT: deploy the skill, start another skill, or mark the task complete until validation passes.
The creation checklist below is MANDATORY for EACH skill.
Deploying untested security skills = deploying untested exploit chains. Missed attack vectors in real engagements.
IMPORTANT: Use TodoWrite to create todos for EACH checklist item below.
**RED Phase — Write Failing Scenario:**
- [ ] Run the baseline scenario with a subagent WITHOUT the skill
- [ ] Document the exact attack vectors and phases the agent misses

**GREEN Phase — Write Minimal Skill:**
- [ ] Write the skill addressing only the documented gaps
- [ ] Re-run the scenario WITH the skill and verify all vectors are covered

**REFACTOR Phase — Close Gaps:**
- [ ] Hunt for new shortcuts or deviations and add explicit counters
- [ ] Re-test until coverage is comprehensive

**Quality Checks:**
- [ ] Description states WHEN to use, not what the skill does
- [ ] Cross-references use the `**REQUIRED SUB-SKILL:** Use superhackers:skill-name` format

**Deployment:**
During Engagements: Agent encounters gap → checks existing skills → completes engagement ad-hoc → captures approach as new skill AFTER engagement.
Systematic Gap Analysis: Review OWASP Testing Guide sections → map to existing skills → identify unmapped sections → prioritize by frequency × impact.
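The mapping step lends itself to a one-liner. A minimal sketch, assuming skills cite WSTG IDs in the cross-reference format above:

```bash
# Which WSTG sections do existing skills already cover?
rg -o --no-filename 'WSTG-[A-Z]+-[0-9]+' skills/*/SKILL.md | sort -u
```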
From Post-Mortems: Review completed reports → identify novel methodology → extract into reusable skill → test against similar targets.
How future agents find your skill: Encounters task → loads using-superhackers → finds YOUR SKILL via description match → scans overview → reads core pattern → follows implementation.
Optimize for this flow — put searchable vulnerability classes, tool names, and attack scenarios in description and overview.
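A quick discoverability spot check (hypothetical search terms; substitute whatever an agent facing the task would actually search for):

```bash
# Do any descriptions match the terms an agent would search for?
rg -i 'description:.*(sql injection|sqli|xss|ssrf)' skills/*/SKILL.md
```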
Creating security skills IS scenario-driven testing for operational security documentation.
Same Iron Law: No skill without failing scenario first. Same cycle: RED (baseline gaps) → GREEN (write skill) → REFACTOR (close shortcuts). Same benefits: Better coverage, fewer missed findings, comprehensive methodology.
If you test exploit chains before deploying them, test security skills before deploying them. It's the same discipline applied to documentation.