Audits Cairo/Starknet smart contracts for security vulnerabilities. Discovers in-scope files, runs preflight scans, spawns agents, and merges findings into reports with default, deep, or file-specific modes.
npx claudepluginhub keep-starknet-strange/starknet-agentic --plugin starknet-agentic-skills
You are the orchestrator of a parallelized Cairo/Starknet security audit. Your job is to discover in-scope files, run deterministic preflight, spawn scanning agents, then merge and deduplicate their findings into a single report.
import { Account, Contract, RpcProvider } from "starknet";

const provider = new RpcProvider({ nodeUrl: process.env.STARKNET_RPC! });
const account = new Account({ provider, address: process.env.ACCOUNT_ADDRESS!, signer: process.env.PRIVATE_KEY! });

try {
  // Assumption: the ABI is read from the deployed class, since the snippet never defined `abi`.
  const { abi } = await provider.getClassAt(process.env.CONTRACT_ADDRESS!);
  const contract = new Contract({ abi, address: process.env.CONTRACT_ADDRESS!, providerOrAccount: account });
  // View call for quick sanity checks while triaging findings.
  const owner = await contract.call("owner", []);
  // State-changing probe used during exploit-path validation.
  const tx = await contract.invoke("set_owner", [owner]);
  const receipt = await provider.waitForTransaction(tx.transaction_hash);
  console.log({ finality: receipt.finality_status });
} catch (err) {
  console.error("audit probe failed", err);
}
| Code | Condition | Recovery |
|---|---|---|
| CAUD-001 | In-scope file discovery produced zero files | Re-run with explicit filenames and verify exclude rules did not hide target contracts. |
| CAUD-002 | Preflight scan failed or unavailable | Run `python3 "{skill_root}/scripts/quality/audit_local_repo.py"` manually and attach output to the audit context. |
| CAUD-003 | Agent bundle generation failed | Rebuild `{workdir}/cairo-audit-agent-*-bundle.md` and confirm each bundle has non-zero line count. |
| CAUD-004 | Conflicting findings across agents | Keep the highest-confidence root cause, then request a focused re-run on the disputed file. |
| CAUD-005 | Report includes only low-confidence items | Re-run deep mode with the host-specific cairo-auditor entrypoint (for example, `/starknet-agentic-skills:cairo-auditor deep` in Claude Code) and add deterministic checks from Semgrep/audit findings. |
| CAUD-006 | Deep mode requested but specialist agents unavailable | Re-run in an environment with Agent tool support. Where fail-closed enforcement is enabled, `--allow-degraded` explicitly permits fallback. |
| CAUD-007 | Deep mode host capability preflight failed | For hosts with preflight enforcement enabled, surface remediation and stop before findings unless `--allow-degraded` is explicitly present. |
| CAUD-008 | Agent transport instability or stalled specialist completion | Retry failed/stalled specialists once. In hosts with deep-mode enforcement enabled, unresolved specialist outages are treated as fail-closed unless explicitly degraded. |
| CAUD-009 | Strict-model requirement could not be satisfied | Re-run on a host that supports required models, or omit `--strict-models` to allow documented fallback. |
Exclude pattern (applies to all modes):

- Skip exact directory names via `find ... -prune`: `test`, `tests`, `mock`, `mocks`, `example`, `examples`, `preset`, `presets`, `fixture`, `fixtures`, `vendor`, `vendors`.
- Skip files matching: `*_test.cairo`, `*Test*.cairo`.

- Default (no arguments): scan all `.cairo` files in the repo using the exclude pattern.
- `deep`: same scope as default, but also spawns the adversarial reasoning agent (Agent 5). Use for thorough reviews. Slower and more costly.
- `$filename ...`: scan the specified file(s) only.

Flags:

- `--file-output` (off by default): also write the report to a markdown file. Without this flag, output goes to the terminal only.
- `--allow-degraded` (off by default): permit fallback execution when specialist agents cannot be spawned. On hosts with deep-mode enforcement enabled, this flag opts into degraded execution.
- `--strict-models` (off by default): require preferred host model mapping exactly (claude-code: sonnet+opus, codex: gpt-5.4). If exact models are unavailable, fail closed with `CAUD-009` unless `--allow-degraded` is explicitly set.
- `--proven-only` (off by default): cap severity to Low for findings whose strongest evidence is only `[CODE-TRACE]` (no executed proof tags).

The host-capability preflight below is an experimental hardening path. Use it when your host exposes specialist-agent capability checks.
Before Turn 1 when mode is deep, run a lightweight capability preflight and emit a one-line status:
- Detect the host: `codex`, `claude-code`, or `unknown`.
- `command -v curl` must succeed, and `curl -sfI --connect-timeout 5 --max-time 10 https://starknet.io` must succeed.
- On `codex` hosts, probe preferred model availability before spawn:
  - `model: gpt-5.4`,
  - write the result to `{workdir}/cairo-audit-host-capabilities.json` when the probe is available.

If preflight fails (in hosts where preflight is enabled):

- Without `--allow-degraded`: emit `CAUD-007`, print remediation, and stop before findings.
- With `--allow-degraded`: continue in degraded-deep mode and keep explicit warning lines in scope and execution trace.

Remediation hints to print when preflight fails:

- `codex`: run `codex features enable multi_agent`, then verify with `codex features list`, then restart the session.
- `claude-code`: run `/reload-plugins`, update the installed plugin if needed, and retry deep mode.

Select specialist model labels from detected host before spawning:
- `claude-code`
  - `VECTOR_MODEL=sonnet` (host alias for `claude-sonnet-4-6`)
  - `ADVERSARIAL_MODEL=opus` (host alias for `claude-opus-4-6`)
- `codex`
  - `VECTOR_MODEL=gpt-5.4` (Codex-specific label; may change across host versions)
  - `ADVERSARIAL_MODEL=gpt-5.4`
  - If the `gpt-5.4` probe fails and `--strict-models` is not set, fall back to `gpt-5.2` for both.
- `unknown`
  - `VECTOR_MODEL=sonnet` (host alias for `claude-sonnet-4-6`)
  - `ADVERSARIAL_MODEL=opus` (host alias for `claude-opus-4-6`)

Persist the selected plan to `{workdir}/cairo-audit-model-plan.txt` and keep model labels in the execution trace as observed runtime values (not assumptions).
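The host-to-model mapping above can be read as a small lookup with one fallback branch. The following is a minimal sketch under stated assumptions, not the skill's own implementation: `ModelPlan`, `select_models`, and the `probe_ok` parameter are hypothetical names, and only the model labels and the fallback rule are taken from the text.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ModelPlan:
    """Hypothetical record mirroring the fields kept in the model plan file."""
    host: str
    vector_model: str
    adversarial_model: str
    fallback_reason: Optional[str] = None


def select_models(host: str, probe_ok: bool = True, strict_models: bool = False) -> ModelPlan:
    """Map a detected host to specialist model labels (illustrative helper)."""
    if host == "codex":
        if probe_ok:
            return ModelPlan(host, "gpt-5.4", "gpt-5.4")
        if strict_models:
            # Strict mode never falls back silently; the caller surfaces CAUD-009.
            raise RuntimeError("CAUD-009: gpt-5.4 unavailable and --strict-models is set")
        return ModelPlan(host, "gpt-5.2", "gpt-5.2", fallback_reason="gpt-5.4 probe failed")
    # claude-code and unknown hosts share the same aliases in the mapping above.
    label = host if host == "claude-code" else "unknown"
    return ModelPlan(label, "sonnet", "opus")
```

Raising on the strict path keeps the decision about `--allow-degraded` with the caller, which is one way to avoid a silent fallback.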
Strict-model gate:
- If `--strict-models` is set, do not silently fall back.
- Emit `CAUD-009` and stop before findings unless `--allow-degraded` is explicitly present.
- With `--allow-degraded`, continue and report `Execution Integrity: DEGRADED`.

Turn 1 — Discover. Print the banner, then in the same message make parallel tool calls.
First, resolve a per-run private work directory:
- If `CAIRO_AUDITOR_WORKDIR` is set, use it as `{workdir}`.
- Otherwise, create one with `mktemp -d "${TMPDIR:-/tmp}/cairo-auditor.XXXXXX"` and `chmod 700`.
- Print `WORKDIR=<absolute-path>` in Turn 1 output and reuse that exact path as `{workdir}` for all later turns.

(a) Resolve and persist in-scope `.cairo` files to `{workdir}/cairo-audit-files.txt` per mode selection:
WORKDIR="${CAIRO_AUDITOR_WORKDIR:-$(mktemp -d "${TMPDIR:-/tmp}/cairo-auditor.XXXXXX")}"
chmod 700 "$WORKDIR"
echo "WORKDIR=$WORKDIR"
find <repo-root> \
\( -type d \( -name test -o -name tests -o -name mock -o -name mocks -o -name example -o -name examples -o -name fixture -o -name fixtures -o -name vendor -o -name vendors -o -name preset -o -name presets \) -prune \) \
-o \( -type f -name "*.cairo" ! -name "*_test.cairo" ! -name "*Test*.cairo" -print \) \
| sort > "$WORKDIR/cairo-audit-files.txt"
cat "$WORKDIR/cairo-audit-files.txt"
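For readers who want to check the exclude rules without running `find`, the same discovery logic can be approximated in Python. This is a sketch only: `discover_cairo_files` and `PRUNED_DIRS` are hypothetical names, and the sort order is by code point, which may differ from a locale-aware `sort`.

```python
import os
from fnmatch import fnmatchcase
from typing import List

# Directory names pruned by the exclude pattern (exact matches only).
PRUNED_DIRS = {
    "test", "tests", "mock", "mocks", "example", "examples",
    "preset", "presets", "fixture", "fixtures", "vendor", "vendors",
}


def discover_cairo_files(repo_root: str) -> List[str]:
    """Return sorted in-scope .cairo paths under repo_root (illustrative helper)."""
    found = []
    for dirpath, dirnames, filenames in os.walk(repo_root):
        # Prune in place so os.walk never descends into excluded directories.
        dirnames[:] = [d for d in dirnames if d not in PRUNED_DIRS]
        for name in filenames:
            if not name.endswith(".cairo"):
                continue
            if fnmatchcase(name, "*_test.cairo") or fnmatchcase(name, "*Test*.cairo"):
                continue
            found.append(os.path.join(dirpath, name))
    return sorted(found)
```

An empty result corresponds to the `CAUD-001` condition in the table above.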
For $filename ... mode, do not run find. Instead, run:
WORKDIR="${CAIRO_AUDITOR_WORKDIR:-$(mktemp -d "${TMPDIR:-/tmp}/cairo-auditor.XXXXXX")}"
chmod 700 "$WORKDIR"
echo "WORKDIR=$WORKDIR"
REPO_ROOT=$(python3 -c 'import os,sys; print(os.path.realpath(sys.argv[1]))' "<repo-root>")
> "$WORKDIR/cairo-audit-files.txt"
for f in "$@"; do
  [ -z "$f" ] && continue
  # Resolve each argument to a canonical absolute path (symlinks followed).
  ABS_PATH=$(python3 - "$REPO_ROOT" "$f" <<'PY'
import os
import sys
repo_root, arg = sys.argv[1], sys.argv[2]
candidate = arg if os.path.isabs(arg) else os.path.join(repo_root, arg)
print(os.path.realpath(candidate))
PY
  )
  # Skip anything that resolves outside the repository root.
  case "$ABS_PATH" in
    "$REPO_ROOT"/*) ;;
    *) continue ;;
  esac
  [ -f "$ABS_PATH" ] || continue
  # Keep only Cairo sources.
  case "$ABS_PATH" in
    *.cairo) echo "$ABS_PATH" >> "$WORKDIR/cairo-audit-files.txt" ;;
  esac
done
sort -u -o "$WORKDIR/cairo-audit-files.txt" "$WORKDIR/cairo-audit-files.txt"
cat "$WORKDIR/cairo-audit-files.txt"
(b) Glob for **/references/attack-vectors/attack-vectors-1.md and resolve:
- `{refs_root}` = two levels up from the match (`.../references`)
- `{skill_root}` = three levels up from the match (skill directory that contains `SKILL.md`, `agents/`, `references/`, `VERSION`)

(c) If `{skill_root}/scripts/quality/audit_local_repo.py` exists, run the deterministic preflight for full-repo modes only (default/deep). In `$filename ...` mode, skip preflight so the context stays scoped to the targeted files:
python3 "{skill_root}/scripts/quality/audit_local_repo.py" --repo-root <repo-root> --scan-id preflight --output-dir "{workdir}"
Print the preflight results (class counts, severity counts) as context for specialists.
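The path resolution in step (b) amounts to walking up from the matched file. A minimal sketch, assuming the first sorted match is the one to use (the text does not say how multiple matches are handled); `resolve_roots` and `find_roots` are hypothetical helper names.

```python
from pathlib import Path
from typing import Optional, Tuple

PATTERN = "**/references/attack-vectors/attack-vectors-1.md"


def resolve_roots(match: Path) -> Tuple[Path, Path]:
    """Return (refs_root, skill_root) for a matched attack-vectors-1.md path."""
    refs_root = match.parents[1]   # two levels up: .../references
    skill_root = match.parents[2]  # three levels up: the skill directory
    return refs_root, skill_root


def find_roots(search_root: Path) -> Optional[Tuple[Path, Path]]:
    """Glob below search_root and resolve both roots, or None when nothing matches."""
    matches = sorted(search_root.glob(PATTERN))
    if not matches:
        return None
    # Assumption: the first match in sorted order wins.
    return resolve_roots(matches[0])
```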
Turn 2 — Prepare. In a single message, make three parallel tool calls:
(a) Read {skill_root}/agents/vector-scan.md — you will paste this full text into every agent prompt.
(b) Read {refs_root}/report-formatting.md — you will use this for the final report.
(c) Bash: create four per-agent bundle files ({workdir}/cairo-audit-agent-{1,2,3,4}-bundle.md) in a single command. Each bundle concatenates:
- all in-scope `.cairo` files (with `### path` headers and fenced code blocks),
- `{refs_root}/judging.md`,
- `{refs_root}/report-formatting.md`,
- `{refs_root}/attack-vectors/attack-vectors-N.md` (one per agent; only the attack-vectors file differs).

Print line counts per bundle. Example command:

Before running this command, substitute placeholders (`{refs_root}`, `{repo-root}`, `{workdir}`) with the concrete paths resolved in Turn 1.
REFS="{refs_root}"
SRC="{repo-root}"
WORKDIR="{workdir}"
IN_SCOPE="$WORKDIR/cairo-audit-files.txt"
set -euo pipefail
# Emit every in-scope file as a "### path" header followed by a fenced cairo block.
build_code_block() {
  while IFS= read -r f; do
    [ -z "$f" ] && continue
    # Strip the repo-root prefix without sed, so special characters in $SRC are harmless.
    REL="${f#"$SRC"/}"
    echo "### $REL"
    echo '```cairo'
    cat "$f"
    echo '```'
    echo ""
  done < "$IN_SCOPE"
}
CODE=$(build_code_block)
# One bundle per vector-scan agent; only the attack-vectors file differs.
for i in 1 2 3 4; do
  {
    echo "$CODE"
    echo "---"
    cat "$REFS/judging.md"
    echo "---"
    cat "$REFS/report-formatting.md"
    echo "---"
    cat "$REFS/attack-vectors/attack-vectors-$i.md"
  } > "$WORKDIR/cairo-audit-agent-$i-bundle.md"
  echo "Bundle $i: $(wc -l < "$WORKDIR/cairo-audit-agent-$i-bundle.md") lines"
done
Do NOT inline source-code files into prompts. Bundles replace raw source in prompts. Non-code context blocks (deterministic preflight summary and optional threat-intel summary) may be appended.
Turn 2.5 — Threat Intel Enrichment (Deep Mode, Optional).
When network access is available, run a small enrichment pass and write {workdir}/cairo-audit-threat-intel.md:
- Read `{refs_root}/threat-intel-sources.md` first and follow its source policy.
- Use `curl` through Bash as the query mechanism for primary-source security material (official audit reports, incident postmortems, protocol docs, vendor writeups).
- If `curl` is missing, mark this stage `SKIPPED: no curl`.
- If the network is unreachable, mark this stage `SKIPPED: offline`.
- Normalize each signal to: date, source, class hint, one-line exploit shape.
- On a `curl` failure, record `FAILED: curl error <code>` in execution trace and continue.
- … `SKIPPED` in execution trace.
- … `threat-intel-sources.md`.

Threat-intel usage rules:
Turn 3 — Spawn. Use foreground Agent tool calls only (do NOT use run_in_background).
Always spawn Agents 1–4 in parallel.
In deep mode, use adaptive fanout:

- If … is `<= 1000` lines and all bundles are `<= 1400` lines, spawn Agent 5 in parallel with Agents 1–4.

Resolve host-aware model labels first:
- Write `{workdir}/cairo-audit-model-plan.txt` with `host`, `vector_model`, and `adversarial_model`.
- Also record `gpt_5_4_probe` and `fallback_reason`.
- Use `vector_model` for Agents 1–4 and `adversarial_model` for Agent 5.

Agents 1–4 (vector scanning) — spawn with `model: "{vector_model}"`. Each agent prompt must contain the full text of `vector-scan.md` (read in Turn 2, paste into every prompt). After the instructions, add: `Your bundle file is {workdir}/cairo-audit-agent-N-bundle.md (XXXX lines).` (substitute the real line count). Include deterministic preflight results if available. If `{workdir}/cairo-audit-threat-intel.md` exists and has normalized signals, append a compact "Threat Intel (hints only)" block (max 12 lines) to each prompt.
Agent 5 (adversarial reasoning, deep mode only) — spawn with model: "{adversarial_model}". The prompt must instruct it to:
- Read `{skill_root}/agents/adversarial.md` for its full instructions.
- Read `{refs_root}/judging.md` and `{refs_root}/report-formatting.md`.
- Use `{workdir}/cairo-audit-threat-intel.md` (when it exists) as a prioritization hint only.
- Read `{workdir}/cairo-audit-files.txt` to obtain in-scope paths, then read only those `.cairo` files directly (not via bundle).

After spawning, persist execution evidence that will be reused in the final report:
- confirm `{workdir}/cairo-audit-files.txt` exists and count in-scope files,
- `{workdir}/cairo-audit-agent-{1,2,3,4}-bundle.md`,
- `{workdir}/cairo-audit-agent-models.txt` (use actual spawn metadata; if not exposed, use `default` or `unknown`).

Transport resilience:

- `<=1200` lines: 180 seconds (parallel-spawn baseline)
- `1201-1400` lines: 360 seconds (still parallel-spawn eligible; extra time for larger bundles)
- `1401-1800` lines: 360 seconds (Wave B regime)
- `>1800` lines: 600 seconds (Wave B regime, very large bundles)

Integrity gate (for hosts where deep-mode enforcement is enabled):
- If any specialist remains unavailable, fail closed unless `--allow-degraded` is explicitly present.
- Emit `CAUD-006` with a one-line reason plus host remediation hints.
- If a specialist returns malformed output (not `No findings.` and not valid finding blocks), rerun that specialist once; if still malformed, treat it as unavailable.
- If `--strict-models` is set, treat model fallback as unavailable capability and enforce the same fail-closed behavior (`CAUD-009`) unless `--allow-degraded` is explicitly present.
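The timeout tiers listed under Transport resilience can be written as a step function. This sketch assumes the tiers are keyed on a bundle's line count and act as a per-specialist wait budget; `specialist_timeout_seconds` and `is_wave_b` are hypothetical names.

```python
def specialist_timeout_seconds(bundle_lines: int) -> int:
    """Return the wait budget in seconds for a bundle of the given size."""
    if bundle_lines <= 1200:
        return 180  # parallel-spawn baseline
    if bundle_lines <= 1800:
        return 360  # covers both the 1201-1400 and 1401-1800 tiers
    return 600      # very large bundles


def is_wave_b(bundle_lines: int) -> bool:
    """Bundles above 1400 lines fall in the Wave B regime per the tiers above."""
    return bundle_lines > 1400
```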
Turn 4 — Report. Merge all agent results and emit the report in canonical order:

- Apply evidence tags per the `references/judging.md` Evidence Tags section:
  - Every finding carries `[CODE-TRACE]`; if a source agent omitted it, add `[CODE-TRACE]` during merge normalization.
  - Add `[PREFLIGHT-HIT]` if the deterministic preflight flagged the same class or entry point.
  - Add `[CROSS-AGENT]` if 2+ agents independently reported the same root cause before deduplication.
  - Add `[ADVERSARIAL]` if Agent 5 discovered or confirmed the finding.
  - Findings with only `[CODE-TRACE]` (no additional tags) are valid but lower-signal; reviewers use the Evidence column in Findings Index to prioritize review order.
- Sort by priority (`P0` first); within each priority tier sort by confidence (highest first).
- Number findings sequentially starting at 1.
- Emit sections in this order: Signal Summary, Scope, Execution Trace, Findings, Dropped Candidates, Findings Index.

Dropped-candidate handling:

- Record each dropped candidate under Dropped Candidates with `candidate`, `class`, and `drop_reason`.
- Allowed `drop_reason` values: `false_positive`, `duplicate_root_cause`, `below_confidence_threshold`, `insufficient_evidence`.
- If nothing was dropped, emit a single `none` row.

If `--file-output` is set, write the report to `{repo-root}/security-review-{timestamp}.md` and print the path.
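To make the merge step concrete, here is one possible shape for deduplication, tag normalization, and ordering. It is a sketch under assumptions: the `Finding` fields are hypothetical, and when two agents report the same root cause it keeps the lowest priority number and highest confidence, a policy the text does not spell out.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set


@dataclass
class Finding:
    root_cause: str
    priority: int    # 0 for P0, 1 for P1, ...
    confidence: int  # 0-100
    agents: Set[int] = field(default_factory=set)
    tags: Set[str] = field(default_factory=set)


def merge_findings(raw: List[Finding]) -> List[Finding]:
    """Deduplicate by root cause, normalize evidence tags, then sort."""
    merged: Dict[str, Finding] = {}
    for f in raw:
        kept = merged.get(f.root_cause)
        if kept is None:
            merged[f.root_cause] = Finding(
                f.root_cause, f.priority, f.confidence, set(f.agents), set(f.tags)
            )
            continue
        # Same root cause again: keep the strongest rating and pool the evidence.
        kept.priority = min(kept.priority, f.priority)
        kept.confidence = max(kept.confidence, f.confidence)
        kept.agents |= f.agents
        kept.tags |= f.tags
    for f in merged.values():
        f.tags.add("[CODE-TRACE]")  # minimum tag on every finding
        if len(f.agents) >= 2:
            f.tags.add("[CROSS-AGENT]")
        if 5 in f.agents:
            f.tags.add("[ADVERSARIAL]")
    # P0 first, then highest confidence within each priority tier.
    return sorted(merged.values(), key=lambda f: (f.priority, -f.confidence))
```

`[PREFLIGHT-HIT]` is left out because it depends on the preflight output format, which is not described here.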
Before doing anything else, print this exactly:
██████╗ █████╗ ██╗██████╗ ██████╗ █████╗ ██╗ ██╗██████╗ ██╗████████╗ ██████╗ ██████╗
██╔════╝██╔══██╗██║██╔══██╗██╔═══██╗ ██╔══██╗██║ ██║██╔══██╗██║╚══██╔══╝██╔═══██╗██╔══██╗
██║ ███████║██║██████╔╝██║ ██║ ███████║██║ ██║██║ ██║██║ ██║ ██║ ██║██████╔╝
██║ ██╔══██║██║██╔══██╗██║ ██║ ██╔══██║██║ ██║██║ ██║██║ ██║ ██║ ██║██╔══██╗
╚██████╗██║ ██║██║██║ ██║╚██████╔╝ ██║ ██║╚██████╔╝██████╔╝██║ ██║ ╚██████╔╝██║ ██║
╚═════╝╚═╝ ╚═╝╚═╝╚═╝ ╚═╝ ╚═════╝ ╚═╝ ╚═╝ ╚═════╝ ╚═════╝ ╚═╝ ╚═╝ ╚═════╝ ╚═╝ ╚═╝
After printing the banner, run two parallel tool calls: (a) Read the local VERSION file from the same directory as this skill, (b) Bash curl -sf --connect-timeout 5 --max-time 10 https://raw.githubusercontent.com/keep-starknet-strange/starknet-agentic/main/skills/cairo-auditor/VERSION. If the remote fetch succeeds and the versions differ, print:
You are not using the latest version. Update via your install method (e.g. `git pull` or reinstall the plugin) for best security coverage.
Then continue normally. If the fetch fails (offline, timeout), skip silently.
Use this command for the remote check:
curl -sf --connect-timeout 5 --max-time 10 https://raw.githubusercontent.com/keep-starknet-strange/starknet-agentic/main/skills/cairo-auditor/VERSION
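The version check can also be sketched in Python for hosts where `curl` is not available. This is an illustration, not part of the skill: `fetch_remote_version` and `update_notice` are hypothetical names, and `urllib`'s single `timeout` only approximates curl's separate connect and total limits.

```python
import urllib.request
from typing import Callable, Optional

VERSION_URL = (
    "https://raw.githubusercontent.com/keep-starknet-strange/"
    "starknet-agentic/main/skills/cairo-auditor/VERSION"
)
NOTICE = (
    "You are not using the latest version. Update via your install method "
    "(e.g. git pull or reinstall the plugin) for best security coverage."
)


def fetch_remote_version(url: str = VERSION_URL, timeout: float = 10.0) -> Optional[str]:
    """Return the remote VERSION text, or None when offline or on any fetch error."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.read().decode("utf-8").strip()
    except (OSError, ValueError):
        # URLError, HTTPError, and socket timeouts are all OSError subclasses.
        return None


def update_notice(
    local_version: str,
    fetch: Callable[[], Optional[str]] = fetch_remote_version,
) -> Optional[str]:
    """Return the notice only when the fetch succeeded and the versions differ."""
    remote = fetch()
    if remote is None or remote == local_version.strip():
        return None  # skip silently
    return NOTICE
```

Passing `fetch` as a parameter lets the comparison be exercised without network access.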
… (`$filename ...`) rather than full-repo.

Each finding must include:

- `class_id`
- `severity` (Critical / High / Medium / Low)
- `confidence` score (0–100)
- `entry_point` (file:line)
- `attack_path` (concrete caller -> function -> state -> impact)
- `guard_analysis` (what guards exist, why they fail)
- `recommended_fix` (diff block for confidence >= 75)
- `required_tests` (regression + guard tests)
- `evidence_tags` (`[CODE-TRACE]` minimum; upgrade when stronger proof exists)

References:

- `references/vulnerability-db/`
- `references/attack-vectors/`
- `references/audit-findings/`
- `../cairo-contract-authoring/references/legacy-full.md`
- `../cairo-testing/references/legacy-full.md`

Output rules:

- Findings with confidence `<75` may be listed as low-confidence notes without a fix block.
- If `--proven-only` is present, findings that only carry `[CODE-TRACE]` evidence must be emitted at Low severity.
- If specialists are unavailable in deep mode and `--allow-degraded` is not present, emit `CAUD-006` and do not publish a findings report.
- If `--allow-degraded` is present and fallback is used, mark scope mode as `degraded-deep` and include an explicit warning line at top: `WARNING: degraded execution (specialist agents unavailable)`.
- Also add a warning line before Findings Index: `WARNING: degraded execution may omit exploitable paths.`
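The confidence threshold and the `--proven-only` cap can be expressed as a small post-processing step. A minimal sketch, assuming a finding counts as unproven when its tag set contains nothing beyond `[CODE-TRACE]`; `apply_output_rules` and the returned keys are hypothetical.

```python
from typing import Dict, Set


def apply_output_rules(
    severity: str,
    confidence: int,
    tags: Set[str],
    proven_only: bool = False,
) -> Dict[str, object]:
    """Apply the severity cap and fix-block threshold to one finding."""
    # Assumption: "only [CODE-TRACE]" means no other tag is present.
    code_trace_only = tags <= {"[CODE-TRACE]"}
    if proven_only and code_trace_only:
        severity = "Low"
    return {
        "severity": severity,
        "include_fix_block": confidence >= 75,
        "low_confidence_note": confidence < 75,
    }
```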