This skill should be used when the user asks about "AI security", "ML pipeline attacks", "prompt injection", "model deserialization", "unsafe model loading", "Jupyter injection", "LLM security", or needs to identify AI/ML-specific vulnerabilities in codebases that use machine learning frameworks.
From vuln-scoutnpx claudepluginhub allsmog/vuln-scout --plugin vuln-scoutThis skill uses the workspace's default tool permissions.
Designs and optimizes AI agent action spaces, tool definitions, observation formats, error recovery, and context for higher task completion rates.
Enables AI agents to execute x402 payments with per-task budgets, spending controls, and non-custodial wallets via MCP tools. Use when agents pay for APIs, services, or other agents.
Compares coding agents like Claude Code and Aider on custom YAML-defined codebase tasks using git worktrees, measuring pass rate, cost, time, and consistency.
Detect security vulnerabilities specific to AI/ML pipelines, LLM-backed applications, and data science workflows. These attack surfaces are increasingly common and often overlooked by traditional SAST tools.
Activate this skill when reviewing code that:
The most critical ML-specific vulnerability. Many ML serialization formats execute arbitrary code on load.
Dangerous Functions:
| Framework | Dangerous | Safe Alternative |
|---|---|---|
| PyTorch | torch.load(path) | torch.load(path, weights_only=True) |
| Joblib | joblib.load(path) | Verify source, use safetensors |
| NumPy | numpy.load(path, allow_pickle=True) | numpy.load(path, allow_pickle=False) |
| Scikit-learn | joblib.load() / pickle.load() | skops.io with trusted types |
| TensorFlow | tf.saved_model.load() with custom ops | Verify model provenance |
| ONNX | Generally safe | Validate graph structure |
| SafeTensors | Safe by design | Recommended format |
Detection:
# PyTorch unsafe load
grep -rn "torch\.load(" --include="*.py" | grep -v "weights_only=True"
# Joblib/sklearn model loading
grep -rn "joblib\.load\|sklearn.*load" --include="*.py"
# NumPy with pickle enabled
grep -rn "numpy\.load\|np\.load" --include="*.py" | grep "allow_pickle"
# Generic unsafe deserialization in ML context
grep -rn "pickle\.load\|pickle\.loads\|dill\.load\|cloudpickle\.load" --include="*.py"
Exploitation: An attacker who can supply a malicious model file achieves arbitrary code execution on the server loading the model. This is especially dangerous in:
User input flowing into LLM prompts without sanitization, allowing attackers to override system instructions.
Patterns to Detect:
# Direct string formatting in prompts
grep -rn 'f".*{.*}.*prompt\|f".*{.*}.*system\|\.format(.*user' --include="*.py"
# LangChain prompt templates with user input
grep -rn "PromptTemplate\|ChatPromptTemplate\|HumanMessage" --include="*.py"
# OpenAI/Anthropic API calls with user input in system message
grep -rn "system.*content.*=.*f\"\|system.*content.*\.format" --include="*.py"
grep -rn "messages.*append\|messages.*system" --include="*.py" --include="*.ts" --include="*.js"
Vulnerable Pattern:
# User input directly in system prompt
prompt = f"You are a helpful assistant. The user's name is {user_input}. Answer their question."
response = openai.chat.completions.create(messages=[{"role": "system", "content": prompt}])
Indicators of Risk:
Untrusted .ipynb files can execute arbitrary code when opened or processed.
Detection:
# Notebook execution in pipelines
grep -rn "nbconvert\|nbclient\|ExecutePreprocessor\|execute_notebook" --include="*.py"
# Papermill execution
grep -rn "papermill\.execute\|pm\.execute" --include="*.py"
# Magic commands in notebooks
grep -rn "%system\|%sx\|!.*pip\|!.*apt\|!.*curl\|!.*wget" --include="*.ipynb"
# IPython display with JS
grep -rn "IPython\.display\.Javascript\|display\.HTML" --include="*.py" --include="*.ipynb"
Loading models from untrusted sources (user-specified repos, URLs).
Detection:
# HuggingFace from_pretrained with user-controlled repo
grep -rn "from_pretrained\|AutoModel\|AutoTokenizer\|pipeline(" --include="*.py"
# Verify if the model ID comes from user input
grep -rn "from_pretrained.*request\|from_pretrained.*params\|from_pretrained.*args" --include="*.py"
# TensorFlow Hub
grep -rn "hub\.load\|hub\.KerasLayer" --include="*.py"
# Model download from URLs
grep -rn "urllib.*model\|requests.*model.*download\|wget.*\.pt\|wget.*\.bin" --include="*.py"
Paths where an attacker can influence training data.
Detection:
# Writable training data paths
grep -rn "train.*path\|data.*dir\|dataset.*path" --include="*.py" --include="*.yaml" --include="*.yml"
# Unvalidated data pipeline inputs
grep -rn "pd\.read_csv\|pd\.read_json\|pd\.read_sql" --include="*.py" | grep -i "url\|request\|user\|input"
# S3/GCS data loading without integrity checks
grep -rn "s3://\|gs://\|blob\.download" --include="*.py" | grep -v "checksum\|hash\|verify"
grep -rn "import torch\|import tensorflow\|import sklearn\|import transformers\|import langchain\|import openai\|import anthropic" --include="*.py"
grep -rn "\.load\|from_pretrained\|load_model\|load_weights" --include="*.py"
For each loading point, determine if the source (file path, URL, repo ID) can be controlled by an attacker.
weights_only=True for PyTorch