From soundcheck
Prevents model theft in LLM inference endpoints via checks for authentication requirements, per-user rate limits, stripped logprobs/embeddings, and extraction pattern monitoring.
npx claudepluginhub thejefflarson/soundcheck --plugin soundcheck
This skill uses the workspace's default tool permissions.
Prevents unauthorized replication of proprietary models through API abuse. Unauthenticated or unthrottled inference endpoints let attackers systematically query a model to reconstruct its weights or distill a clone — stealing the commercial and IP value of the deployment.
logprobs or full embedding vectors, enabling extraction
Flag the vulnerable code and explain the risk. Then suggest a fix that establishes these properties:
logprobs, full embedding vectors, and per-token probabilities are the primary signals distillation attacks use to reconstruct a model. If a caller doesn't strictly need them, don't return them.
Anchor (shape, not implementation):
require(valid_api_key(request)) # authenticated
require(rate_limit.allow(request.user_id)) # per-user, not per-IP
result = model.generate(prompt)
log_query(user_id, prompt, result.tokens)
detect_extraction_pattern(user_id, prompt) # entropy-based alert
return { text: result.text } # no logprobs, no embeddings
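One way the anchor's shape could look as runnable Python is sketched below, using only the standard library. Everything named here is an assumption for illustration: the `API_KEYS` store, the sliding-window limiter, the thresholds, and `fake_model_generate` are hypothetical stand-ins. The source only says "entropy-based alert", so the token-histogram heuristic is one guess at what that could mean, not the skill's actual detector.

```python
import math
import time
from collections import Counter, defaultdict, deque

# All values below are illustrative assumptions, not recommendations.
API_KEYS = {"key-alice": "alice"}    # hypothetical api_key -> user_id store
RATE_LIMIT = 5                       # max requests per user per window
WINDOW_SECONDS = 60.0
ENTROPY_ALERT_BITS = 6.0             # alert threshold; would need tuning
MIN_TOKENS_FOR_ALERT = 200           # ignore users with too little history

_requests = defaultdict(deque)       # user_id -> timestamps of recent requests
_token_counts = defaultdict(Counter) # user_id -> histogram of prompt tokens
query_log = []                       # (user_id, prompt, completion length)
alerts = []                          # (user_id, entropy in bits)


class Rejected(Exception):
    """Raised when a request fails authentication or rate limiting."""

    def __init__(self, status, reason):
        super().__init__(reason)
        self.status = status


def allow(user_id, now):
    """Sliding-window limit keyed on the authenticated user, not the IP."""
    window = _requests[user_id]
    while window and now - window[0] >= WINDOW_SECONDS:
        window.popleft()
    if len(window) >= RATE_LIMIT:
        return False
    window.append(now)
    return True


def prompt_entropy(user_id, prompt):
    """Shannon entropy (bits) of the user's cumulative prompt-token histogram."""
    counts = _token_counts[user_id]
    counts.update(prompt.lower().split())
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def fake_model_generate(prompt):
    """Stand-in for the real model; returns text plus sensitive extras."""
    return {"text": prompt[::-1], "logprobs": [-0.1, -2.3], "embedding": [0.5, 0.2]}


def handle(api_key, prompt, now=None):
    now = time.monotonic() if now is None else now
    user_id = API_KEYS.get(api_key)
    if user_id is None:
        raise Rejected(401, "missing or invalid API key")       # authenticated
    if not allow(user_id, now):
        raise Rejected(429, "per-user rate limit exceeded")     # per-user limit
    result = fake_model_generate(prompt)
    query_log.append((user_id, prompt, len(result["text"])))
    entropy = prompt_entropy(user_id, prompt)
    if sum(_token_counts[user_id].values()) >= MIN_TOKENS_FOR_ALERT \
            and entropy >= ENTROPY_ALERT_BITS:
        alerts.append((user_id, entropy))                       # flag for review
    return {"text": result["text"]}                             # no logprobs, no embeddings
```

The response is built from an explicit allowlist (`text` only) rather than by deleting known-bad fields, so a new field added to the model output later does not leak by default.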
logprobs, raw embeddings, and weight data are excluded from API responses
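The exclusion check above could be verified mechanically, for example in a test that walks a response payload and reports any sensitive field. The sketch below is a minimal version under stated assumptions: the key names in `SENSITIVE_KEYS` are illustrative guesses and would need to match whatever the actual serving stack emits.

```python
# Hypothetical field names; adjust to the fields your model server produces.
SENSITIVE_KEYS = {"logprobs", "top_logprobs", "embedding", "embeddings", "weights"}


def find_leaks(payload, path=""):
    """Return the paths of sensitive fields found anywhere in a response."""
    leaks = []
    if isinstance(payload, dict):
        for key, value in payload.items():
            here = f"{path}.{key}" if path else key
            if key in SENSITIVE_KEYS:
                leaks.append(here)
            else:
                leaks.extend(find_leaks(value, here))
    elif isinstance(payload, list):
        for index, item in enumerate(payload):
            leaks.extend(find_leaks(item, f"{path}[{index}]"))
    return leaks


def strip_sensitive(payload):
    """Return a copy of the response with sensitive fields removed at any depth."""
    if isinstance(payload, dict):
        return {k: strip_sensitive(v) for k, v in payload.items()
                if k not in SENSITIVE_KEYS}
    if isinstance(payload, list):
        return [strip_sensitive(item) for item in payload]
    return payload


raw = {
    "text": "hi",
    "choices": [{"text": "a", "logprobs": {"tokens": []}}],
    "embedding": [0.1],
}
print(find_leaks(raw))                   # paths that would fail the check
print(find_leaks(strip_sensitive(raw)))  # empty list once stripped
```

A denylist like this is best treated as a verification aid; building responses from an explicit allowlist of fields remains the safer default.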