From soundcheck
Flags model DoS vulnerabilities in LLM API handlers: unbounded prompts, missing max_tokens, unbounded context, and no rate limiting. Suggests fixes and a verification checklist.
npx claudepluginhub thejefflarson/soundcheck --plugin soundcheck

This skill uses the workspace's default tool permissions.
Protects against resource exhaustion caused by unbounded prompts, missing token caps, or absent rate limiting. Attackers can submit enormous or recursive inputs that inflate inference costs, saturate GPU/CPU, and deny service to legitimate users.
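Of these defenses, a per-user rate limit is the cheapest to add in front of an inference endpoint. A minimal sketch of one, assuming a token-bucket design (the class and parameter names are illustrative, not part of soundcheck):

```python
import time


class TokenBucket:
    """Per-user token bucket: each request costs one token and
    tokens refill at `rate` per second up to `capacity`."""

    def __init__(self, capacity=5, rate=1.0):
        self.capacity = capacity
        self.rate = rate
        self._buckets = {}  # user_id -> (tokens, last_refill_time)

    def allow(self, user_id):
        now = time.monotonic()
        tokens, last = self._buckets.get(user_id, (self.capacity, now))
        # Refill proportionally to time elapsed since the last request.
        tokens = min(self.capacity, tokens + (now - last) * self.rate)
        ok = tokens >= 1
        if ok:
            tokens -= 1
        self._buckets[user_id] = (tokens, now)
        return ok
```

A burst from one user drains only that user's bucket, so legitimate traffic from other users is unaffected.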
Missing max_tokens parameter — the model generates until its internal limit.

Flag the vulnerable code and explain the risk. Then suggest a fix that establishes these properties: a bounded input size, a per-user rate limit, a trimmed context window, and a capped generation via max_tokens, max_output_tokens, or the provider equivalent. Leaving the cap at the provider default lets a single request run for minutes and rack up dollars in tokens.

Anchor — shape, not implementation:
require(len(user_text) <= MAX_CHARS)
require(rate_limiter.allow(user_id))
history = trim(history, MAX_TURNS)
resp = llm.call(history + [user_text], max_tokens=512, timeout=30)
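Put together, a handler matching the anchor's shape might look like the following Python sketch (MAX_CHARS, MAX_TURNS, and the llm_call/rate_limiter interfaces are illustrative stand-ins, not a specific provider's API):

```python
MAX_CHARS = 8_000   # cap raw input size before any model call
MAX_TURNS = 20      # keep only the most recent conversation turns


def guarded_call(llm_call, rate_limiter, user_id, history, user_text):
    """Apply the anchor's checks in order: input size, rate limit,
    context trim, then a bounded generation with a hard timeout."""
    if len(user_text) > MAX_CHARS:
        raise ValueError("input too large")
    if not rate_limiter.allow(user_id):
        raise RuntimeError("rate limit exceeded")
    history = history[-MAX_TURNS:]       # unbounded context -> bounded window
    return llm_call(history + [user_text],
                    max_tokens=512,      # cap generation length
                    timeout=30)          # cap wall-clock time
```

The cheap checks run first so oversized or over-quota requests are rejected before any tokens are spent.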
Confirm the following properties hold (language-agnostic):