From soundcheck
Detects and fixes resource exhaustion vulnerabilities in LLM endpoints: missing token caps, rate limits, and prompt-length bounds.
Slash command
/soundcheck:model-dos
Protects against resource exhaustion caused by unbounded prompts, missing token caps, or absent rate limiting. Attackers can submit enormous or recursive inputs that inflate inference costs, saturate GPU/CPU, and deny service to legitimate users.
Missing max_tokens parameter: the model generates until it reaches its internal limit.
Flag the vulnerable code and explain the risk. Then suggest a fix that establishes these properties:
Translate each principle to the inference SDK, web framework, and rate-limiter of the audited file. Use the SDK's documented cap and timeout parameters — do not rely on global defaults.
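As an illustration of setting caps explicitly instead of inheriting defaults, here is a minimal, SDK-agnostic Python sketch. The limit values, `InferenceRequest`, `build_request`, and the `call_model` callable are all hypothetical names introduced for this example. In a real audit they would map onto whatever cap and timeout parameters the SDK in the audited file documents.

```python
from dataclasses import dataclass
from typing import Callable, Optional

# Hypothetical limits for illustration; tune them per deployment.
MAX_PROMPT_CHARS = 8_000
MAX_OUTPUT_TOKENS = 512
REQUEST_TIMEOUT_S = 30.0


class PromptTooLongError(ValueError):
    """Raised when a prompt exceeds the configured length bound."""


@dataclass(frozen=True)
class InferenceRequest:
    prompt: str
    max_tokens: int
    timeout_s: float


def build_request(prompt: str, requested_tokens: Optional[int] = None) -> InferenceRequest:
    """Reject oversized prompts and always set an explicit token cap and timeout."""
    if len(prompt) > MAX_PROMPT_CHARS:
        raise PromptTooLongError(
            f"prompt is {len(prompt)} chars; limit is {MAX_PROMPT_CHARS}"
        )
    if requested_tokens is None:
        max_tokens = MAX_OUTPUT_TOKENS
    else:
        # Clamp caller-supplied values so a client cannot raise the cap.
        max_tokens = max(1, min(requested_tokens, MAX_OUTPUT_TOKENS))
    return InferenceRequest(prompt, max_tokens, REQUEST_TIMEOUT_S)


def guarded_completion(
    prompt: str,
    call_model: Callable[[InferenceRequest], str],
    requested_tokens: Optional[int] = None,
) -> str:
    """Validate first, then hand the bounded request to the SDK wrapper."""
    return call_model(build_request(prompt, requested_tokens))
```

Here `call_model` stands in for a thin wrapper around the real SDK call. The intent is that every request reaching it already carries a bounded prompt, an explicit `max_tokens`, and an explicit timeout.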
Confirm the following properties hold (language-agnostic):
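One property named in the skill description, rate limiting, can be confirmed with a small automated check. The sketch below is an assumption-laden example, not the plugin's own method: `SlidingWindowLimiter` and `check_rate_limit_property` are hypothetical names, and the in-memory store would need replacing with a shared one in a multi-process deployment.

```python
import time
from collections import defaultdict, deque
from typing import Callable, Deque, Dict


class SlidingWindowLimiter:
    """Allow at most `limit` requests per `window_s` seconds for each user."""

    def __init__(
        self,
        limit: int,
        window_s: float,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.limit = limit
        self.window_s = window_s
        self._clock = clock  # injectable so checks do not have to sleep
        self._hits: Dict[str, Deque[float]] = defaultdict(deque)

    def allow(self, user_id: str) -> bool:
        now = self._clock()
        hits = self._hits[user_id]
        # Drop timestamps that have aged out of the window.
        while hits and now - hits[0] >= self.window_s:
            hits.popleft()
        if len(hits) >= self.limit:
            return False
        hits.append(now)
        return True


def check_rate_limit_property() -> None:
    """Confirm excess requests are refused and the quota recovers after the window."""
    fake_now = [0.0]
    limiter = SlidingWindowLimiter(limit=2, window_s=60.0, clock=lambda: fake_now[0])
    assert limiter.allow("alice")
    assert limiter.allow("alice")
    assert not limiter.allow("alice")  # third request inside the window is refused
    assert limiter.allow("bob")        # quotas are tracked per user
    fake_now[0] = 60.0
    assert limiter.allow("alice")      # the window has elapsed


check_rate_limit_property()
```

Injecting the clock keeps the check deterministic. The same pattern can be pointed at the real rate limiter of the audited service, provided it exposes a comparable allow/deny decision.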
npx claudepluginhub thejefflarson/soundcheck --plugin soundcheck
Enforces token budgets, per-user quotas, request timeouts, and loop detection in LLM applications to prevent runaway costs and denial of service.
Detects inference endpoints missing authentication or rate limiting, enabling model theft via systematic queries. Use when building or auditing LLM-serving infrastructure.
Enforces rate limiting at API gateways to protect AI models from extraction attacks. Use when designing, building, or reviewing API gateways for inference or LLM endpoints.