Reviews LLM integration code for prompt hygiene, Claude model selection, context management, token economics, structured output validation, evaluation, and safety. Auto-loads for AI/ML code reviews.
Stack-specific rules loaded by `dh:code-reviewer` when prompt files, model selection logic, or evaluation harness code are detected.
Model selection rules:

- Match the model tier to the task (e.g. `model = "haiku"  # retrieval only, no reasoning required`).
- Reference models by tier alias (sonnet, haiku, opus) so upgrades require one change.

```python
# WRONG: hardcoded version string
model = "claude-haiku-4-5"

# RIGHT: tier alias — version resolved by the client
model = "claude-haiku-latest"

# or better: configurable
model = config.model_tier  # "haiku" | "sonnet" | "opus"
```
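The configurable option can be sketched as a small tier registry. This is a minimal illustration, not an SDK feature: `ModelConfig`, `TIER_ALIASES`, and the alias strings are all hypothetical names chosen for the example.

```python
from dataclasses import dataclass

# Hypothetical tier registry — alias strings are illustrative, not authoritative model IDs.
TIER_ALIASES = {
    "haiku": "claude-haiku-latest",
    "sonnet": "claude-sonnet-latest",
    "opus": "claude-opus-latest",
}

@dataclass(frozen=True)
class ModelConfig:
    model_tier: str = "haiku"

    @property
    def model(self) -> str:
        # Upgrading the whole app means changing one tier string, not N version strings.
        return TIER_ALIASES[self.model_tier]

config = ModelConfig(model_tier="sonnet")
print(config.model)  # → claude-sonnet-latest
```

Call sites read `config.model`, so a tier upgrade is a one-line config change.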
Structured output and temperature rules:

- `json.loads(response)` without validation is a blocking finding.
- `temperature=0` is required for deterministic tasks (classification, extraction, code generation with tests) — any other value is a blocking finding.
- `temperature>0` is required for creative tasks (variation generation, brainstorming) — using 0 eliminates variation intentionally.

```python
# RIGHT: documented temperature
response = client.messages.create(
    model="claude-sonnet-latest",
    temperature=0,  # deterministic — this is a classification task
    messages=[...],
)
```
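The `json.loads` rule implies a validation step after parsing. A minimal sketch of what "with validation" could look like — the schema (`label`, `confidence`) is a hypothetical classification output, not part of any API:

```python
import json

REQUIRED_KEYS = {"label", "confidence"}  # hypothetical schema for a classification task

def parse_classification(raw: str) -> dict:
    """Parse and validate model output instead of trusting json.loads alone."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as e:
        raise ValueError(f"model returned non-JSON output: {e}") from e
    if not isinstance(data, dict) or not REQUIRED_KEYS <= data.keys():
        raise ValueError(f"output is not an object with keys {REQUIRED_KEYS}")
    if not isinstance(data["confidence"], (int, float)) or not 0 <= data["confidence"] <= 1:
        raise ValueError("confidence must be a number in [0, 1]")
    return data

print(parse_classification('{"label": "spam", "confidence": 0.93}'))
```

Raising `ValueError` at the boundary keeps malformed model output from silently flowing into downstream code.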
Error handling rules:

- Retrying 429 (rate limited) with backoff is correct.
- Retrying 400 (bad request, context length exceeded) is a blocking finding — these errors are not transient and retrying wastes budget.

Prompt hygiene: user data belongs in the user turn, never in the system prompt.

```python
# WRONG: user input in system prompt
system = f"You are a helpful assistant. The user's name is {user_name}."

# RIGHT: user data in user turn only
system = "You are a helpful assistant."
messages = [{"role": "user", "content": f"My name is {user_name}. ..."}]
```
```python
# WRONG: retry on context limit
for attempt in range(3):
    try:
        return client.messages.create(...)
    except APIError:  # catches 400 context limit AND 429 rate limit
        time.sleep(2 ** attempt)

# RIGHT: only retry transient errors
for attempt in range(3):
    try:
        return client.messages.create(...)
    except RateLimitError:
        time.sleep(2 ** attempt + random.random())
    except APIError:
        raise  # non-retryable — propagate immediately
```
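The correct pattern above can be factored into a reusable helper. This is a self-contained sketch: the exception classes are stand-ins for the SDK's (a real app would import them from the client library), and the flaky call is simulated so the backoff logic can be exercised without a network.

```python
import random
import time

# Stand-ins for SDK exception types — illustrative only; import the real ones in production.
class APIError(Exception): ...
class RateLimitError(APIError): ...

def with_retries(call, max_attempts=3, sleep=time.sleep):
    """Retry only transient rate-limit errors, with exponential backoff + jitter.

    Any other APIError (e.g. a 400 context-length error) propagates immediately,
    because retrying a non-transient error wastes budget.
    """
    for attempt in range(max_attempts):
        try:
            return call()
        except RateLimitError:
            if attempt == max_attempts - 1:
                raise  # budget exhausted — surface the 429
            sleep(2 ** attempt + random.random())

# Simulated call that rate-limits twice, then succeeds.
attempts = {"n": 0}

def flaky_call():
    attempts["n"] += 1
    if attempts["n"] < 3:
        raise RateLimitError("429")
    return "ok"

result = with_retries(flaky_call, sleep=lambda s: None)
print(result)  # → ok
```

The injectable `sleep` parameter also makes the backoff schedule testable without real delays.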