From builder-ai
Use before merging any PR that adds an LLM API call. Every call must handle timeout, malformed output, low confidence, and refusal — with a defined, user-safe fallback for each. Blocks "add error handling later" completions.
npx claudepluginhub rbraga01/a-team --plugin builder-ai
Slash command
/builder-ai:fallback-required
LLM CALLS WITHOUT FALLBACKS ARE TICKING FAILURES.
Every model times out. Every model returns garbage sometimes.
"The model is reliable" is a claim about averages — users experience tails.
A defined, tested fallback path for each failure mode IS reliability.
Trigger on every PR that:
Every LLM call must handle all four:
| Failure Mode | What Happens | Required Response |
|---|---|---|
| Timeout / API error | Network down, provider outage, slow response | Retry with exponential backoff (max 3), then graceful degradation |
| Malformed output | Wrong format, truncated JSON, schema violation | Schema validation → fallback to rule-based default |
| Low confidence | Model expresses uncertainty, output score below threshold | Route to fallback model, simpler rule, or human review |
| Refusal | Model declines to answer, content filter triggered | Detect refusal pattern → user-friendly error, do not surface raw refusal |
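
The "Malformed output" and "Refusal" rows can be sketched with the standard library alone. In this minimal sketch, `parse_and_validate`, `OutputParseError`, and `RefusalError` reuse names that appear in this skill's pattern, but their bodies are assumptions, as are the `REFUSAL_MARKERS` list, the `Parsed` type, and the expected `answer` and `confidence` fields. A real project would likely use its own schema library and stronger refusal signals.

```python
import json
from dataclasses import dataclass


class OutputParseError(Exception):
    """Raised when the model output is not valid JSON or violates the schema."""


class RefusalError(Exception):
    """Raised when the model output looks like a refusal."""


# Hypothetical refusal phrases; tune these against real traffic.
REFUSAL_MARKERS = (
    "i can't help",
    "i cannot help",
    "i'm unable to",
    "i am unable to",
)


@dataclass
class Parsed:
    answer: str
    confidence: float


def parse_and_validate(raw: str) -> Parsed:
    """Return a validated Parsed, or raise RefusalError / OutputParseError."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as exc:
        # Refusals usually arrive as plain prose instead of the requested JSON,
        # so only look for refusal phrases once JSON parsing has failed.
        lowered = raw.strip().lower()
        if any(marker in lowered for marker in REFUSAL_MARKERS):
            raise RefusalError(raw[:80]) from exc
        raise OutputParseError(f"not valid JSON: {exc}") from exc

    if not isinstance(data, dict):
        raise OutputParseError("top-level value must be a JSON object")

    answer = data.get("answer")
    confidence = data.get("confidence")
    if not isinstance(answer, str) or not answer:
        raise OutputParseError("missing or empty 'answer'")
    # bool is a subclass of int in Python, so exclude it explicitly.
    if isinstance(confidence, bool) or not isinstance(confidence, (int, float)):
        raise OutputParseError("missing or non-numeric 'confidence'")
    if not 0.0 <= confidence <= 1.0:
        raise OutputParseError("'confidence' outside [0, 1]")
    return Parsed(answer=answer, confidence=float(confidence))
```

Keeping the two exception types separate lets the caller log a distinct fallback reason for each failure mode and show a friendly message instead of the raw refusal text.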
Before writing the LLM call, answer: what does this feature return when the model fails?
The fallback must be:
async def call_llm(prompt: str) -> Result:
    for attempt in range(MAX_RETRIES):
        try:
            response = await llm.complete(
                prompt, timeout=TIMEOUT_SECONDS
            )
            parsed = parse_and_validate(response)  # raises OutputParseError on bad schema
            if parsed.confidence < CONFIDENCE_THRESHOLD:  # default 0.7; use 0.85 for high-stakes domains
                log_fallback("low_confidence", attempt)
                return fallback_result(reason="low_confidence")
            return parsed
        except TimeoutError:
            if attempt == MAX_RETRIES - 1:
                log_fallback("timeout", attempt)
                return fallback_result(reason="timeout")
            await backoff(attempt)
        except OutputParseError:
            log_fallback("malformed_output", attempt)
            return fallback_result(reason="malformed_output")
        except RefusalError:
            log_fallback("refused", attempt)
            return fallback_result(reason="refused")
    return fallback_result(reason="max_retries_exceeded")
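
The pattern above leaves `backoff`, `fallback_result`, and `Result` undefined. One possible shape is sketched here, assuming exponential backoff with full jitter and a plain dataclass for the result. `BASE_DELAY_SECONDS`, `MAX_DELAY_SECONDS`, `backoff_delay`, and the `value` and `confidence` fields are hypothetical; only `is_fallback` and `reason` come from this skill's own tests.

```python
import asyncio
import random
from dataclasses import dataclass
from typing import Optional

BASE_DELAY_SECONDS = 0.5  # assumed delay ceiling for the first retry
MAX_DELAY_SECONDS = 8.0   # assumed cap on any single wait


@dataclass(frozen=True)
class Result:
    value: Optional[str] = None
    confidence: Optional[float] = None
    is_fallback: bool = False
    reason: Optional[str] = None


def fallback_result(reason: str) -> Result:
    """Build the user-safe default; a real feature would fill `value`
    with its rule-based answer instead of None."""
    return Result(value=None, is_fallback=True, reason=reason)


def backoff_delay(attempt: int) -> float:
    """Pick a random delay below a ceiling that doubles per attempt
    (0.5s, 1s, 2s, ...) and never exceeds MAX_DELAY_SECONDS."""
    ceiling = min(MAX_DELAY_SECONDS, BASE_DELAY_SECONDS * (2 ** attempt))
    return random.uniform(0.0, ceiling)


async def backoff(attempt: int) -> None:
    """Sleep without blocking the event loop before the next retry."""
    await asyncio.sleep(backoff_delay(attempt))
```

The jitter spreads retries from many clients over time, so a provider recovering from an outage is not hit by synchronized retry bursts.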
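import asyncio


def test_returns_fallback_on_timeout():
    with mock_llm_timeout():
        # call_llm is a coroutine function, so run it to completion before asserting
        result = asyncio.run(call_llm("..."))
    assert result.is_fallback is True
    assert result.reason == "timeout"


def test_returns_fallback_on_malformed_output():
    with mock_llm_response("not valid json{{{"):
        result = asyncio.run(call_llm("..."))
    assert result.is_fallback is True
    assert result.reason == "malformed_output"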
A fallback without a test is a promise, not an implementation.
Set an alert if fallback rate exceeds threshold (e.g., > 5% of calls in 5 min). High fallback rates signal prompt regressions, provider incidents, or input distribution shifts — none of which should be silent.
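
A sliding-window counter is one way to compute that rate in-process. This is a sketch under assumptions: `FallbackRateMonitor`, its parameters, and the `min_calls` guard are hypothetical names. A production setup would more likely emit a metric per call and let the monitoring system evaluate the alert rule.

```python
import time
from collections import deque


class FallbackRateMonitor:
    """Tracks calls in a sliding time window and flags a high fallback rate."""

    def __init__(self, threshold=0.05, window_seconds=300.0,
                 min_calls=20, clock=time.monotonic):
        self.threshold = threshold            # alert when rate is strictly above this
        self.window_seconds = window_seconds  # 300s matches the 5 min example
        self.min_calls = min_calls            # avoid alerting on a handful of calls
        self._clock = clock                   # injectable so tests can control time
        self._events = deque()                # (timestamp, is_fallback) pairs, oldest first
        self._fallbacks = 0

    def record(self, is_fallback: bool) -> None:
        """Record one completed LLM call."""
        self._events.append((self._clock(), bool(is_fallback)))
        self._fallbacks += int(bool(is_fallback))
        self._evict()

    def _evict(self) -> None:
        """Drop events that have aged out of the window."""
        cutoff = self._clock() - self.window_seconds
        while self._events and self._events[0][0] < cutoff:
            _, was_fallback = self._events.popleft()
            self._fallbacks -= int(was_fallback)

    def rate(self) -> float:
        """Fraction of calls in the window that ended in a fallback."""
        self._evict()
        if not self._events:
            return 0.0
        return self._fallbacks / len(self._events)

    def should_alert(self) -> bool:
        self._evict()
        return len(self._events) >= self.min_calls and self.rate() > self.threshold
```

Calling `record(result.is_fallback)` after every LLM call and checking `should_alert()` on a timer would be enough to drive a notification. Recording the fallback reason alongside each event would also help separate provider incidents from prompt regressions.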
These thoughts mean fallback handling is incomplete — stop:
When fallback-required is satisfied, state it like this:
Fallbacks implemented.
Timeout/API error: retry (max N, backoff Xs–Ys), then fallback_result("timeout") ✓
Malformed output: schema validation → fallback_result("malformed_output") ✓
Low confidence: threshold = X (default 0.7; 0.85 for medical/legal/financial) → fallback_result("low_confidence") ✓
Refusal: refusal pattern detection → fallback_result("refused") ✓
Tests: 4 failure-mode tests passing ✓
Fallback logging: reason field → <log destination> ✓
Alert: fallback rate > N% triggers <alert channel> ✓
All four modes required. A partially-handled call is an unhandled call.
LLM products fail differently than deterministic software. Timeouts spike under load. Output schemas break when models update. Confidence degrades on edge-case inputs. The fallback IS the product's reliability — the model is just the happy path.