From perplexity-pack
Executes Perplexity Sonar API production checklist: API config, code quality, performance caching, monitoring, cost controls for live deployments.
Install: npx claudepluginhub jeremylongshore/claude-code-plugins-plus-skills --plugin perplexity-pack
Slash command: /perplexity-pack:perplexity-prod-checklist
Complete checklist for deploying Perplexity Sonar API integrations to production. Perplexity-specific concerns: every API call performs a live web search (variable latency), citations link to third-party sites (must validate), and costs scale per-request plus per-token.
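- `PERPLEXITY_API_KEY` stored in a secret manager (not an env file)
- API key starts with `pplx-` and has credits loaded
- Base URL is `https://api.perplexity.ai` (not localhost or a proxy)
- Model chosen per use case: `sonar` for fast answers, `sonar-pro` for deep ones
- `max_tokens` set on all requests (prevents runaway costs)
- `search_domain_filter` used where appropriate (reduces search time)
- Simple queries routed to `sonar`, complex ones to `sonar-pro`
- `max_tokens` capped per endpoint

```typescript
import OpenAI from "openai";

// Assumption: the OpenAI-compatible SDK pointed at the Perplexity base URL.
const perplexity = new OpenAI({
  apiKey: process.env.PERPLEXITY_API_KEY,
  baseURL: "https://api.perplexity.ai",
});

async function searchWithFallback(query: string) {
  try {
    // Primary: sonar-pro for deep answers
    return await perplexity.chat.completions.create({
      model: "sonar-pro",
      messages: [{ role: "user", content: query }],
      max_tokens: 2048,
    });
  } catch (err: any) {
    // Retry on rate limits (429) and server errors (5xx) only
    if (err.status === 429 || err.status >= 500) {
      // Fallback: sonar for a faster, cheaper response
      return await perplexity.chat.completions.create({
        model: "sonar",
        messages: [{ role: "user", content: query }],
        max_tokens: 512,
      });
    }
    // Anything else (for example 401/402) is not recoverable here
    throw err;
  }
}
```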
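The checklist items on routing simple queries to `sonar` and capping `max_tokens` per endpoint could be sketched as a small pure helper. This is only an illustration: `routeQuery`, `isComplexQuery`, the endpoint names, the cap values, and the complexity heuristic are all hypothetical and would need tuning against real traffic.

```typescript
// Hypothetical routing helper; names, caps, and heuristic are illustrative assumptions.
type SonarModel = "sonar" | "sonar-pro";

interface RouteDecision {
  model: SonarModel;
  maxTokens: number;
}

// Assumed per-endpoint caps; adjust to your own budget.
const MAX_TOKENS_BY_ENDPOINT = new Map<string, number>([
  ["autocomplete", 256],
  ["chat", 1024],
  ["research", 2048],
]);
const DEFAULT_MAX_TOKENS = 512;

// Crude heuristic (an assumption): long or comparative questions count as complex.
function isComplexQuery(query: string): boolean {
  const wordCount = query.trim().split(/\s+/).filter(Boolean).length;
  const comparative = /\b(compare|versus|vs|analyze)\b/i.test(query);
  return wordCount > 25 || comparative;
}

// Pick the model and the max_tokens cap for one request.
function routeQuery(query: string, endpoint: string): RouteDecision {
  const maxTokens = MAX_TOKENS_BY_ENDPOINT.get(endpoint) ?? DEFAULT_MAX_TOKENS;
  const model: SonarModel = isComplexQuery(query) ? "sonar-pro" : "sonar";
  return { model, maxTokens };
}
```

The returned `model` and `maxTokens` can be passed straight into the request body, which keeps the cost policy in one place instead of scattered across call sites.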
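```typescript
// Assumes an Express `app` and the `perplexity` client are defined elsewhere.
// Each probe is a billed API call, so keep the polling interval modest.
app.get("/health/perplexity", async (req, res) => {
  const start = Date.now();
  try {
    // Cheapest possible probe: fast model, tiny completion
    const response = await perplexity.chat.completions.create({
      model: "sonar",
      messages: [{ role: "user", content: "ping" }],
      max_tokens: 5,
    });
    res.json({
      status: "healthy",
      latencyMs: Date.now() - start,
      model: response.model,
    });
  } catch (err: any) {
    // 503 tells load balancers and monitors the dependency is down
    res.status(503).json({
      status: "unhealthy",
      error: err.status || err.message,
      latencyMs: Date.now() - start,
    });
  }
});
```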
| Alert | Condition | Severity |
|---|---|---|
| API Unreachable | Health check fails 3x | P1 |
| High Error Rate | 429/5xx > 5% over 5min | P2 |
| High Latency | p95 > 15s for sonar | P2 |
| Budget Exceeded | Monthly cost > 80% cap | P2 |
| Auth Failure | Any 401/402 error | P1 |
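
The alert conditions in the table above can be expressed as a pure function over a metrics window. The sketch below is one possible reading of those thresholds: `WindowStats`, `evaluateAlerts`, and the nearest-rank `p95` helper are hypothetical names, and the collection of the underlying metrics is assumed to happen elsewhere.

```typescript
type Severity = "P1" | "P2";

interface Alert {
  name: string;
  severity: Severity;
}

// Hypothetical snapshot of the last 5 minutes plus month-to-date cost.
interface WindowStats {
  consecutiveHealthFailures: number;
  totalRequests: number;
  rateLimitedOrServerErrors: number; // 429 and 5xx responses
  authErrors: number; // 401 and 402 responses
  sonarLatenciesMs: number[];
  monthlyCostUsd: number;
  monthlyCapUsd: number;
}

// Nearest-rank 95th percentile; returns 0 for an empty sample.
function p95(values: number[]): number {
  if (values.length === 0) return 0;
  const sorted = [...values].sort((a, b) => a - b);
  const idx = Math.ceil(0.95 * sorted.length) - 1;
  return sorted[idx] ?? 0;
}

// Apply the table's thresholds in table order.
function evaluateAlerts(s: WindowStats): Alert[] {
  const alerts: Alert[] = [];
  if (s.consecutiveHealthFailures >= 3) {
    alerts.push({ name: "API Unreachable", severity: "P1" });
  }
  if (s.totalRequests > 0 && s.rateLimitedOrServerErrors / s.totalRequests > 0.05) {
    alerts.push({ name: "High Error Rate", severity: "P2" });
  }
  if (p95(s.sonarLatenciesMs) > 15000) {
    alerts.push({ name: "High Latency", severity: "P2" });
  }
  if (s.monthlyCapUsd > 0 && s.monthlyCostUsd > 0.8 * s.monthlyCapUsd) {
    alerts.push({ name: "Budget Exceeded", severity: "P2" });
  }
  if (s.authErrors > 0) {
    alerts.push({ name: "Auth Failure", severity: "P1" });
  }
  return alerts;
}
```

Keeping the evaluation pure makes the thresholds easy to unit test before wiring them into whatever alerting system is in use.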
| Issue | Cause | Solution |
|---|---|---|
| Variable latency | Web search per request | Set appropriate timeouts per model |
| Broken citations | Source pages changed | Validate citation URLs before displaying |
| Cost overrun | No model routing | Route simple queries to sonar |
| Rate limit spikes | Burst traffic | Queue requests with p-queue |
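
For the "Broken citations" row, a first validation pass can be purely structural. The sketch below assumes citations arrive as an array of URL strings; `sanitizeCitations` is a hypothetical helper that keeps only well-formed `http`/`https` links and removes duplicates. It does not check that the page is still live, which would need a network request on top.

```typescript
// Hypothetical helper: structural validation only, no network access.
function sanitizeCitations(citations: string[]): string[] {
  const seen = new Set<string>();
  const valid: string[] = [];
  for (const raw of citations) {
    let url: URL;
    try {
      url = new URL(raw.trim());
    } catch {
      continue; // not parseable as an absolute URL
    }
    // Reject schemes such as javascript: or data: before rendering links.
    if (url.protocol !== "https:" && url.protocol !== "http:") continue;
    url.hash = ""; // treat fragment-only variants as the same source
    const normalized = url.toString();
    if (!seen.has(normalized)) {
      seen.add(normalized);
      valid.push(normalized);
    }
  }
  return valid;
}
```

Running this before display avoids rendering malformed or unsafe links; a liveness check can then be limited to the smaller list that survives.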
For version upgrades, see perplexity-upgrade-migration.