From perplexity-pack
Production checklist for Perplexity Sonar API integrations, covering API configuration, code quality, performance and caching, monitoring, and cost controls for live deployments.
npx claudepluginhub jeremylongshore/claude-code-plugins-plus-skills --plugin perplexity-pack
Complete checklist for deploying Perplexity Sonar API integrations to production. Perplexity-specific concerns: every API call performs a live web search (variable latency), citations link to third-party sites (must validate), and costs scale per-request plus per-token.
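Because cost has both a per-request and a per-token component, a rough budget estimate can be sketched as follows. The `Pricing` and `Usage` shapes are hypothetical, and the rates in the usage example are made-up round numbers, not Perplexity's published prices; substitute the figures from the current pricing page.

```typescript
// Hypothetical pricing shape: field names and any rates passed in are
// assumptions for illustration, not Perplexity's published prices.
interface Pricing {
  perRequestUsd: number;       // flat fee charged for each request
  perMillionInputUsd: number;  // price per 1M prompt tokens
  perMillionOutputUsd: number; // price per 1M completion tokens
}

interface Usage {
  requests: number;
  inputTokens: number;
  outputTokens: number;
}

// Estimated cost = per-request component + per-token component.
function estimateCostUsd(usage: Usage, pricing: Pricing): number {
  const requestCost = usage.requests * pricing.perRequestUsd;
  const tokenCost =
    (usage.inputTokens / 1_000_000) * pricing.perMillionInputUsd +
    (usage.outputTokens / 1_000_000) * pricing.perMillionOutputUsd;
  return requestCost + tokenCost;
}

// Usage with placeholder rates (NOT real prices).
const placeholderPricing: Pricing = {
  perRequestUsd: 0.01,
  perMillionInputUsd: 1,
  perMillionOutputUsd: 2,
};
const monthlyEstimateUsd = estimateCostUsd(
  { requests: 1000, inputTokens: 2_000_000, outputTokens: 500_000 },
  placeholderPricing,
);
```

Keeping the rates in a single object makes it easy to re-run the estimate per model, since `sonar` and `sonar-pro` would each get their own `Pricing` entry.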
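- PERPLEXITY_API_KEY stored in a secret manager (not an env file)
- API key starts with pplx- and has credits loaded
- Base URL is https://api.perplexity.ai (not localhost/proxy)
- Model: sonar for fast, sonar-pro for deep
- max_tokens set on all requests (prevents runaway costs)
- search_domain_filter used where appropriate (reduces search time)
- Simple queries routed to sonar, complex to sonar-pro
- max_tokens capped per endpoint

// Assumption: `perplexity` is an OpenAI-compatible SDK client pointed at the
// Perplexity base URL; the key is injected at runtime by the secret manager.
import OpenAI from "openai";

const perplexity = new OpenAI({
  apiKey: process.env.PERPLEXITY_API_KEY,
  baseURL: "https://api.perplexity.ai",
});

// Try sonar-pro first; on rate limiting (429) or a server error (5xx),
// retry once with the faster, cheaper sonar model.
async function searchWithFallback(query: string) {
  try {
    // Primary: sonar-pro for deep answers
    return await perplexity.chat.completions.create({
      model: "sonar-pro",
      messages: [{ role: "user", content: query }],
      max_tokens: 2048,
    });
  } catch (err: any) {
    if (err.status === 429 || err.status >= 500) {
      // Fallback: sonar for faster, cheaper response
      return await perplexity.chat.completions.create({
        model: "sonar",
        messages: [{ role: "user", content: query }],
        max_tokens: 512,
      });
    }
    // Anything else (e.g. 401/402) is not recoverable by switching models
    throw err;
  }
}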
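The checklist items on model routing, per-endpoint `max_tokens` caps, and `search_domain_filter` can be combined in one request builder. This is a minimal sketch: the endpoint names, the cap table, and the `chooseModel` heuristic are all hypothetical and would need tuning against real traffic.

```typescript
type SonarModel = "sonar" | "sonar-pro";

interface SonarRequest {
  model: SonarModel;
  messages: { role: "user"; content: string }[];
  max_tokens: number;
  search_domain_filter?: string[];
}

// Hypothetical per-endpoint caps; adjust to your own endpoints and budget.
const MAX_TOKENS_BY_ENDPOINT: Record<string, number> = {
  "/search": 512,
  "/research": 2048,
};
const DEFAULT_MAX_TOKENS = 512;

// Naive routing heuristic (an assumption, not a Perplexity rule):
// long or analytical questions go to sonar-pro, everything else to sonar.
function chooseModel(query: string): SonarModel {
  const wordCount = query.trim().split(/\s+/).length;
  const looksComplex =
    wordCount > 30 || /\b(compare|analyze|explain why)\b/i.test(query);
  return looksComplex ? "sonar-pro" : "sonar";
}

// Builds a request body that always carries a max_tokens cap.
function buildRequest(
  endpoint: string,
  query: string,
  domains?: string[],
): SonarRequest {
  const request: SonarRequest = {
    model: chooseModel(query),
    messages: [{ role: "user", content: query }],
    max_tokens: MAX_TOKENS_BY_ENDPOINT[endpoint] ?? DEFAULT_MAX_TOKENS,
  };
  // Only restrict the search when the caller supplies domains.
  if (domains && domains.length > 0) {
    request.search_domain_filter = domains;
  }
  return request;
}
```

Routing every request through one builder means no call site can forget the token cap, which is the failure mode the checklist is guarding against.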
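// Health check endpoint. Assumes `app` is an Express application and
// `perplexity` is the configured API client. Each probe is a real, billed request.
app.get("/health/perplexity", async (req, res) => {
  const start = Date.now();
  try {
    // Smallest useful request: fast model, tiny token cap
    const response = await perplexity.chat.completions.create({
      model: "sonar",
      messages: [{ role: "user", content: "ping" }],
      max_tokens: 5,
    });
    res.json({
      status: "healthy",
      latencyMs: Date.now() - start,
      model: response.model,
    });
  } catch (err: any) {
    // 503 lets load balancers and monitors treat the dependency as down
    res.status(503).json({
      status: "unhealthy",
      error: err.status || err.message,
      latencyMs: Date.now() - start,
    });
  }
});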
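Since every API call performs a live web search and is billed per request, a health endpoint polled every few seconds can become a cost of its own. One possible mitigation is to reuse the last probe result for a short time. The `cacheFor` helper below is a hypothetical sketch, not part of any SDK; the clock is injectable so the behavior can be tested without waiting.

```typescript
// Wraps an async probe so calls within ttlMs reuse the previous result.
function cacheFor<T>(
  ttlMs: number,
  probe: () => Promise<T>,
  now: () => number = Date.now,
): () => Promise<T> {
  let cachedAt = 0;
  let cached: Promise<T> | undefined;

  return () => {
    if (cached !== undefined && now() - cachedAt < ttlMs) {
      return cached; // still fresh: no new API call
    }
    cachedAt = now();
    // Cache the promise itself so concurrent callers share one in-flight probe.
    const attempt: Promise<T> = probe().catch((err: unknown) => {
      // Drop a rejected attempt so the next call retries immediately.
      if (cached === attempt) cached = undefined;
      throw err;
    });
    cached = attempt;
    return attempt;
  };
}
```

In a health handler, the API call would be wrapped once at startup, for example `const probe = cacheFor(30_000, callPerplexity)`, where `callPerplexity` stands for whatever function issues the request. The 30-second window is an arbitrary illustration and should be shorter than the alerting interval.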
Recommended alerts:

| Alert | Condition | Severity |
|---|---|---|
| API Unreachable | Health check fails 3x | P1 |
| High Error Rate | 429/5xx > 5% over 5min | P2 |
| High Latency | p95 > 15s for sonar | P2 |
| Budget Exceeded | Monthly cost > 80% cap | P2 |
| Auth Failure | Any 401/402 error | P1 |
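The threshold-based rows of the alert table can be expressed as a small pure function. This is a sketch under stated assumptions: the `WindowStats` shape is hypothetical, the caller is expected to collect the counters over a 5-minute window, and the health-check and auth alerts are left out because they depend on event streams rather than aggregates.

```typescript
interface WindowStats {
  totalRequests: number;
  failedRequests: number;     // responses with status 429 or 5xx
  latenciesMs: number[];      // one entry per request in the window
  monthToDateCostUsd: number;
  monthlyCapUsd: number;
}

// Nearest-rank percentile; returns 0 for an empty sample.
function percentile(values: number[], p: number): number {
  if (values.length === 0) return 0;
  const sorted = values.slice().sort((a, b) => a - b);
  const rank = Math.ceil((p / 100) * sorted.length);
  return sorted[Math.max(0, rank - 1)];
}

// Returns the names of the alerts whose condition currently holds.
function firingAlerts(stats: WindowStats): string[] {
  const alerts: string[] = [];
  const errorRate =
    stats.totalRequests === 0 ? 0 : stats.failedRequests / stats.totalRequests;

  if (errorRate > 0.05) alerts.push("High Error Rate");       // 429/5xx > 5%
  if (percentile(stats.latenciesMs, 95) > 15_000) {
    alerts.push("High Latency");                               // p95 > 15s
  }
  if (stats.monthToDateCostUsd > 0.8 * stats.monthlyCapUsd) {
    alerts.push("Budget Exceeded");                            // > 80% of cap
  }
  return alerts;
}
```

Keeping the evaluation separate from metric collection makes the thresholds easy to unit test and to change without touching request code.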
Common issues and fixes:

| Issue | Cause | Solution |
|---|---|---|
| Variable latency | Web search per request | Set appropriate timeouts per model |
| Broken citations | Source pages changed | Validate citation URLs before displaying |
| Cost overrun | No model routing | Route simple queries to sonar |
| Rate limit spikes | Burst traffic | Queue requests with p-queue |

For version upgrades, see perplexity-upgrade-migration.
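The "validate citation URLs before displaying" fix from the table above can be sketched as a two-stage filter: a syntax check that needs no network, followed by a liveness probe. This assumes the response exposes citations as an array of URL strings; `UrlProbe` and both function names are hypothetical, and the probe is injected so the network strategy stays the caller's choice.

```typescript
// A probe resolves to true when the URL is considered reachable.
type UrlProbe = (url: string) => Promise<boolean>;

// Accepts only well-formed http(s) URLs; performs no network access.
function isHttpUrl(candidate: string): boolean {
  try {
    const { protocol } = new URL(candidate);
    return protocol === "http:" || protocol === "https:";
  } catch {
    return false; // not parseable as a URL
  }
}

// Deduplicates citations, drops malformed ones, then keeps only those the
// probe confirms. A probe that throws is treated as a dead link.
async function filterLiveCitations(
  citations: string[],
  probe: UrlProbe,
): Promise<string[]> {
  const unique = Array.from(new Set(citations.filter(isHttpUrl)));
  const results = await Promise.all(
    unique.map((url) => probe(url).catch(() => false)),
  );
  return unique.filter((_, index) => results[index]);
}
```

In production the probe might issue a HEAD request with a short timeout and fall back to GET for sites that reject HEAD. Given the variable latency already present in each search, running this check off the critical path or caching its results may be worth considering.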