From cohere-pack
Runs the Cohere API v2 production deployment checklist: auth, code quality, models, performance, health checks, and rollback for go-live.
npx claudepluginhub jeremylongshore/claude-code-plugins-plus-skills --plugin cohere-pack
Complete go-live checklist for deploying Cohere API v2 integrations to production with safety gates, health checks, and rollback procedures.

- [ ] `CO_API_KEY` stored in a secret manager (Vault, AWS Secrets Manager, GCP Secret Manager)
- [ ] v2 client in use (`CohereClientV2`, not `CohereClient`)
- [ ] `model` parameter set explicitly on every call
- [ ] `embeddingTypes` set for all Embed calls (required for v3+)
- [ ] `inputType` set for all Embed calls (required for v3+)
- [ ] Error handling covers `CohereError` and `CohereTimeoutError`

| Use Case | Recommended Model | Fallback |
|---|---|---|
| Chat/generation | command-a-03-2025 | command-r-plus-08-2024 |
| Lightweight chat | command-r7b-12-2024 | command-r-08-2024 |
| Embeddings | embed-v4.0 | embed-english-v3.0 |
| Reranking | rerank-v3.5 | rerank-english-v3.0 |
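
The Fallback column can be wired in as a try-then-retry wrapper. The sketch below is one possible shape and is not part of the Cohere SDK: `withModelFallback` and `MODEL_FALLBACKS` are hypothetical names, and the `call` argument stands in for whatever SDK call the application makes.

```typescript
// Hypothetical lookup: primary model -> fallback model, copied from the table above.
const MODEL_FALLBACKS: Record<string, string> = {
  'command-a-03-2025': 'command-r-plus-08-2024',
  'command-r7b-12-2024': 'command-r-08-2024',
  'embed-v4.0': 'embed-english-v3.0',
  'rerank-v3.5': 'rerank-english-v3.0',
};

// Runs `call` with the primary model. If it throws and a fallback is listed,
// retries once with the fallback model. Errors from the fallback propagate.
async function withModelFallback<T>(
  primary: string,
  call: (model: string) => Promise<T>,
): Promise<T> {
  try {
    return await call(primary);
  } catch (err) {
    const fallback = MODEL_FALLBACKS[primary];
    if (!fallback) throw err; // No fallback known: surface the original error
    console.warn(`Model ${primary} failed, retrying with ${fallback}`);
    return call(fallback);
  }
}

// Usage (assumes a CohereClientV2 instance named `cohere`):
// const res = await withModelFallback('command-a-03-2025', (model) =>
//   cohere.chat({ model, messages: [{ role: 'user', content: 'Hello' }], maxTokens: 200 }));
```

One caveat for the Embeddings row: vectors from different embedding models are generally not comparable, so a query-time fallback only makes sense if the index was built with that same fallback model.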

- [ ] Streaming enabled (`chatStream`)
- [ ] `maxTokens` set to prevent runaway generation costs

```typescript
// /api/health
import { CohereClientV2, CohereError } from 'cohere-ai';

// With no token passed, the SDK reads CO_API_KEY from the environment
const cohere = new CohereClientV2();

export async function GET() {
  const start = Date.now();
  let cohereStatus: 'healthy' | 'degraded' | 'down' = 'down';

  try {
    // Cheapest possible health check: a one-token chat on the smallest model
    await cohere.chat({
      model: 'command-r7b-12-2024',
      messages: [{ role: 'user', content: 'ping' }],
      maxTokens: 1,
    });
    cohereStatus = 'healthy';
  } catch (err) {
    if (err instanceof CohereError && err.statusCode === 429) {
      cohereStatus = 'degraded'; // Rate limited but reachable
    }
    // Any other error leaves the status as 'down'
  }

  return Response.json({
    status: cohereStatus === 'healthy' ? 'ok' : 'degraded',
    cohere: {
      status: cohereStatus,
      latencyMs: Date.now() - start,
    },
    timestamp: new Date().toISOString(),
  });
}
```
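
A health probe is most useful when it answers quickly even if the upstream call hangs. The helper below is a minimal sketch of bounding the probe with a deadline. `withTimeout` is a hypothetical name, not an SDK function, and it only stops waiting: the underlying request is not cancelled.

```typescript
// Hypothetical helper: rejects if `work` has not settled within `ms` milliseconds.
function withTimeout<T>(work: Promise<T>, ms: number): Promise<T> {
  let timer: ReturnType<typeof setTimeout> | undefined;
  const timeout = new Promise<never>((_, reject) => {
    timer = setTimeout(() => reject(new Error(`Timed out after ${ms} ms`)), ms);
  });
  // Whichever settles first wins; clear the timer so it cannot keep the process alive.
  return Promise.race([work, timeout]).finally(() => clearTimeout(timer));
}

// Usage inside the health handler (assumes the `cohere` client from the block above):
// await withTimeout(cohere.chat({ model: 'command-r7b-12-2024',
//   messages: [{ role: 'user', content: 'ping' }], maxTokens: 1 }), 3_000);
```

In the handler above, a timeout would land in the `catch` branch and leave the status as `'down'`.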
```typescript
class CohereCircuitBreaker {
  private failures = 0;
  private lastFailure = 0;
  private state: 'closed' | 'open' | 'half-open' = 'closed';

  constructor(
    private threshold = 5, // Consecutive failures before the circuit opens
    private resetMs = 60_000 // Cool-down before a trial call is allowed
  ) {}

  async call<T>(fn: () => Promise<T>, fallback?: () => T): Promise<T> {
    if (this.state === 'open') {
      if (Date.now() - this.lastFailure > this.resetMs) {
        this.state = 'half-open'; // Cool-down elapsed: let the next call through as a trial
      } else if (fallback) {
        return fallback();
      } else {
        throw new Error('Cohere circuit breaker is open');
      }
    }

    try {
      const result = await fn();
      // Any success closes the circuit and resets the failure count
      this.failures = 0;
      this.state = 'closed';
      return result;
    } catch (err) {
      this.failures++;
      this.lastFailure = Date.now();
      if (this.failures >= this.threshold) {
        this.state = 'open';
        console.error(`Cohere circuit breaker OPEN after ${this.failures} failures`);
      }
      throw err;
    }
  }
}

const breaker = new CohereCircuitBreaker();
```
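
The `breaker` instance is created above but never used, so here is a hedged sketch of how a call site might go through it. `Breaker`, `Reply`, and `guardedReply` are hypothetical names introduced for illustration, and `generate` stands in for any function that calls the Cohere API and returns text.

```typescript
// Minimal shape of the circuit breaker; only `call` is needed at the call site.
interface Breaker {
  call<T>(fn: () => Promise<T>, fallback?: () => T): Promise<T>;
}

// Hypothetical result type: `degraded` marks replies that came from the fallback.
interface Reply {
  text: string;
  degraded: boolean;
}

// Routes a text-producing call through the breaker and supplies a canned
// reply for the case where the circuit is open.
async function guardedReply(
  breaker: Breaker,
  generate: () => Promise<string>,
): Promise<Reply> {
  return breaker.call<Reply>(
    async () => ({ text: await generate(), degraded: false }),
    () => ({ text: 'The assistant is temporarily unavailable.', degraded: true }),
  );
}

// Usage (assumes `breaker` from above and a hypothetical `askCohere` function):
// const reply = await guardedReply(breaker, () => askCohere('Summarize this ticket'));
```

With the breaker as written, the fallback only applies while the circuit is open. Individual failures before the threshold is reached are still thrown to the caller, so the call site needs its own `try`/`catch` for those.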
```bash
# Pre-flight
curl -sf https://staging.example.com/api/health | jq '.cohere'
curl -s https://status.cohere.com/api/v2/status.json | jq '.status'

# Deploy with canary (10% traffic)
# Note: pausing holds the rollout partway; the actual share of traffic on new
# pods depends on the replica count and the Deployment's rollout strategy
kubectl apply -f k8s/production.yaml
kubectl rollout pause deployment/app

# Monitor for 10 minutes: error rate, latency, 429s
# Check: No increase in CohereError rate
# Check: P95 latency < 5s for chat, < 500ms for embed/rerank

# Proceed to 100%
kubectl rollout resume deployment/app
kubectl rollout status deployment/app
```

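
The two checks in the monitoring window can be expressed as a small gate function. This is a sketch under assumptions: `CanaryMetrics` and `canaryFailures` are hypothetical names, and the metric values are expected to come from whatever monitoring system is already in place.

```typescript
// Hypothetical snapshot of metrics gathered during the 10-minute canary window.
interface CanaryMetrics {
  baselineErrorRate: number; // Fraction of failed Cohere calls before the deploy
  canaryErrorRate: number; // Fraction of failed Cohere calls during the canary
  chatP95Ms: number;
  embedRerankP95Ms: number;
}

// Returns the list of failed checks; an empty list means the rollout may resume.
function canaryFailures(m: CanaryMetrics): string[] {
  const failures: string[] = [];
  if (m.canaryErrorRate > m.baselineErrorRate) {
    failures.push('CohereError rate increased');
  }
  if (m.chatP95Ms >= 5_000) {
    failures.push('chat P95 latency is not under 5 s');
  }
  if (m.embedRerankP95Ms >= 500) {
    failures.push('embed/rerank P95 latency is not under 500 ms');
  }
  return failures;
}

const verdict = canaryFailures({
  baselineErrorRate: 0.01,
  canaryErrorRate: 0.01,
  chatP95Ms: 3_200,
  embedRerankP95Ms: 310,
});
console.log(verdict.length === 0 ? 'resume rollout' : `roll back: ${verdict.join('; ')}`);
```

The strict "no increase" comparison follows the checklist wording. On low traffic it may be too noisy, in which case a small tolerance could be added.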
| Alert | Condition | Severity |
|---|---|---|
| Cohere unreachable | Health check fails 3x | P1 |
| High error rate | 5xx > 5% of requests/5min | P1 |
| Rate limited | 429 > 10/min | P2 |
| High latency | Chat P95 > 10s | P2 |
| Auth failure | Any 401 response | P1 |
| Budget exceeded | Daily token cost > threshold | P2 |
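
The alert table maps directly onto a rule function. The sketch below is illustrative only: `AlertInputs`, `Alert`, and `evaluateAlerts` are hypothetical names, the field names are assumptions, and the budget threshold is passed in because the table does not fix a value.

```typescript
type Severity = 'P1' | 'P2';

interface Alert {
  name: string;
  severity: Severity;
}

// Hypothetical metrics window; a real system would fill this from its monitoring stack.
interface AlertInputs {
  consecutiveHealthFailures: number;
  requestsLast5Min: number;
  serverErrorsLast5Min: number; // 5xx responses
  rateLimitedLastMin: number; // 429 responses
  chatP95Ms: number;
  unauthorizedCount: number; // 401 responses
  dailyTokenCost: number;
  dailyBudget: number; // Same currency unit as dailyTokenCost
}

// Applies each row of the alert table and returns the alerts that fire.
function evaluateAlerts(m: AlertInputs): Alert[] {
  const alerts: Alert[] = [];
  if (m.consecutiveHealthFailures >= 3) {
    alerts.push({ name: 'Cohere unreachable', severity: 'P1' });
  }
  if (m.requestsLast5Min > 0 && m.serverErrorsLast5Min / m.requestsLast5Min > 0.05) {
    alerts.push({ name: 'High error rate', severity: 'P1' });
  }
  if (m.rateLimitedLastMin > 10) {
    alerts.push({ name: 'Rate limited', severity: 'P2' });
  }
  if (m.chatP95Ms > 10_000) {
    alerts.push({ name: 'High latency', severity: 'P2' });
  }
  if (m.unauthorizedCount > 0) {
    alerts.push({ name: 'Auth failure', severity: 'P1' });
  }
  if (m.dailyTokenCost > m.dailyBudget) {
    alerts.push({ name: 'Budget exceeded', severity: 'P2' });
  }
  return alerts;
}
```

Keeping the rules in one pure function makes the thresholds easy to unit test before they are translated into the alerting tool's own rule syntax.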

```bash
# Immediate rollback
kubectl rollout undo deployment/app
kubectl rollout status deployment/app

# Verify rollback
curl -sf https://api.example.com/api/health | jq '.cohere'
```

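
A single `curl` right after a rollback can catch pods that are still restarting. As a rough sketch, the verification step could poll until the health endpoint reports `healthy`. `waitForHealthy` is a hypothetical helper, and `probe` is assumed to return the `cohere.status` field from the health response.

```typescript
// Polls `probe` until it reports 'healthy' or the attempts run out.
async function waitForHealthy(
  probe: () => Promise<string>,
  attempts = 10,
  delayMs = 3_000,
): Promise<boolean> {
  for (let i = 0; i < attempts; i++) {
    try {
      if ((await probe()) === 'healthy') return true;
    } catch {
      // Treat network errors like an unhealthy response and keep polling.
    }
    // Wait between attempts, but not after the last one.
    if (i < attempts - 1) {
      await new Promise<void>((resolve) => setTimeout(resolve, delayMs));
    }
  }
  return false;
}

// Usage (assumes Node 18+ global fetch and the health route shown earlier):
// const ok = await waitForHealthy(async () => {
//   const res = await fetch('https://api.example.com/api/health');
//   return (await res.json()).cohere.status;
// });
```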
For version upgrades, see cohere-upgrade-migration.