Set up comprehensive observability for Mistral AI integrations with metrics, traces, and alerts. Use when implementing monitoring for Mistral AI operations, setting up dashboards, or configuring alerting for Mistral AI integration health. Trigger with phrases like "mistral monitoring", "mistral metrics", "mistral observability", "monitor mistral", "mistral alerts", "mistral tracing".
From mistral-pack. Install: `npx claudepluginhub nickloveinvesting/nick-love-plugins --plugin mistral-pack`
Monitor Mistral AI API usage, latency, token consumption, and costs across models.
```typescript
import { Mistral } from '@mistralai/mistralai';

// USD per million tokens; keep in sync with Mistral's published pricing.
const PRICING: Record<string, { input: number; output: number }> = {
  'mistral-small-latest': { input: 0.20, output: 0.60 },
  'mistral-large-latest': { input: 2.00, output: 6.00 },
  'mistral-embed': { input: 0.10, output: 0.00 },
};

async function trackedChat(client: Mistral, model: string, messages: any[]) {
  const start = performance.now();
  try {
    const res = await client.chat.complete({ model, messages });
    const duration = performance.now() - start;
    // Fall back to the cheapest chat tier if the model is not in the table.
    const pricing = PRICING[model] ?? PRICING['mistral-small-latest'];
    const cost =
      ((res.usage?.promptTokens ?? 0) / 1e6) * pricing.input +
      ((res.usage?.completionTokens ?? 0) / 1e6) * pricing.output;
    // emitMetrics is your app's metrics pipeline (e.g. a Prometheus client);
    // the metric schema it should feed is listed below.
    emitMetrics({
      model,
      duration,
      inputTokens: res.usage?.promptTokens,
      outputTokens: res.usage?.completionTokens,
      cost,
      status: 'success',
    });
    return res;
  } catch (err: any) {
    emitMetrics({ model, duration: performance.now() - start, status: 'error', errorCode: err.status });
    throw err;
  }
}
```
```yaml
# Key metrics to expose on the /metrics endpoint
mistral_requests_total:      { type: counter,   labels: [model, status, endpoint] }
mistral_request_duration_ms: { type: histogram, labels: [model], buckets: [100, 250, 500, 1000, 2500, 5000] }
mistral_tokens_total:        { type: counter,   labels: [model, direction] }  # direction: input|output
mistral_cost_usd_total:      { type: counter,   labels: [model] }
mistral_errors_total:        { type: counter,   labels: [model, status_code] }
```
```yaml
# prometheus/mistral-alerts.yaml
groups:
  - name: mistral
    rules:
      - alert: MistralHighErrorRate
        expr: rate(mistral_errors_total[5m]) / rate(mistral_requests_total[5m]) > 0.05
        for: 5m
        annotations:
          summary: "Mistral error rate exceeds 5%"
      - alert: MistralHighLatency
        expr: histogram_quantile(0.95, rate(mistral_request_duration_ms_bucket[5m])) > 5000  # 5 seconds in ms
        for: 5m
        annotations:
          summary: "Mistral P95 latency exceeds 5 seconds"
      - alert: MistralCostSpike
        expr: increase(mistral_cost_usd_total[1h]) > 10
        annotations:
          summary: "Mistral spend exceeds $10/hour"
```
Create panels for: request rate by model, p50/p95/p99 latency, token consumption by direction, hourly cost, and error rate. Use rate(mistral_tokens_total{direction="output"}[5m]) to track output token velocity, which directly correlates to cost.
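The panels above map to PromQL roughly as follows; metric names come from the schema in this document, but the exact aggregations will depend on your label usage.

```promql
# Request rate by model
sum by (model) (rate(mistral_requests_total[5m]))
# P95 latency per model
histogram_quantile(0.95, sum by (model, le) (rate(mistral_request_duration_ms_bucket[5m])))
# Output token velocity (cost proxy)
rate(mistral_tokens_total{direction="output"}[5m])
# Hourly cost by model
sum by (model) (increase(mistral_cost_usd_total[1h]))
# Overall error rate
sum(rate(mistral_errors_total[5m])) / sum(rate(mistral_requests_total[5m]))
```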
```json
{"ts":"2026-03-10T14:30:00Z","model":"mistral-small-latest","op":"chat.complete","duration_ms":342,"input_tokens":128,"output_tokens":256,"cost_usd":0.00018,"status":"success","request_id":"req_abc123"}
```
Ship structured logs to your SIEM for correlation with business metrics.
| Issue | Cause | Solution |
|---|---|---|
| Missing token counts | Streaming response not aggregated | Accumulate tokens from stream chunks |
| Cost drift from actual bill | Pricing table outdated | Update PRICING map when Mistral changes rates |
| Alert storm on 429s | Rate limit hit during burst | Tune alert threshold, add request queuing |
| High cardinality metrics | Too many label combinations | Avoid per-request-id labels |
Basic usage: wrap Mistral client calls with trackedChat, expose the metric schema above on /metrics, and load the Prometheus alert rules with their default thresholds.
Advanced scenario: in production, tune alert thresholds to your traffic (especially around 429 bursts), keep label cardinality bounded, and periodically reconcile mistral_cost_usd_total against your Mistral invoice to catch pricing drift.