Set up comprehensive observability for Mistral AI integrations with metrics, traces, and alerts. Use when implementing monitoring for Mistral AI operations, setting up dashboards, or configuring alerting for Mistral AI integration health. Trigger with phrases like "mistral monitoring", "mistral metrics", "mistral observability", "monitor mistral", "mistral alerts", "mistral tracing".
From mistral-pack. Install: `npx claudepluginhub nickloveinvesting/nick-love-plugins --plugin mistral-pack`
Monitor Mistral AI API usage, latency, token consumption, and costs across models.
```typescript
import { Mistral } from '@mistralai/mistralai';

// USD per million tokens; keep in sync with Mistral's published pricing.
const PRICING: Record<string, { input: number; output: number }> = {
  'mistral-small-latest': { input: 0.20, output: 0.60 },
  'mistral-large-latest': { input: 2.00, output: 6.00 },
  'mistral-embed': { input: 0.10, output: 0.00 },
};

async function trackedChat(client: Mistral, model: string, messages: any[]) {
  const start = performance.now();
  try {
    const res = await client.chat.complete({ model, messages });
    const duration = performance.now() - start;
    // Fall back to the cheapest chat tier if the model is not in the table.
    const pricing = PRICING[model] ?? PRICING['mistral-small-latest'];
    const cost =
      ((res.usage?.promptTokens ?? 0) / 1e6) * pricing.input +
      ((res.usage?.completionTokens ?? 0) / 1e6) * pricing.output;
    // emitMetrics is your app's metrics pipeline (e.g. a Prometheus client);
    // the metric schema it should feed is listed below.
    emitMetrics({
      model,
      duration,
      inputTokens: res.usage?.promptTokens,
      outputTokens: res.usage?.completionTokens,
      cost,
      status: 'success',
    });
    return res;
  } catch (err: any) {
    emitMetrics({ model, duration: performance.now() - start, status: 'error', errorCode: err.status });
    throw err;
  }
}
```
```yaml
# Key metrics to expose on the /metrics endpoint
mistral_requests_total:      { type: counter,   labels: [model, status, endpoint] }
mistral_request_duration_ms: { type: histogram, labels: [model], buckets: [100, 250, 500, 1000, 2500, 5000] }
mistral_tokens_total:        { type: counter,   labels: [model, direction] }  # direction: input|output
mistral_cost_usd_total:      { type: counter,   labels: [model] }
mistral_errors_total:        { type: counter,   labels: [model, status_code] }
```
```yaml
# prometheus/mistral-alerts.yaml
groups:
  - name: mistral
    rules:
      - alert: MistralHighErrorRate
        expr: rate(mistral_errors_total[5m]) / rate(mistral_requests_total[5m]) > 0.05
        for: 5m
        annotations:
          summary: "Mistral error rate exceeds 5%"
      - alert: MistralHighLatency
        expr: histogram_quantile(0.95, rate(mistral_request_duration_ms_bucket[5m])) > 5000  # 5 seconds in ms
        for: 5m
        annotations:
          summary: "Mistral P95 latency exceeds 5 seconds"
      - alert: MistralCostSpike
        expr: increase(mistral_cost_usd_total[1h]) > 10
        annotations:
          summary: "Mistral spend exceeds $10/hour"
```
Create panels for: request rate by model, p50/p95/p99 latency, token consumption by direction, hourly cost, and error rate. Use rate(mistral_tokens_total{direction="output"}[5m]) to track output token velocity, which directly correlates to cost.
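The panels above map to PromQL roughly as follows; metric names come from the schema in this document, but the exact aggregations will depend on your label usage.

```promql
# Request rate by model
sum by (model) (rate(mistral_requests_total[5m]))
# P95 latency per model
histogram_quantile(0.95, sum by (model, le) (rate(mistral_request_duration_ms_bucket[5m])))
# Output token velocity (cost proxy)
rate(mistral_tokens_total{direction="output"}[5m])
# Hourly cost by model
sum by (model) (increase(mistral_cost_usd_total[1h]))
# Overall error rate
sum(rate(mistral_errors_total[5m])) / sum(rate(mistral_requests_total[5m]))
```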
```json
{"ts":"2026-03-10T14:30:00Z","model":"mistral-small-latest","op":"chat.complete","duration_ms":342,"input_tokens":128,"output_tokens":256,"cost_usd":0.00018,"status":"success","request_id":"req_abc123"}
```
Ship structured logs to your SIEM for correlation with business metrics.
| Issue | Cause | Solution |
|---|---|---|
| Missing token counts | Streaming response not aggregated | Accumulate tokens from stream chunks |
| Cost drift from actual bill | Pricing table outdated | Update PRICING map when Mistral changes rates |
| Alert storm on 429s | Rate limit hit during burst | Tune alert threshold, add request queuing |
| High cardinality metrics | Too many label combinations | Avoid per-request-id labels |
Basic usage: wrap Mistral client calls with trackedChat, expose the metric schema above on /metrics, and load the Prometheus alert rules with their default thresholds.
Advanced scenario: in production, tune alert thresholds to your traffic (especially around 429 bursts), keep label cardinality bounded, and periodically reconcile mistral_cost_usd_total against your Mistral invoice to catch pricing drift.