From mistral-pack
Deploys Mistral AI apps to Vercel Edge/Serverless, Docker, and Cloud Run with secret configuration and code examples for production.
Install with:

```bash
npx claudepluginhub jeremylongshore/claude-code-plugins-plus-skills --plugin mistral-pack
```
Deploy Mistral AI-powered applications to production with secure API key management. Covers Vercel (Edge + Serverless), Docker, Cloud Run, and self-hosted vLLM deployments. All connect to `api.mistral.ai` or your own inference endpoint.
All examples use the `@mistralai/mistralai` SDK and read credentials from environment variables. Configure the secret on each platform:

```bash
set -euo pipefail
# Vercel
vercel env add MISTRAL_API_KEY production
vercel env add MISTRAL_MODEL production # optional: default model
# Cloud Run
echo -n "your-key" | gcloud secrets create mistral-api-key --data-file=-
# Docker
echo "MISTRAL_API_KEY=your-key" > .env.production
echo ".env.production" >> .gitignore
```typescript
// api/chat.ts — Vercel Edge Function with streaming
import { Mistral } from '@mistralai/mistralai';

export const config = { runtime: 'edge' };

export default async function handler(req: Request) {
  const client = new Mistral({ apiKey: process.env.MISTRAL_API_KEY! });
  const { messages, stream = false } = await req.json();

  if (stream) {
    const streamResponse = await client.chat.stream({
      model: process.env.MISTRAL_MODEL ?? 'mistral-small-latest',
      messages,
    });

    const encoder = new TextEncoder();
    const readable = new ReadableStream({
      async start(controller) {
        for await (const event of streamResponse) {
          const content = event.data?.choices?.[0]?.delta?.content;
          if (content) {
            controller.enqueue(encoder.encode(`data: ${JSON.stringify({ content })}\n\n`));
          }
        }
        controller.enqueue(encoder.encode('data: [DONE]\n\n'));
        controller.close();
      },
    });

    return new Response(readable, {
      headers: {
        'Content-Type': 'text/event-stream',
        'Cache-Control': 'no-cache',
      },
    });
  }

  const response = await client.chat.complete({
    model: process.env.MISTRAL_MODEL ?? 'mistral-small-latest',
    messages,
  });
  return Response.json(response);
}
```
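On the client side, the streamed response can be consumed by reading the SSE-framed body. A minimal sketch, assuming the route above is mounted at `/api/chat` and using the `data: {...}` / `data: [DONE]` framing the handler emits:

```typescript
// Read the SSE stream produced by api/chat.ts (sketch, not a definitive client)
async function streamChat(messages: { role: string; content: string }[]): Promise<string> {
  const res = await fetch('/api/chat', {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({ messages, stream: true }),
  });
  if (!res.ok || !res.body) throw new Error(`Chat request failed: ${res.status}`);

  const reader = res.body.getReader();
  const decoder = new TextDecoder();
  let buffer = '';
  let full = '';

  while (true) {
    const { done, value } = await reader.read();
    if (done) break;
    buffer += decoder.decode(value, { stream: true });

    // SSE events are separated by a blank line
    const events = buffer.split('\n\n');
    buffer = events.pop() ?? '';
    for (const event of events) {
      const data = event.replace(/^data: /, '').trim();
      if (!data || data === '[DONE]') continue;
      full += JSON.parse(data).content; // { content } as emitted by the handler
    }
  }
  return full;
}
```

For container platforms such as Cloud Run or a plain Docker host, a multi-stage image keeps the runtime layer small: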
```dockerfile
FROM node:20-slim AS builder
WORKDIR /app
COPY package*.json ./
RUN npm ci --production=false
COPY . .
RUN npm run build

FROM node:20-slim
WORKDIR /app
COPY --from=builder /app/dist ./dist
COPY --from=builder /app/node_modules ./node_modules
COPY --from=builder /app/package.json ./
ENV NODE_ENV=production
EXPOSE 3000
# node:20-slim ships without curl; use Node's built-in fetch for the probe
HEALTHCHECK --interval=30s --timeout=5s \
  CMD node -e "fetch('http://localhost:3000/health').then(r => process.exit(r.ok ? 0 : 1)).catch(() => process.exit(1))"
CMD ["node", "dist/index.js"]
```
```bash
set -euo pipefail

docker build -t mistral-app .

docker run -d --name mistral-app \
  -p 3000:3000 \
  -e MISTRAL_API_KEY="$MISTRAL_API_KEY" \
  -e MISTRAL_MODEL="mistral-small-latest" \
  mistral-app
```
```bash
set -euo pipefail

# Build and push
gcloud builds submit --tag gcr.io/$PROJECT_ID/mistral-app

# Deploy with secret injection
gcloud run deploy mistral-service \
  --image gcr.io/$PROJECT_ID/mistral-app \
  --region us-central1 \
  --platform managed \
  --set-secrets=MISTRAL_API_KEY=mistral-api-key:latest \
  --set-env-vars=MISTRAL_MODEL=mistral-small-latest \
  --min-instances=1 \
  --max-instances=10 \
  --memory=512Mi \
  --timeout=60s
```
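If the deploy fails with a permission error on the secret, grant the Cloud Run runtime service account the Secret Manager Secret Accessor role (`roles/secretmanager.secretAccessor`) on `mistral-api-key`; `--set-secrets` only injects secrets that the service account is allowed to read.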
For data sovereignty or latency requirements, self-host open-weight Mistral models:
```bash
set -euo pipefail

# Serve Mistral with vLLM (OpenAI-compatible API)
docker run --runtime nvidia --gpus all \
  -v ~/.cache/huggingface:/root/.cache/huggingface \
  -p 8000:8000 \
  -e HF_TOKEN="$HF_TOKEN" \
  vllm/vllm-openai:latest \
  --model mistralai/Mistral-Small-24B-Instruct-2501 \
  --dtype auto \
  --api-key "your-local-key"
```
Point the SDK at your local endpoint:
```typescript
import { Mistral } from '@mistralai/mistralai';

const client = new Mistral({
  apiKey: 'your-local-key',
  serverURL: 'http://localhost:8000', // vLLM endpoint
});
```
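Requests to the self-hosted endpoint must use the model id vLLM is serving (by default, the value passed to `--model` above) rather than a hosted `-latest` alias. A short usage sketch, continuing from the client configured above:

```typescript
// The model id must match vLLM's --model flag (or --served-model-name, if set)
const response = await client.chat.complete({
  model: 'mistralai/Mistral-Small-24B-Instruct-2501',
  messages: [{ role: 'user', content: 'Say hello from the self-hosted model.' }],
});

console.log(response.choices?.[0]?.message?.content);
```

However the app is hosted, a lightweight health-check route helps verify that the deployment can actually reach the Mistral API: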
```typescript
import { Mistral } from '@mistralai/mistralai';

export async function GET() {
  const start = performance.now();
  try {
    const client = new Mistral({ apiKey: process.env.MISTRAL_API_KEY! });
    await client.models.list();
    return Response.json({
      status: 'healthy',
      provider: 'mistral',
      latencyMs: Math.round(performance.now() - start),
    });
  } catch (error: any) {
    return Response.json(
      { status: 'unhealthy', error: error.message },
      { status: 503 },
    );
  }
}
```
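This route can back the Docker `HEALTHCHECK` above and platform uptime probes; just make sure the probed path matches where your framework mounts the handler (for example, a Next.js App Router file at `app/api/health/route.ts` serves `/api/health`, not `/health`).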
| Issue | Cause | Solution |
|---|---|---|
| API key not found | Missing env/secret | Verify secret config on platform |
| Function timeout | Long completion | Increase timeout, use streaming |
| Cold start latency | Serverless spin-up | Set min-instances=1 or use edge |
| vLLM OOM | Model too large for GPU | Use quantized model or smaller variant |