Deploy Groq integrations to Vercel, Fly.io, and Cloud Run platforms. Use when deploying Groq-powered applications to production, configuring platform-specific secrets, or setting up deployment pipelines. Trigger with phrases like "deploy groq", "groq Vercel", "groq production deploy", "groq Cloud Run", "groq Fly.io".
From groq-pack. Install with: `npx claudepluginhub nickloveinvesting/nick-love-plugins --plugin groq-pack`
Deploy applications powered by Groq's ultra-fast LLM inference API (api.groq.com). Groq's sub-second latency makes it ideal for real-time applications.
Prerequisites: the `GROQ_API_KEY` environment variable and the `groq-sdk` package.

# Vercel (Edge-compatible)
vercel env add GROQ_API_KEY production
# Cloud Run
echo -n "your-key" | gcloud secrets create groq-api-key --data-file=-
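Fly.io (also covered by this guide) stores the key as an encrypted app secret; setting it triggers a restart with the new value:

```shell
# Fly.io
fly secrets set GROQ_API_KEY=your-key
```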
// api/chat.ts - Ultra-low latency with Groq + Vercel Edge
import Groq from "groq-sdk";

export const config = { runtime: "edge" };

export default async function handler(req: Request) {
  const groq = new Groq({ apiKey: process.env.GROQ_API_KEY! });
  const { messages, stream } = await req.json();

  if (stream) {
    const completion = await groq.chat.completions.create({
      model: "llama-3.3-70b-versatile",
      messages,
      stream: true,
    });
    const encoder = new TextEncoder();
    const readable = new ReadableStream({
      async start(controller) {
        for await (const chunk of completion) {
          const content = chunk.choices[0]?.delta?.content;
          if (content) {
            controller.enqueue(encoder.encode(`data: ${JSON.stringify({ content })}\n\n`));
          }
        }
        controller.enqueue(encoder.encode("data: [DONE]\n\n"));
        controller.close();
      },
    });
    return new Response(readable, {
      headers: { "Content-Type": "text/event-stream" },
    });
  }

  const completion = await groq.chat.completions.create({
    model: "llama-3.3-70b-versatile",
    messages,
  });
  return Response.json(completion);
}
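A client consumes the stream above by splitting the response body on blank lines and stopping at the `[DONE]` sentinel. The sketch below separates the frame parsing (pure, testable) from the fetch loop; the `/api/chat` path matches the handler file above:

```typescript
// Parse raw SSE text into the content strings emitted by the handler,
// stopping at the [DONE] sentinel.
function parseSSE(text: string): string[] {
  const contents: string[] = [];
  for (const frame of text.split("\n\n")) {
    const line = frame.trim();
    if (!line.startsWith("data: ")) continue;
    const payload = line.slice("data: ".length);
    if (payload === "[DONE]") break;
    contents.push(JSON.parse(payload).content);
  }
  return contents;
}

// Usage with fetch (browser or Node 18+): read the body incrementally.
async function streamChat(messages: { role: string; content: string }[]): Promise<string> {
  const res = await fetch("/api/chat", {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ messages, stream: true }),
  });
  const reader = res.body!.getReader();
  const decoder = new TextDecoder();
  let buffer = "";
  while (true) {
    const { done, value } = await reader.read();
    if (done) break;
    buffer += decoder.decode(value, { stream: true });
    // A production client would parse complete frames as they arrive
    // instead of buffering the whole response.
  }
  return parseSSE(buffer).join("");
}
```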
FROM node:20-slim
WORKDIR /app
COPY package*.json ./
# Install all dependencies; the build step typically needs devDependencies
RUN npm ci
COPY . .
RUN npm run build
# Drop devDependencies after building to keep the image small
RUN npm prune --omit=dev
EXPOSE 3000
CMD ["node", "dist/index.js"]
gcloud run deploy groq-api \
--source . \
--region us-central1 \
--set-secrets=GROQ_API_KEY=groq-api-key:latest \
--set-env-vars=GROQ_MODEL=llama-3.3-70b-versatile \
--min-instances=1 \
--cpu=1 --memory=512Mi
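The same Dockerfile deploys to Fly.io. This is a sketch assuming the `GROQ_API_KEY` secret has already been set with `fly secrets set`, and the app name `groq-api` is an assumption:

```shell
# Fly.io: create the app from the Dockerfile, then deploy
fly launch --name groq-api --no-deploy
fly deploy
```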
import Groq from "groq-sdk";

export async function GET() {
  try {
    const groq = new Groq({ apiKey: process.env.GROQ_API_KEY! });
    await groq.chat.completions.create({
      model: "llama-3.1-8b-instant",
      messages: [{ role: "user", content: "ping" }],
      max_tokens: 1,
    });
    return Response.json({ status: "healthy" });
  } catch {
    return Response.json({ status: "unhealthy" }, { status: 503 }); // HTTP 503 Service Unavailable
  }
}
| Issue | Cause | Solution |
|---|---|---|
| Rate limited (429) | Too many requests | Implement request queuing |
| Model unavailable | Capacity constraint | Fall back to smaller model |
| Edge timeout | Long completion | Use streaming for long responses |
| API key invalid | Key expired | Regenerate at console.groq.com |
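The "fall back to smaller model" row above can be implemented as a try-in-order loop. This is a minimal sketch: the model names come from the examples in this guide, and in production the `chat` callback would wrap `groq.chat.completions.create` and only fall back on 429/capacity errors rather than on every failure:

```typescript
// Try each model in order; fall back to the next one on failure.
type ChatFn = (model: string) => Promise<string>;

async function completeWithFallback(models: string[], chat: ChatFn): Promise<string> {
  let lastError: unknown;
  for (const model of models) {
    try {
      return await chat(model);
    } catch (err) {
      lastError = err; // e.g. 429 or capacity error: try the next, smaller model
    }
  }
  throw lastError;
}
```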
Basic usage: deploy with the defaults shown above — set GROQ_API_KEY as a platform secret, pick a model, and run the platform's deploy command.
Advanced scenario: for production, add streaming responses, a health-check endpoint, request queuing for 429s, and model fallbacks as covered in the troubleshooting table.
For multi-environment setup, see groq-multi-env-setup.