groq-common-errors | groq-pack

Groq Common Errors

Overview

Reference for Groq API error codes, their root causes, and fixes. Groq returns standard HTTP status codes with a structured JSON error body, and its responses carry rate-limit headers.

Error Response Format

{
  "error": {
    "message": "Rate limit reached for model `llama-3.3-70b-versatile`...",
    "type": "tokens",
    "code": "rate_limit_exceeded"
  }
}
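A minimal Python sketch of reading this body, assuming the response is JSON shaped like the example above. `GroqErrorInfo` and `parse_groq_error` are hypothetical names used for illustration; they are not part of any Groq SDK.

```python
import json
from typing import NamedTuple, Optional


class GroqErrorInfo(NamedTuple):
    """Hypothetical container for the fields shown in the error body above."""
    message: str
    type: Optional[str]
    code: Optional[str]


def parse_groq_error(body: str) -> Optional[GroqErrorInfo]:
    """Return the error fields, or None if the body is not an error payload."""
    try:
        payload = json.loads(body)
    except json.JSONDecodeError:
        return None  # e.g. an HTML error page returned by a proxy
    error = payload.get("error") if isinstance(payload, dict) else None
    if not isinstance(error, dict):
        return None
    return GroqErrorInfo(
        message=str(error.get("message", "")),
        type=error.get("type"),
        code=error.get("code"),
    )
```

Branching on `code` (for example `rate_limit_exceeded`) is usually more robust than matching on the human-readable `message`.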

Quick Diagnostic

set -euo pipefail
# 1. Verify API key is valid
curl -s https://api.groq.com/openai/v1/models \
  -H "Authorization: Bearer $GROQ_API_KEY" | jq '.data | length'

# 2. Check specific model availability
curl -s https://api.groq.com/openai/v1/models \
  -H "Authorization: Bearer $GROQ_API_KEY" | jq '.data[].id' | sort

# 3. Test a minimal completion
curl -s https://api.groq.com/openai/v1/chat/completions \
  -H "Authorization: Bearer $GROQ_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"model":"llama-3.1-8b-instant","messages":[{"role":"user","content":"ping"}],"max_tokens":5}' | jq .

Error Reference

401 — Authentication Error

Authentication error: Invalid API key provided

Causes: Key missing, revoked, or malformed.

Fix:

# Verify key is set and starts with gsk_
echo "${GROQ_API_KEY:0:4}"  # Should print "gsk_"

# Test key directly
curl -s -o /dev/null -w "%{http_code}" \
  https://api.groq.com/openai/v1/models \
  -H "Authorization: Bearer $GROQ_API_KEY"
# Should return 200
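The same local checks can be sketched in Python. This only covers the "missing" and "malformed" causes; a revoked key can only be detected by calling the API. `check_groq_key` is a hypothetical helper, and the only format rule it assumes is the `gsk_` prefix mentioned above.

```python
import os


def check_groq_key(env_var: str = "GROQ_API_KEY") -> str:
    """Classify the key found in the environment without ever printing it."""
    key = os.environ.get(env_var)
    if key is None or key == "":
        return "missing"
    # Stray whitespace or newlines often sneak in when a key is pasted into a .env file.
    if key != key.strip():
        return "malformed: leading or trailing whitespace"
    if not key.startswith("gsk_"):
        return "malformed: does not start with gsk_"
    return "looks ok"
```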

429 — Rate Limit Exceeded

Rate limit reached for model `llama-3.3-70b-versatile` in organization `org_xxx`
on tokens per minute (TPM): Limit 6000, Used 5800, Requested 500.

Causes: RPM (requests/min), TPM (tokens/min), or RPD (requests/day) limit hit.
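The numbers in a message like the one above show how far over the limit a request was: 5800 used plus 500 requested is 6300, which exceeds the 6000 limit by 300 tokens. A small sketch of that arithmetic, assuming the message keeps the `Limit N, Used N, Requested N` wording shown here (the helper name `tokens_over_limit` is hypothetical):

```python
import re
from typing import Optional

# Matches the "Limit 6000, Used 5800, Requested 500" fragment of the message.
_LIMIT_PATTERN = re.compile(r"Limit (\d+), Used (\d+), Requested (\d+)")


def tokens_over_limit(message: str) -> Optional[int]:
    """Return used + requested - limit, or None if the message does not match."""
    match = _LIMIT_PATTERN.search(message)
    if match is None:
        return None
    limit, used, requested = (int(group) for group in match.groups())
    return used + requested - limit
```

A result larger than the limit itself would mean the request can never fit in one window and needs to be made smaller rather than retried.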

Rate limit headers returned:

Header	Description
`retry-after`	Seconds to wait before retrying; sent only on 429 responses
`x-ratelimit-limit-requests`	Request limit, counted per day (RPD)
`x-ratelimit-limit-tokens`	Token limit, counted per minute (TPM)
`x-ratelimit-remaining-requests`	Requests remaining in the current day
`x-ratelimit-remaining-tokens`	Tokens remaining in the current minute
`x-ratelimit-reset-requests`	Time until the request limit resets
`x-ratelimit-reset-tokens`	Time until the token limit resets
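The two reset headers are durations rather than plain numbers. The sketch below assumes they arrive as strings such as `2m59.56s` or `7.66s`; that format is an assumption to verify against real responses, and `parse_reset_duration` is a hypothetical helper.

```python
import re

_UNIT_SECONDS = {"h": 3600.0, "m": 60.0, "s": 1.0, "ms": 0.001}
# "ms" is listed before "m" so that "500ms" is not read as 500 minutes.
_PART = re.compile(r"(\d+(?:\.\d+)?)(ms|h|m|s)")


def parse_reset_duration(value: str) -> float:
    """Convert a duration such as '2m59.56s' or '7.66s' to seconds."""
    text = value.strip()
    parts = _PART.findall(text)
    # Reject input that is not made up entirely of number+unit pairs.
    if not parts or "".join(number + unit for number, unit in parts) != text:
        raise ValueError(f"unrecognised duration: {value!r}")
    return sum(float(number) * _UNIT_SECONDS[unit] for number, unit in parts)
```

Sleeping for the parsed `x-ratelimit-reset-tokens` value before a large request is one way to avoid the 429 in the first place.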

Fix:

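import Groq from "groq-sdk";

// Run fn once; on a 429, wait for the server-suggested delay and retry one time.
async function handleRateLimit<T>(fn: () => Promise<T>): Promise<T> {
  try {
    return await fn();
  } catch (err) {
    if (err instanceof Groq.APIError && err.status === 429) {
      // Fall back to 10s if retry-after is missing or not a number.
      const parsed = Number.parseFloat(err.headers?.["retry-after"] ?? "");
      const retryAfter = Number.isFinite(parsed) ? parsed : 10;
      console.warn(`Rate limited. Waiting ${retryAfter}s...`);
      await new Promise((r) => setTimeout(r, retryAfter * 1000));
      return fn(); // Single retry; a second 429 propagates to the caller
    }
    throw err;
  }
}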

400 — Bad Request

Invalid parameter: model 'mixtral-8x7b-32768' is not available

Causes: Deprecated model ID, invalid parameters, or schema violation.

Common deprecated model IDs:

Deprecated	Replacement
`mixtral-8x7b-32768`	`llama-3.1-8b-instant` or `llama-3.3-70b-versatile`
`gemma2-9b-it`	`llama-3.1-8b-instant`
`llama-3.1-70b-versatile`	`llama-3.3-70b-versatile`

Fix: Check current models at console.groq.com/docs/models or call GET /openai/v1/models.
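One way to apply this fix in code is to resolve a configured model ID against the live model list before sending a request. The sketch below copies the replacement table above into a dict (picking one replacement where the table offers two) and assumes `available` holds the IDs returned by GET /openai/v1/models. `resolve_model` is a hypothetical helper.

```python
from typing import AbstractSet

# Copied from the table above; extend it as more models are retired.
DEPRECATED_MODELS = {
    "mixtral-8x7b-32768": "llama-3.3-70b-versatile",
    "gemma2-9b-it": "llama-3.1-8b-instant",
    "llama-3.1-70b-versatile": "llama-3.3-70b-versatile",
}


def resolve_model(model_id: str, available: AbstractSet[str]) -> str:
    """Return a usable model ID, swapping in a replacement for deprecated ones."""
    if model_id in available:
        return model_id
    replacement = DEPRECATED_MODELS.get(model_id)
    if replacement is not None and replacement in available:
        return replacement
    raise ValueError(f"model {model_id!r} is not available and has no known replacement")
```

Failing loudly on an unknown ID keeps a typo from silently switching models.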

413 — Request Too Large

Maximum context length is 131072 tokens. However, your messages resulted in 140000 tokens.

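Fix: Reduce prompt size or split into smaller requests. The 131072-token limit in the message above is a 128K context window (128 × 1024 tokens), which is what the Llama models named in this guide provide; confirm the limit for any other model in the models list.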
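Reducing prompt size in a chat application often means dropping the oldest turns. The sketch below uses a rough 4-characters-per-token estimate, which is an assumption and not the model's tokenizer, so leave headroom below the real limit. `estimate_tokens` and `trim_history` are hypothetical helpers.

```python
from typing import Dict, List

CHARS_PER_TOKEN = 4  # rough heuristic, NOT the model's real tokenizer

Message = Dict[str, str]


def estimate_tokens(messages: List[Message]) -> int:
    """Approximate the token count of a list of chat messages."""
    return sum(len(m["content"]) // CHARS_PER_TOKEN + 1 for m in messages)


def trim_history(messages: List[Message], max_tokens: int) -> List[Message]:
    """Drop the oldest non-system messages until the estimate fits the budget."""
    system = [m for m in messages if m["role"] == "system"]
    rest = [m for m in messages if m["role"] != "system"]
    # Always keep the most recent message, even if it alone exceeds the budget.
    while len(rest) > 1 and estimate_tokens(system + rest) > max_tokens:
        rest.pop(0)
    return system + rest
```

System messages are kept and placed first because they usually carry instructions the model needs on every turn.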

500 / 503 — Server Errors

Internal server error / Service temporarily unavailable

Causes: Groq infrastructure issue, model overloaded.

Fix: Retry with backoff, fall back to a different model, check status.groq.com.
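Retrying with backoff and falling back to another model can be combined in one wrapper. This is a sketch under stated assumptions: `TransientError` is a hypothetical stand-in for whatever exception your client raises on a 500 or 503, and `call` is any function that takes a model ID and performs the request.

```python
import random
import time
from typing import Callable, Optional, Sequence, TypeVar

T = TypeVar("T")


class TransientError(Exception):
    """Hypothetical stand-in for a 500/503 response from the API."""


def call_with_fallback(
    call: Callable[[str], T],
    models: Sequence[str],
    attempts_per_model: int = 3,
    base_delay: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Try each model in order, retrying transient failures with exponential backoff."""
    last_error: Optional[Exception] = None
    for model in models:
        for attempt in range(attempts_per_model):
            try:
                return call(model)
            except TransientError as err:
                last_error = err
                if attempt < attempts_per_model - 1:
                    # Delay doubles each attempt (0.5s, 1s, 2s, ...) plus up to 25% jitter.
                    delay = base_delay * (2 ** attempt)
                    sleep(delay + random.uniform(0, delay * 0.25))
    raise RuntimeError("all models failed") from last_error
```

Passing `sleep` as a parameter keeps the wrapper testable without real waiting; the jitter spreads out retries from many clients that failed at the same moment.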

SDK-Specific Errors

TypeScript:

import Groq from "groq-sdk";

const groq = new Groq(); // reads GROQ_API_KEY from the environment

try {
  await groq.chat.completions.create({ /* ... */ });
} catch (err) {
  // Check subclasses first: all of these extend Groq.APIError,
  // so testing the base class first would shadow the branches below it.
  if (err instanceof Groq.RateLimitError) {
    console.error("Rate limited:", err.message);
  } else if (err instanceof Groq.AuthenticationError) {
    console.error("Auth failed:", err.message);
  } else if (err instanceof Groq.APIConnectionError) {
    console.error("Network error:", err.message);
  } else if (err instanceof Groq.APIError) {
    console.error(`Status: ${err.status}, Message: ${err.message}`);
  } else {
    throw err; // not an SDK error
  }
}

Python:

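from groq import Groq, APIConnectionError, APIStatusError, AuthenticationError, RateLimitError

client = Groq()  # reads GROQ_API_KEY from the environment

try:
    client.chat.completions.create(
        model="llama-3.1-8b-instant",
        messages=[{"role": "user", "content": "ping"}],
    )
except RateLimitError as e:
    print(f"Rate limited: {e.message}")
except AuthenticationError as e:
    print(f"Auth error: {e.message}")
except APIConnectionError as e:  # no HTTP response, so no status_code
    print(f"Network error: {e.message}")
except APIStatusError as e:  # any other non-2xx response
    print(f"API error {e.status_code}: {e.message}")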

Escalation Path

1. Check status.groq.com for ongoing incidents.
2. Collect the request ID from the error response (x-request-id header).
3. Run the groq-debug-bundle skill to gather diagnostics.
4. Contact Groq support with the request ID and debug bundle.

Next Steps

For comprehensive debugging, see groq-debug-bundle.