From groq-pack
Diagnoses and fixes Groq API errors like 401 authentication and 429 rate limits using curl diagnostics, rate limit headers, and Bash/TypeScript solutions.
Slash command: /groq-pack:groq-common-errors
Comprehensive reference for Groq API error codes, their root causes, and proven fixes. Groq returns standard HTTP status codes with structured error bodies and rate-limit headers.
{
  "error": {
    "message": "Rate limit reached for model `llama-3.3-70b-versatile`...",
    "type": "tokens",
    "code": "rate_limit_exceeded"
  }
}
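The error body shown above can be unpacked before deciding how to react. The following is a minimal sketch, assuming the body always follows the `error.message` / `error.type` / `error.code` shape of that sample; `GroqError` and `parse_error_body` are hypothetical names introduced here for illustration, not part of any SDK.

```python
import json
from typing import NamedTuple, Optional


class GroqError(NamedTuple):
    """Hypothetical container for the fields of the `error` object."""
    message: str
    type: Optional[str]
    code: Optional[str]


def parse_error_body(raw: str) -> Optional[GroqError]:
    """Return the `error` fields, or None if the body has a different shape."""
    try:
        payload = json.loads(raw)
    except json.JSONDecodeError:
        return None  # not JSON at all (for example an HTML error page from a proxy)
    if not isinstance(payload, dict):
        return None
    error = payload.get("error")
    if not isinstance(error, dict):
        return None
    return GroqError(
        message=str(error.get("message", "")),
        type=error.get("type"),
        code=error.get("code"),
    )


sample = (
    '{"error": {"message": "Rate limit reached", '
    '"type": "tokens", "code": "rate_limit_exceeded"}}'
)
parsed = parse_error_body(sample)
```

Branching on `code` rather than on the human-readable `message` is likely to be more robust, since message wording can change.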
set -euo pipefail
# 1. Verify API key is valid
curl -s https://api.groq.com/openai/v1/models \
  -H "Authorization: Bearer $GROQ_API_KEY" | jq '.data | length'
# 2. Check specific model availability
curl -s https://api.groq.com/openai/v1/models \
  -H "Authorization: Bearer $GROQ_API_KEY" | jq '.data[].id' | sort
# 3. Test a minimal completion
curl -s https://api.groq.com/openai/v1/chat/completions \
  -H "Authorization: Bearer $GROQ_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"model":"llama-3.1-8b-instant","messages":[{"role":"user","content":"ping"}],"max_tokens":5}' | jq .
Authentication error: Invalid API key provided
Causes: The key is missing, revoked, or malformed.
Fix:
# Verify key is set and starts with gsk_
echo "${GROQ_API_KEY:0:4}" # Should print "gsk_"
# Test key directly
curl -s -o /dev/null -w "%{http_code}" \
  https://api.groq.com/openai/v1/models \
  -H "Authorization: Bearer $GROQ_API_KEY"
# Should return 200
Rate limit reached for model `llama-3.3-70b-versatile` in organization `org_xxx`
on tokens per minute (TPM): Limit 6000, Used 5800, Requested 500.
Causes: RPM (requests/min), TPM (tokens/min), or RPD (requests/day) limit hit.
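The numbers in the sample message explain why the request was rejected: 5800 tokens already used plus 500 requested is 6300, which is 300 over the 6000 limit. A small sketch of that arithmetic follows, assuming the message keeps the `Limit N, Used N, Requested N` wording shown above (the wording is not a stable contract, so the function returns `None` when it does not match); `parse_usage` is a hypothetical helper.

```python
import re
from typing import Dict, Optional

MESSAGE = (
    "Rate limit reached for model `llama-3.3-70b-versatile` in organization `org_xxx` "
    "on tokens per minute (TPM): Limit 6000, Used 5800, Requested 500."
)


def parse_usage(message: str) -> Optional[Dict[str, int]]:
    """Extract limit/used/requested from the message and compute the overage."""
    match = re.search(r"Limit (\d+), Used (\d+), Requested (\d+)", message)
    if match is None:
        return None  # wording differs from the sample; do not guess
    limit, used, requested = (int(group) for group in match.groups())
    return {
        "limit": limit,
        "used": used,
        "requested": requested,
        # Positive value: how far the request would have exceeded the limit.
        "over_by": used + requested - limit,
    }


usage = parse_usage(MESSAGE)
```

A positive `over_by` suggests either waiting for the window to reset or shrinking the request by at least that many tokens.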
Rate limit headers returned:
| Header | Description |
|---|---|
| `retry-after` | Seconds to wait before retrying |
| `x-ratelimit-limit-requests` | Max requests per window |
| `x-ratelimit-limit-tokens` | Max tokens per window |
| `x-ratelimit-remaining-requests` | Requests remaining |
| `x-ratelimit-remaining-tokens` | Tokens remaining |
| `x-ratelimit-reset-requests` | When request limit resets |
| `x-ratelimit-reset-tokens` | When token limit resets |
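The headers in the table can be turned into a wait time. The sketch below rests on two assumptions that are not stated in this document: that the `x-ratelimit-reset-*` values are duration strings such as `7.66s` or `2m59.56s`, and that header names arrive lowercased. `parse_duration` and `seconds_to_wait` are hypothetical helpers.

```python
import re
from typing import Mapping, Optional

# Assumed duration format: one or more <number><unit> parts, e.g. "2m59.56s".
_UNIT_SECONDS = {"h": 3600.0, "m": 60.0, "s": 1.0, "ms": 0.001}
_PART = re.compile(r"(\d+(?:\.\d+)?)(ms|h|m|s)")
_FULL = re.compile(r"(?:\d+(?:\.\d+)?(?:ms|h|m|s))+")


def parse_duration(value: str) -> Optional[float]:
    """Convert a duration string to seconds, or None if it is not recognized."""
    value = value.strip()
    if not _FULL.fullmatch(value):
        return None
    return sum(float(number) * _UNIT_SECONDS[unit] for number, unit in _PART.findall(value))


def seconds_to_wait(headers: Mapping[str, str], default: float = 10.0) -> float:
    """Pick a wait time: retry-after first, then the reset of any exhausted limit."""
    retry_after = headers.get("retry-after")
    if retry_after is not None:
        try:
            return float(retry_after)  # retry-after is plain seconds
        except ValueError:
            pass  # fall through to the reset headers
    waits = []
    for kind in ("requests", "tokens"):
        remaining = headers.get(f"x-ratelimit-remaining-{kind}", "").strip()
        reset = parse_duration(headers.get(f"x-ratelimit-reset-{kind}", ""))
        if remaining == "0" and reset is not None:
            waits.append(reset)
    return max(waits) if waits else default
```

Only limits whose remaining count is zero are considered, so a long request-limit reset does not delay a retry that was blocked on tokens alone.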
Fix:
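import Groq from "groq-sdk";

// Retry once after a 429, waiting for the interval the server suggests.
async function handleRateLimit<T>(fn: () => Promise<T>): Promise<T> {
  try {
    return await fn();
  } catch (err) {
    if (err instanceof Groq.APIError && err.status === 429) {
      // retry-after is in seconds; fall back to 10s if it is missing or unparsable
      const parsed = Number.parseInt(err.headers?.["retry-after"] ?? "", 10);
      const retryAfter = Number.isNaN(parsed) ? 10 : parsed;
      console.warn(`Rate limited. Waiting ${retryAfter}s...`);
      await new Promise((r) => setTimeout(r, retryAfter * 1000));
      return fn(); // Single retry; a second failure propagates to the caller
    }
    throw err;
  }
}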
Invalid parameter: model 'mixtral-8x7b-32768' is not available
Causes: Deprecated model ID, invalid parameters, or schema violation.
Common deprecated model IDs:
| Deprecated | Replacement |
|---|---|
| `mixtral-8x7b-32768` | `llama-3.1-8b-instant` or `llama-3.3-70b-versatile` |
| `gemma2-9b-it` | `llama-3.1-8b-instant` |
| `llama-3.1-70b-versatile` | `llama-3.3-70b-versatile` |
Fix: Check current models at console.groq.com/docs/models or call GET /openai/v1/models.
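The replacement table can also be applied in code before a request is sent. This is a minimal sketch that hard-codes the three mappings listed above; `DEPRECATED_MODELS` and `resolve_model` are hypothetical names, and `available` stands for whatever set of IDs the models endpoint returned.

```python
from typing import Dict, Iterable, List, Optional

# Mappings copied from the table above; preferred replacement first.
DEPRECATED_MODELS: Dict[str, List[str]] = {
    "mixtral-8x7b-32768": ["llama-3.1-8b-instant", "llama-3.3-70b-versatile"],
    "gemma2-9b-it": ["llama-3.1-8b-instant"],
    "llama-3.1-70b-versatile": ["llama-3.3-70b-versatile"],
}


def resolve_model(model_id: str, available: Optional[Iterable[str]] = None) -> str:
    """Return model_id unchanged, or a replacement if it is listed as deprecated."""
    replacements = DEPRECATED_MODELS.get(model_id)
    if replacements is None:
        return model_id  # not known to be deprecated
    if available is not None:
        available_set = set(available)
        for candidate in replacements:
            if candidate in available_set:
                return candidate
    # No availability information (or no match): use the first listed replacement.
    return replacements[0]
```

For example, `resolve_model("mixtral-8x7b-32768", available={"llama-3.3-70b-versatile"})` selects the 70B model because the first-choice replacement is not in the available set.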
Maximum context length is 131072 tokens. However, your messages resulted in 140000 tokens.
Fix: Reduce the prompt size or split the input into smaller requests. All current Llama models have a 128K context window, which is the 131072-token limit quoted in the message.
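One way to reduce prompt size is to drop the oldest conversation turns until the request fits. The sketch below is illustrative only: the token estimate is a crude heuristic of roughly four characters per token (an assumption, not Groq's tokenizer), and `estimate_tokens` and `trim_to_budget` are hypothetical helpers.

```python
from typing import Dict, List

Message = Dict[str, str]


def estimate_tokens(messages: List[Message]) -> int:
    """Rough estimate (assumption): about 4 characters per token."""
    return sum(len(m["content"]) for m in messages) // 4 + 1


def trim_to_budget(messages: List[Message], budget: int) -> List[Message]:
    """Drop the oldest non-system messages until the estimate fits the budget.

    System messages are always kept, and at least one other message survives,
    so the result can still exceed the budget if that message is very large.
    """
    system = [m for m in messages if m["role"] == "system"]
    rest = [m for m in messages if m["role"] != "system"]
    while len(rest) > 1 and estimate_tokens(system + rest) > budget:
        rest.pop(0)  # oldest turn first
    return system + rest


history = [
    {"role": "system", "content": "s" * 40},
    {"role": "user", "content": "a" * 400},
    {"role": "assistant", "content": "b" * 400},
    {"role": "user", "content": "c" * 40},
]
trimmed = trim_to_budget(history, budget=150)
```

Leaving headroom below the model's limit is advisable, since the estimate is approximate and the completion itself also consumes tokens.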
Internal server error / Service temporarily unavailable
Causes: Groq infrastructure issue or an overloaded model.
Fix: Retry with backoff, fall back to a different model, and check status.groq.com.
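Retrying with backoff and falling back to another model can be combined in one helper. This is a sketch under stated assumptions rather than a definitive implementation: `TransientError` is a hypothetical stand-in for whatever exception your client raises on a 5xx response, and `call` is any function you supply that takes a model ID and performs the request.

```python
import random
import time
from typing import Callable, Optional, Sequence, TypeVar

T = TypeVar("T")


class TransientError(Exception):
    """Hypothetical stand-in for a 5xx response from the API."""


def call_with_fallback(
    call: Callable[[str], T],
    models: Sequence[str],
    max_attempts: int = 3,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Try each model in order, retrying transient failures with exponential backoff."""
    last_error: Optional[Exception] = None
    for model in models:
        for attempt in range(max_attempts):
            try:
                return call(model)
            except TransientError as err:
                last_error = err
                if attempt < max_attempts - 1:
                    # Delay doubles each attempt; jitter avoids synchronized retries.
                    delay = base_delay * (2 ** attempt)
                    sleep(delay + random.uniform(0, delay / 2))
    raise RuntimeError("all models failed") from last_error
```

A call such as `call_with_fallback(send, ["llama-3.3-70b-versatile", "llama-3.1-8b-instant"])`, where `send` is your own request function, would try the larger model up to three times before moving to the smaller one. The `sleep` parameter is injectable so the helper can be exercised without real delays.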
TypeScript:
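import Groq from "groq-sdk";

const groq = new Groq(); // reads GROQ_API_KEY from the environment

try {
  await groq.chat.completions.create({ /* ... */ });
} catch (err) {
  // Check subclasses first: every class below extends Groq.APIError,
  // so testing the base class first would make the other branches unreachable.
  if (err instanceof Groq.RateLimitError) {
    console.error("Rate limited:", err.message);
  } else if (err instanceof Groq.AuthenticationError) {
    console.error("Auth failed:", err.message);
  } else if (err instanceof Groq.APIConnectionError) {
    console.error("Network error:", err.message);
  } else if (err instanceof Groq.APIError) {
    console.error(`Status: ${err.status}, Message: ${err.message}`);
  } else {
    throw err; // not an SDK error
  }
}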
Python:
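from groq import Groq, APIConnectionError, APIStatusError, AuthenticationError, RateLimitError

client = Groq()  # reads GROQ_API_KEY from the environment

try:
    client.chat.completions.create(...)
except RateLimitError as e:
    print(f"Rate limited: {e.message}")
except AuthenticationError as e:
    print(f"Auth error: {e.message}")
except APIStatusError as e:
    # Any other non-2xx response; status_code is only defined on status errors
    print(f"API error {e.status_code}: {e.message}")
except APIConnectionError as e:
    print(f"Network error: {e.message}")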
If an error persists, collect the request ID (x-request-id header) and gather diagnostics with the groq-debug-bundle skill, which covers comprehensive debugging.