From together-pack
Provides guidance on Together AI rate limits for inference, fine-tuning, and deployment via the OpenAI-compatible API. Covers 429 errors, backoff strategies, and common issues.
```shell
npx claudepluginhub jeremylongshore/claude-code-plugins-plus-skills --plugin together-pack
```
Guidance for rate limits with Together AI inference and fine-tuning API.
The API is OpenAI-compatible: point the `together` Python SDK, or any OpenAI client library, at the Together base URL.

```shell
base_url = 'https://api.together.xyz/v1'
```

| Error | Cause | Solution |
|---|---|---|
| 401 Unauthorized | Invalid API key | Check your key at api.together.xyz |
| Model not found | Wrong model ID | List valid IDs with `client.models.list()` |
| 429 Rate limit | Too many requests | Implement exponential backoff |
| 500 Server error | Model overloaded | Retry with backoff |
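The 429 and 500 rows above both call for backoff with retry. A minimal sketch of that pattern is below; the `RateLimitError` class and the model ID in the usage comment are hypothetical placeholders, not names confirmed by this document.

```python
import random
import time


class RateLimitError(Exception):
    """Stand-in for the 429 error your client library raises (hypothetical name)."""


def with_backoff(call, max_retries=5, base_delay=1.0):
    """Retry `call` on rate-limit errors, doubling the delay each attempt with jitter."""
    for attempt in range(max_retries):
        try:
            return call()
        except RateLimitError:
            if attempt == max_retries - 1:
                raise  # out of retries: surface the error to the caller
            delay = base_delay * (2 ** attempt) + random.uniform(0, 0.5)
            time.sleep(delay)


# Usage sketch, assuming the OpenAI client and a placeholder model ID:
# from openai import OpenAI
# client = OpenAI(base_url="https://api.together.xyz/v1", api_key="...")
# reply = with_backoff(lambda: client.chat.completions.create(
#     model="meta-llama/Llama-3-8b-chat-hf",
#     messages=[{"role": "user", "content": "Hello"}],
# ))
```

The jitter term spreads retries out so many clients hitting the limit at once do not all retry in lockstep.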
See related Together AI skills for more patterns.