Execute the Groq primary workflow (Core Workflow A). Use when implementing the primary use case, building main features, or handling core integration tasks. Trigger with phrases like "groq main workflow" or "primary task with groq".
From groq-pack. Install: `npx claudepluginhub nickloveinvesting/nick-love-plugins --plugin groq-pack`
Primary money-path workflow for Groq. This is the most common use case. Groq provides ultra-low-latency LLM inference using custom LPU (Language Processing Unit) hardware, enabling token generation speeds that are significantly faster than GPU-based providers. This makes Groq the right choice for latency-sensitive applications such as real-time chat interfaces, voice assistants, and streaming analysis pipelines where response time directly impacts user experience.
groq-install-auth setup. Authenticate with the Groq API and select the target model from the available options (LLaMA, Mixtral, Gemma, or others available on the platform). Configure your default request parameters, including temperature, max tokens, and stop sequences. Verify the model is available in your region and that your rate limits accommodate your expected request volume.
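The setup step can be sketched as follows. The model ID, parameter defaults, and the `GROQ_API_KEY` variable name are assumptions (check the Groq console for the current model list); actually sending requests additionally requires the `groq` SDK or an HTTP client.

```python
import os

# Assumed model ID and defaults -- verify against the Groq console.
DEFAULT_MODEL = "llama-3.3-70b-versatile"

def build_config(model=DEFAULT_MODEL, temperature=0.7, max_tokens=1024, stop=None):
    """Collect default request parameters for Groq chat completions."""
    return {
        "model": model,
        "temperature": temperature,
        "max_tokens": max_tokens,
        "stop": stop,
        # Read the key from the environment; set GROQ_API_KEY before running.
        "api_key": os.environ.get("GROQ_API_KEY", ""),
    }

config = build_config()
```

Keeping defaults in one place makes it easy to verify them against your rate-limit tier before traffic ramps up.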
Submit the chat completion or text completion request to Groq. Because of the LPU architecture, first-token latency is exceptionally low, so the streaming experience feels near-instant for end users. Monitor the token-per-second rate in the response metadata to confirm the performance profile matches expectations for your use case.
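Assembling the request can be sketched like this. The endpoint URL and field names follow Groq's OpenAI-compatible chat completions API, but treat the exact values as assumptions; the body would be POSTed with an `Authorization: Bearer <key>` header.

```python
import json

# OpenAI-compatible chat completions endpoint (assumed; verify in Groq docs).
GROQ_CHAT_URL = "https://api.groq.com/openai/v1/chat/completions"

def build_request(messages, model, temperature=0.7, max_tokens=1024, stream=True):
    """Build the JSON body for a (streaming) chat completion request."""
    return {
        "model": model,
        "messages": messages,
        "temperature": temperature,
        "max_tokens": max_tokens,
        "stream": stream,  # stream tokens to exploit low first-token latency
    }

body = build_request(
    [{"role": "user", "content": "Summarize LPU inference in one sentence."}],
    model="llama-3.3-70b-versatile",
)
payload = json.dumps(body)
```

Setting `stream=True` is what lets the low first-token latency reach the user as a near-instant response.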
Handle the streamed or buffered response appropriately for your application. For interactive use cases, render tokens as they arrive. For batch processing, accumulate the full response before writing results. Log model ID, token usage, and latency metrics for cost attribution and capacity planning.
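Response handling can be sketched with a generic accumulator. The chunk shape below (dicts with a `content` key) mirrors OpenAI-style streaming deltas but is an assumption here, demonstrated with stubbed chunks rather than a live connection.

```python
def consume_stream(chunks, on_token=None):
    """Accumulate streamed token deltas; call on_token per token for interactive UIs."""
    parts = []
    for chunk in chunks:
        token = chunk.get("content")
        if token:
            if on_token:
                on_token(token)  # render immediately for chat-style interfaces
            parts.append(token)  # or accumulate for batch processing
    return "".join(parts)

# Stubbed chunks standing in for a live Groq stream.
fake_chunks = [{"content": "Hello"}, {"content": ", "}, {"content": "world"}, {"content": None}]
text = consume_stream(fake_chunks)
```

The same function covers both modes from the text above: pass `on_token` to render as tokens arrive, or ignore it and use the returned string for batch writes.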
| Error | Cause | Solution |
|---|---|---|
| 401 Unauthorized | API key missing or invalid | Set the `GROQ_API_KEY` environment variable (or pass the key explicitly) and verify it in the Groq console |
| 429 Too Many Requests | Rate limit exceeded for your tier | Retry with exponential backoff, reduce request volume, or request a higher limit |
| 404 Model not found | Model ID misspelled or decommissioned | Check the current model list in the Groq console and update the model string |
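For the rate-limit case, a common mitigation is exponential backoff. This is a minimal sketch: `RateLimitError` is a hypothetical stand-in for the SDK's 429 error type, and the delays are shortened for illustration.

```python
import time

class RateLimitError(Exception):
    """Hypothetical stand-in for the SDK's 429 error."""

def with_backoff(call, max_retries=3, base_delay=0.01):
    """Retry `call` on rate-limit errors, doubling the delay each attempt."""
    for attempt in range(max_retries + 1):
        try:
            return call()
        except RateLimitError:
            if attempt == max_retries:
                raise  # out of retries; surface the error
            time.sleep(base_delay * (2 ** attempt))

# Simulated flaky call: fails twice with a rate limit, then succeeds.
attempts = {"n": 0}
def flaky():
    attempts["n"] += 1
    if attempts["n"] < 3:
        raise RateLimitError()
    return "ok"

result = with_backoff(flaky)
```

In production you would use much longer base delays (seconds, not milliseconds) and add jitter so concurrent clients do not retry in lockstep.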
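The three steps above can be combined into one sketch. The transport is stubbed with a fake stream so the example stays self-contained; the model ID and the metrics field names are assumptions.

```python
import time

def run_workflow(messages, send_request, model="llama-3.3-70b-versatile"):
    """Step 1: configure, Step 2: submit, Step 3: consume the stream and log metrics."""
    body = {"model": model, "messages": messages, "stream": True}   # Steps 1-2
    start = time.perf_counter()
    parts = []
    n_tokens = 0
    for chunk in send_request(body):                                # Step 3
        token = chunk.get("content")
        if token:
            parts.append(token)
            n_tokens += 1
    latency = time.perf_counter() - start
    # Record model ID, token usage, and latency for cost attribution.
    metrics = {"model": model, "tokens": n_tokens, "latency_s": latency}
    return "".join(parts), metrics

def fake_send(body):
    """Stand-in for the real Groq streaming call."""
    yield from [{"content": "fast "}, {"content": "tokens"}]

text, metrics = run_workflow([{"role": "user", "content": "hi"}], fake_send)
```

Swapping `fake_send` for a real client call (e.g. via the `groq` SDK's streaming interface) is the only change needed to go live.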
For the secondary workflow, see groq-core-workflow-b.