From lindy-pack
Optimizes Lindy AI agent execution speed, reliability, and cost by profiling tasks, right-sizing models, consolidating LLM steps, and fixing bottlenecks.
Install: `npx claudepluginhub jeremylongshore/claude-code-plugins-plus-skills --plugin lindy-pack`
Lindy agents execute as multi-step workflows where each step (LLM call, action execution, API call, condition evaluation) adds latency and credit cost. Optimization targets: fewer steps, smaller models, faster actions, tighter prompts.
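As a rough mental model (the step names, latencies, and credit figures below are illustrative assumptions, not Lindy's actual billing), total task latency and cost are simply the sum over steps, which is why removing or merging steps is the primary lever:

```python
# Illustrative cost model: each workflow step contributes latency and credits.
steps = [
    {"name": "classify email", "latency_s": 2.0, "credits": 1},
    {"name": "extract entities", "latency_s": 2.5, "credits": 1},
    {"name": "generate response", "latency_s": 4.0, "credits": 3},
]

total_latency = sum(s["latency_s"] for s in steps)
total_credits = sum(s["credits"] for s in steps)
print(f"{total_latency:.1f}s, {total_credits} credits")  # 8.5s, 5 credits
```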
In the Tasks tab, open a completed task and review each step's duration and credit usage.

Common bottlenecks:
| Bottleneck | Symptom | Fix |
|---|---|---|
| Large model on simple task | High credit cost, slow | Switch to Gemini Flash |
| Too many LLM steps | Long total duration | Consolidate into fewer steps |
| Agent Step with many skills | Unpredictable path | Reduce to 2-4 focused skills |
| Knowledge Base over-querying | Multiple KB searches | Increase Max Results per query |
| Sequential when parallel possible | Unnecessary waiting | Use loop with Max Concurrent > 1 |
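The sequential-vs-parallel row can be sketched outside Lindy: running independent items concurrently (which is what Max Concurrent > 1 does inside a loop) cuts wall-clock time by roughly the concurrency factor. The `process_item` function here is a stand-in for one loop iteration:

```python
import time
from concurrent.futures import ThreadPoolExecutor

def process_item(item):
    # Stand-in for one loop iteration (API call, LLM step, etc.).
    time.sleep(0.2)
    return f"done:{item}"

items = list(range(6))

start = time.perf_counter()
sequential = [process_item(i) for i in items]       # ~6 x 0.2s sequentially
seq_time = time.perf_counter() - start

start = time.perf_counter()
with ThreadPoolExecutor(max_workers=3) as pool:     # like Max Concurrent = 3
    parallel = list(pool.map(process_item, items))
par_time = time.perf_counter() - start

print(f"sequential {seq_time:.2f}s vs parallel {par_time:.2f}s")
```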
Model choice is the single biggest performance lever. Match the model to task complexity:
| Task | Recommended Model | Speed | Credits |
|---|---|---|---|
| Route email to category | Gemini Flash | Fast | ~1 |
| Extract fields from text | GPT-4o-mini | Fast | ~2 |
| Draft short response | Claude Sonnet | Medium | ~3 |
| Complex multi-step analysis | GPT-4 / Claude Opus | Slow | ~10 |
| Simple phone call | Gemini Flash | Fast | ~20/min |
| Complex phone conversation | Claude Sonnet | Medium | ~20/min |
Rule of thumb: Start with the smallest model. Only upgrade if output quality is insufficient. Most classification and routing tasks work fine with Gemini Flash.
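The "start small, upgrade only if needed" rule can be encoded as a simple lookup; the model names and credit figures below are just the table's values restated, and `pick_model` is a hypothetical helper, not a Lindy API:

```python
# Map task type to the smallest model that handles it (values from the table above).
MODEL_TIERS = {
    "route": ("Gemini Flash", 1),
    "extract": ("GPT-4o-mini", 2),
    "draft": ("Claude Sonnet", 3),
    "analyze": ("GPT-4 / Claude Opus", 10),
}

def pick_model(task_type: str):
    """Return (model, approx_credits); unknown tasks default to the smallest model."""
    return MODEL_TIERS.get(task_type, MODEL_TIERS["route"])

model, credits = pick_model("route")
print(model, credits)  # Gemini Flash 1
```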
Before (3 LLM calls, ~9 credits):

```
Step 1: Classify email (LLM)
Step 2: Extract key entities (LLM)
Step 3: Generate response (LLM)
```

After (1 LLM call, ~3 credits):

```
Step 1: Classify, extract entities, and generate response (single LLM prompt)
```
Consolidated prompt:

```
Analyze this email and return JSON with:
1. "classification": one of [billing, technical, general]
2. "entities": {customer_name, product, issue_type}
3. "draft_response": professional reply under 150 words

Email: {{email_received.body}}
```
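A downstream step (or an external script) can then split the single response back into its three values. A minimal sketch, assuming the model returns the JSON shape requested above (the sample values are invented):

```python
import json

# Example model output matching the consolidated prompt's requested shape.
raw = """{
  "classification": "billing",
  "entities": {"customer_name": "Ada", "product": "Pro plan", "issue_type": "double charge"},
  "draft_response": "Hi Ada, sorry about the double charge on your Pro plan..."
}"""

result = json.loads(raw)
assert result["classification"] in ["billing", "technical", "general"]
print(result["classification"], result["entities"]["issue_type"])
```

One consolidated call plus a cheap parse replaces three separate LLM inferences.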
Replace AI-powered fields with Set Manually mode when values are predictable:
| Field | Instead of AI Prompt | Use Set Manually |
|---|---|---|
| Slack channel | "Post to the support channel" | #support-triage |
| Email subject | "Create an appropriate subject" | [Ticket] {{email_received.subject}} |
| Sheet column | "Determine the right column" | Column A |
Each Set Manually field saves one LLM inference (~1 credit).
Write focused Knowledge Base query prompts:

```
Search for the customer's specific product issue.
Focus on: {{extracted_entities.product}} {{extracted_entities.issue_type}}
```

Not: "Search for relevant information" (too vague, wastes results).

Prevent wasted runs with precise trigger filters:
```
Before: Email Received (all emails) → 200 runs/day → 600 credits
After:  Email Received (label: "support" AND NOT from: "noreply@")
        → 30 runs/day → 90 credits (85% savings)
```
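The savings arithmetic generalizes: credits scale linearly with runs, so filtering triggers is a pure multiplier on cost. A quick check of the numbers above (assuming ~3 credits per run, consistent with the example):

```python
credits_per_run = 3  # assumed per-run cost from the example above

runs_before, runs_after = 200, 30
before = runs_before * credits_per_run   # 600 credits/day
after = runs_after * credits_per_run     # 90 credits/day

savings = 1 - after / before
print(f"{before} -> {after} credits/day ({savings:.0%} savings)")  # 85% savings
```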
Agent Steps (autonomous mode) are powerful but expensive — the agent may take unpredictable paths and use more actions than a deterministic workflow.
Use Agent Steps when: Next steps are genuinely uncertain (complex research, multi-source investigation, adaptive problem-solving)
Use deterministic actions when: Steps are predictable (classify -> route -> respond)
When using Agent Steps, keep the skill count low (2-4 focused skills, per the table above) to constrain the paths the agent can take.
For batch processing, configure loops for efficiency: raise Max Concurrent above 1 so independent items run in parallel, and cap Max Cycles to bound runtime and cost.

Typical end-to-end benchmarks:
| Agent Type | Expected Duration | Expected Credits |
|---|---|---|
| Simple router (1 LLM + 1 action) | 2-5 seconds | 1-2 |
| Email triage (classify + respond) | 5-15 seconds | 3-5 |
| Research agent (search + analyze) | 15-60 seconds | 5-15 |
| Multi-agent pipeline | 30-120 seconds | 10-30 |
| Phone call | Real-time | ~20/min |
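The benchmark table can feed a quick budget estimate. The daily run volumes below are hypothetical; the per-run credits are midpoints of the table's ranges:

```python
# Hypothetical daily volumes; credits are midpoints from the benchmark table.
agents = {
    "simple_router": {"runs_per_day": 500, "credits": 1.5},
    "email_triage": {"runs_per_day": 100, "credits": 4},
    "research": {"runs_per_day": 20, "credits": 10},
}

daily = sum(a["runs_per_day"] * a["credits"] for a in agents.values())
print(f"{daily:.0f} credits/day, ~{daily * 30:.0f}/month")  # 1350 credits/day, ~40500/month
```

Estimates like this make it obvious which agent type dominates spend and should be optimized first.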
Troubleshooting:

| Issue | Cause | Solution |
|---|---|---|
| Agent timeout | Too many sequential steps | Consolidate steps, reduce skill count |
| High credit burn | Large model + many steps | Downgrade model, merge LLM calls |
| Inconsistent output | Agent Step choosing different paths | Switch to deterministic workflow |
| KB search slow | Large knowledge base | Reduce fuzziness, increase specificity |
| Loop runs too long | High max cycles, low concurrency | Increase Max Concurrent, lower Max Cycles |
Proceed to lindy-cost-tuning for budget optimization.