Adds retry, timeout, and circuit breaker patterns at the workflow level so that business functions stay clean. Use it to make external API calls, database operations, and network requests resilient to transient failures without cluttering business logic.
/plugin marketplace add jagreehal/jagreehal-claude-skills
/plugin install jagreehal-claude-skills@jagreehal-marketplace
This skill inherits all available tools. When active, it can use any tool Claude has access to.
Resilience is a composition concern, not a business logic concern. Add retry/timeout at the workflow level, not inside functions.
Workflows
-> step.retry() and step.withTimeout()
-> (resilience here)
Business Functions
-> fn(args, deps): Result<T, E>
-> (no retry logic here)
Infrastructure
-> pg, redis, http
-> (just transport)
NEVER add retry logic inside business functions:
// WRONG - Retry inside function
async function getUser(args, deps) {
let attempts = 0;
while (attempts < 3) {
try {
return await deps.db.findUser(args.userId);
} catch { attempts++; }
}
}
// CORRECT - Clean function, workflow handles retry
async function getUser(args, deps) {
const user = await deps.db.findUser(args.userId);
return user ? ok(user) : err('NOT_FOUND');
}
// Workflow adds resilience
const result = await workflow(async (step) => {
const user = await step.retry(
() => getUser({ userId }, deps),
{ attempts: 3, backoff: 'exponential' }
);
return user;
});
Retry at ONE level only. Multiple layers create retry explosion:
3 (API) × 3 (Service) × 3 (DB Client) = 27 attempts!
In effect, you DDoS your own infrastructure.
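The multiplication above can be checked with a small sketch. The helper name `worstCaseAttempts` is hypothetical and not part of any library; it only multiplies the per-layer attempt counts to show how quickly nested retries grow.

```typescript
// Hypothetical helper (not part of @jagreehal/workflow): worst-case number of
// calls that reach the innermost dependency when every layer retries on its own.
function worstCaseAttempts(attemptsPerLayer: number[]): number {
  return attemptsPerLayer.reduce((total, attempts) => total * attempts, 1);
}

// API, service, and DB client each configured with 3 attempts
const nested = worstCaseAttempts([3, 3, 3]);

// Retry at one level only: the other layers make a single attempt
const singleLevel = worstCaseAttempts([3, 1, 1]);

console.log(nested, singleLevel); // prints "27 3"
```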
| Error Type | Retry? | Why |
|---|---|---|
| TIMEOUT | Yes | Transient |
| CONNECTION_ERROR | Yes | Network hiccup |
| RATE_LIMITED | Yes | Wait and retry |
| NOT_FOUND | NO | Resource doesn't exist |
| UNAUTHORIZED | NO | Credentials wrong |
| VALIDATION_FAILED | NO | Input invalid |
const data = await step.retry(
() => fetchFromApi(),
{
attempts: 3,
retryOn: (error) => {
const retryable = ['TIMEOUT', 'CONNECTION_ERROR', 'RATE_LIMITED'];
return retryable.includes(error);
},
}
);
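To make the role of the `retryOn` predicate concrete, here is a minimal standalone sketch of a retry loop over Result-like values. It is not the implementation of `step.retry()`; the `Result` type, `retryResult`, `ApiError`, and `isRetryable` are hypothetical names introduced for illustration, and backoff is omitted to keep the loop short.

```typescript
// Hypothetical sketch, NOT the implementation of step.retry().
type Result<T, E> = { ok: true; value: T } | { ok: false; error: E };

interface RetryOptions<E> {
  attempts: number;                // total attempts, including the first call
  retryOn: (error: E) => boolean;  // true = transient, worth another attempt
}

async function retryResult<T, E>(
  operation: () => Promise<Result<T, E>>,
  options: RetryOptions<E>,
): Promise<Result<T, E>> {
  let last = await operation();
  for (let attempt = 1; attempt < options.attempts; attempt++) {
    // Stop on success, or as soon as the error is classified as permanent.
    if (last.ok || !options.retryOn(last.error)) break;
    last = await operation();
  }
  return last;
}

// Error classification taken from the table above.
type ApiError = 'TIMEOUT' | 'CONNECTION_ERROR' | 'RATE_LIMITED' | 'NOT_FOUND';
const retryableErrors = new Set<ApiError>(['TIMEOUT', 'CONNECTION_ERROR', 'RATE_LIMITED']);
const isRetryable = (error: ApiError): boolean => retryableErrors.has(error);
```

With this shape, a `NOT_FOUND` result returns after a single call, while a `TIMEOUT` result is retried until the attempt budget runs out.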
// DANGEROUS - May double-charge
await step.retry(() => chargeCard(amount), { attempts: 3 });
// SAFE - Read is idempotent
await step.retry(() => getUser(userId), { attempts: 3 });
// SAFE - With idempotency key
await step.retry(
() => chargeCard(amount, { idempotencyKey }),
{ attempts: 3 }
);
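The following sketch illustrates why an idempotency key makes the retried charge safe. It uses an in-memory stand-in for a payment provider; `createChargeHandler`, `ChargeReceipt`, and the `Map`-based store are assumptions for illustration only, since a real provider performs this deduplication on its own side.

```typescript
// Hypothetical in-memory stand-in for a payment provider.
interface ChargeReceipt {
  chargeId: number;
  amount: number;
}

function createChargeHandler() {
  const receiptsByKey = new Map<string, ChargeReceipt>();
  let nextChargeId = 1;

  return {
    // A repeated call with the same key returns the stored receipt
    // instead of creating a second charge.
    charge(amount: number, idempotencyKey: string): ChargeReceipt {
      const existing = receiptsByKey.get(idempotencyKey);
      if (existing) return existing;
      const receipt: ChargeReceipt = { chargeId: nextChargeId++, amount };
      receiptsByKey.set(idempotencyKey, receipt);
      return receipt;
    },
    // Number of distinct charges actually created.
    totalCharges(): number {
      return receiptsByKey.size;
    },
  };
}
```

A retry that reuses the same key therefore sees the original receipt, so three attempts still produce one charge.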
Never let operations hang indefinitely:
const data = await step.withTimeout(
() => slowOperation(),
{ ms: 2000 }
);
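As a rough idea of what a timeout wrapper does, here is a minimal sketch built on a plain timer. It is not the implementation of `step.withTimeout()`; the `Result` type and this standalone `withTimeout` function are hypothetical. Note that in this sketch the underlying operation is not cancelled when the timeout fires, only its result is ignored.

```typescript
// Hypothetical sketch, NOT the implementation of step.withTimeout().
type Result<T, E> = { ok: true; value: T } | { ok: false; error: E };

function withTimeout<T>(
  operation: () => Promise<T>,
  ms: number,
): Promise<Result<T, 'TIMEOUT'>> {
  return new Promise<Result<T, 'TIMEOUT'>>((resolve, reject) => {
    // If the timer fires first, settle with a TIMEOUT error result.
    const timer = setTimeout(() => resolve({ ok: false, error: 'TIMEOUT' }), ms);

    operation().then(
      (value) => {
        clearTimeout(timer); // finished in time, stop the timer
        resolve({ ok: true, value });
      },
      (cause) => {
        clearTimeout(timer); // unexpected failure, pass it through unchanged
        reject(cause);
      },
    );
  });
}
```

Because a promise settles only once, a late result from the operation after the timeout has fired is simply discarded.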
Jitter prevents a thundering herd when multiple instances retry at the same time:
// Without jitter - all instances retry at same time
// With jitter - spread out, infrastructure can recover
step.retry(() => fetchData(), {
attempts: 3,
backoff: 'exponential',
jitter: true, // ALWAYS enable in production
});
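One common way to compute a jittered delay is "full jitter": pick a random value between zero and the exponential delay. The sketch below assumes that strategy; the library's actual jitter formula may differ, and `backoffDelay` is a hypothetical helper. The random source is injectable so the result can be made deterministic.

```typescript
// Hypothetical helper; the jitter strategy used by the library may differ.
function backoffDelay(
  attempt: number,        // 1-based retry number
  initialDelayMs: number,
  maxDelayMs: number,
  random: () => number = Math.random,
): number {
  // Exponential growth: initial, 2x, 4x, ... capped at maxDelayMs.
  const exponential = Math.min(maxDelayMs, initialDelayMs * 2 ** (attempt - 1));
  // Full jitter: uniform in [0, exponential), so instances spread out.
  return Math.floor(random() * exponential);
}

// With a fixed "random" value of 0.5 the delays are half the exponential value.
console.log(backoffDelay(1, 100, 2000, () => 0.5)); // prints 50
console.log(backoffDelay(3, 100, 2000, () => 0.5)); // prints 200
```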
Each attempt gets its own timeout:
const data = await step.retry(
() => fetchData(),
{
attempts: 3,
timeout: { ms: 2000 }, // 2s per attempt
}
);
// Worst case: 3 × 2s = 6s of attempt time, plus any backoff delay between attempts
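When backoff is configured as well, the waits between attempts add to the worst-case duration. The sketch below estimates that bound under stated assumptions: every attempt uses its full timeout, delays double from `initialDelayMs` up to `maxDelayMs`, and jitter is ignored. `worstCaseTotalMs` is a hypothetical helper, not a library function.

```typescript
// Hypothetical helper: upper bound on total time for a retried operation,
// assuming exponential doubling without jitter.
function worstCaseTotalMs(
  attempts: number,
  timeoutMs: number,
  initialDelayMs: number,
  maxDelayMs: number,
): number {
  let total = attempts * timeoutMs; // every attempt runs until its timeout
  for (let retry = 1; retry < attempts; retry++) {
    // Delay before retry N: initial * 2^(N-1), capped at maxDelayMs
    total += Math.min(maxDelayMs, initialDelayMs * 2 ** (retry - 1));
  }
  return total;
}

console.log(worstCaseTotalMs(3, 2000, 0, 0));      // prints 6000 (no backoff)
console.log(worstCaseTotalMs(3, 2000, 100, 2000)); // prints 6300 (6000 + 100 + 200)
```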
| Operation | Attempts | Backoff | Initial Delay | Timeout |
|---|---|---|---|---|
| DB read | 3 | exponential | 50ms | 5s |
| DB write | 1 | - | - | 10s |
| HTTP API | 3 | exponential | 100ms | 30s |
| Cache | 2 | fixed | 10ms | 500ms |
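The table can be captured as a set of reusable presets. This is a sketch: the option names (`attempts`, `backoff`, `initialDelay`, `timeout`) mirror the examples in this document, while the `RetryPreset` type, the `RETRY_PRESETS` constant, its keys, and the assumption that `'fixed'` is an accepted `backoff` value are illustrative and should be checked against the library.

```typescript
// Hypothetical preset shape; field names follow the step.retry() examples here.
interface RetryPreset {
  attempts: number;
  backoff?: 'exponential' | 'fixed'; // 'fixed' is assumed from the table
  initialDelay?: number;             // milliseconds
  timeout: { ms: number };
}

type PresetName = 'dbRead' | 'dbWrite' | 'httpApi' | 'cache';

const RETRY_PRESETS: Record<PresetName, RetryPreset> = {
  dbRead: { attempts: 3, backoff: 'exponential', initialDelay: 50, timeout: { ms: 5000 } },
  // Writes get a single attempt: retrying a non-idempotent write is unsafe.
  dbWrite: { attempts: 1, timeout: { ms: 10000 } },
  httpApi: { attempts: 3, backoff: 'exponential', initialDelay: 100, timeout: { ms: 30000 } },
  cache: { attempts: 2, backoff: 'fixed', initialDelay: 10, timeout: { ms: 500 } },
};
```

A workflow could then pass a preset as the options argument, for example `RETRY_PRESETS.httpApi`, assuming the options shape matches what `step.retry()` accepts.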
import { createWorkflow, ok, err, type AsyncResult } from '@jagreehal/workflow';
// Clean business function
async function getUser(args, deps): AsyncResult<User, 'NOT_FOUND' | 'DB_ERROR'> {
try {
const user = await deps.db.findUser(args.userId);
return user ? ok(user) : err('NOT_FOUND');
} catch {
return err('DB_ERROR');
}
}
// Workflow adds resilience
const loadUser = createWorkflow({ getUser });
const result = await loadUser(async (step) => {
const user = await step.retry(
() => getUser({ userId }, deps),
{
attempts: 3,
backoff: 'exponential',
initialDelay: 100,
maxDelay: 2000,
jitter: true,
timeout: { ms: 5000 },
}
);
return user;
});
Sometimes you need to retry a multi-step operation. Use step.retry() to wrap the entire sequence:
const syncUserToProvider = createWorkflow({ findUser, syncUser, markSynced });
const result = await syncUserToProvider(async (step) => {
// Retry the whole operation
const user = await step.retry(
async () => {
const user = await step(() => findUser({ userId }, deps));
await step(() => syncUser({ user }, deps)); // Must be idempotent!
await step(() => markSynced({ userId }, deps));
return user;
},
{
attempts: 2,
backoff: 'exponential',
}
);
return user;
});
Important: The entire sequence must be idempotent. If syncUser is called twice, it should have the same effect as calling it once.
When a service is down, stop hammering it. Circuit breakers prevent cascade failures:
// Circuit breaker states
// CLOSED: Normal operation, requests go through
// OPEN: Service down, fail fast without trying
// HALF_OPEN: Testing if service recovered
Circuit breakers are outside the scope of step.retry(), but consider libraries like opossum or cockatiel for production systems where dependencies fail frequently.
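The three states can be illustrated with a deliberately small state machine. This sketch is not how opossum or cockatiel are implemented; `CircuitBreaker`, `failureThreshold`, and `resetAfterMs` are hypothetical names, and the sketch does not limit how many trial requests pass while half-open. The clock is injectable so transitions can be exercised without waiting.

```typescript
// Hypothetical minimal circuit breaker, for illustration only.
type CircuitState = 'CLOSED' | 'OPEN' | 'HALF_OPEN';

class CircuitBreaker {
  private state: CircuitState = 'CLOSED';
  private failures = 0;
  private openedAt = 0;
  private readonly failureThreshold: number;
  private readonly resetAfterMs: number;
  private readonly now: () => number;

  constructor(failureThreshold: number, resetAfterMs: number, now: () => number = Date.now) {
    this.failureThreshold = failureThreshold;
    this.resetAfterMs = resetAfterMs;
    this.now = now;
  }

  /** Returns true if a request may be attempted right now. */
  allowRequest(): boolean {
    if (this.state === 'OPEN' && this.now() - this.openedAt >= this.resetAfterMs) {
      this.state = 'HALF_OPEN'; // cool-down elapsed: allow a trial request
    }
    return this.state !== 'OPEN';
  }

  recordSuccess(): void {
    this.failures = 0;
    this.state = 'CLOSED';
  }

  recordFailure(): void {
    this.failures++;
    // A failed trial request, or too many consecutive failures, opens the circuit.
    if (this.state === 'HALF_OPEN' || this.failures >= this.failureThreshold) {
      this.state = 'OPEN';
      this.openedAt = this.now();
    }
  }

  getState(): CircuitState {
    return this.state;
  }
}
```

A caller would check `allowRequest()` before each call to the dependency and report the outcome with `recordSuccess()` or `recordFailure()`; while the circuit is open, the call is skipped and fails fast.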
When to use circuit breakers:
Don't use for:
Use helpers to detect and handle timeouts:
import { isStepTimeoutError, getStepTimeoutMeta } from '@jagreehal/workflow';
const result = await workflow(async (step) => {
const data = await step.withTimeout(
() => slowOperation(),
{ ms: 5000 }
);
return data;
});
if (!result.ok && isStepTimeoutError(result.error)) {
const meta = getStepTimeoutMeta(result.error);
deps.logger.warn('Operation timed out', {
timeoutMs: meta?.timeoutMs,
attempt: meta?.attempt,
});
}
| Failure Type | Where to Retry |
|---|---|
| Transport/network | Workflow level |
| Idempotent reads | Workflow level |
| Non-idempotent writes | NEVER (or with idempotency key) |
| Multi-step operation | Workflow level (if idempotent) |