From cache-money
Schedules lightweight pings to keep Anthropic prompt cache warm in Claude Code sessions, preventing expiry during peak hours and optimizing token costs.
npx claudepluginhub florianbuetow/claude-code --plugin cache-money
How this skill is triggered (by the user, by Claude, or both):
Slash command
/cache-money:cache-money
The summary Claude sees in its skill listing, used to decide when to auto-load this skill:
Keep the Anthropic prompt cache warm during Claude Code sessions — especially during peak hours when usage limits are tighter — by scheduling lightweight pings tuned to your cache TTL.
Claude Code sends the full conversation context with every API call. Anthropic caches this prefix server-side and serves subsequent calls from cache at ~10% of the base input price. However, the cache expires after a period of inactivity set by its TTL tier:
| TTL Tier | Duration | Cache Write Cost | Who Gets It |
|---|---|---|---|
| Default | 5 minutes | 1.25x base input | All plans (no ttl field specified) |
| Extended | 1 hour | 2x base input | Max-tier plans (server-side in Claude Code), or explicit ttl: "1h" via API |
Cache reads cost 0.1x base input regardless of TTL tier — that's the 90% saving.
If a session sits idle past its TTL, the next call pays the full cache-write price for the entire context, which can be up to 1M tokens. During peak hours (weekdays 5:00 AM – 11:00 AM PT), Anthropic's rolling session limits are consumed faster, so every cache miss is doubly expensive: a higher rebuild cost plus faster quota burn.
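To make the price gap concrete, the multipliers from the table above can be turned into a small calculation. This is a minimal sketch: the function names, the 200,000-token context, and the base price of 3.00 USD per million input tokens are illustrative assumptions, not values from this skill.

```python
# Multipliers come from the TTL table above; everything else is illustrative.
CACHE_WRITE_MULTIPLIER = {"5m": 1.25, "1h": 2.0}
CACHE_READ_MULTIPLIER = 0.1


def cache_write_cost(tokens: int, base_price_per_mtok: float, ttl: str = "5m") -> float:
    """Cost of rebuilding the cache for `tokens` input tokens after a miss."""
    return tokens / 1_000_000 * base_price_per_mtok * CACHE_WRITE_MULTIPLIER[ttl]


def cache_read_cost(tokens: int, base_price_per_mtok: float) -> float:
    """Cost of serving the same prefix from a warm cache."""
    return tokens / 1_000_000 * base_price_per_mtok * CACHE_READ_MULTIPLIER


if __name__ == "__main__":
    context_tokens = 200_000  # hypothetical session size
    base_price = 3.00         # hypothetical USD per million input tokens
    miss = cache_write_cost(context_tokens, base_price, ttl="5m")
    hit = cache_read_cost(context_tokens, base_price)
    print(f"cache miss: ${miss:.2f}, cache hit: ${hit:.2f}")
```

Whatever the base price, a miss on the 5-minute tier costs 12.5 times as much as a hit (1.25x versus 0.1x), and a miss on the 1-hour tier costs 20 times as much.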
For detailed technical background, consult references/cache-mechanics.md.
Determine the current time and day of week in Pacific Time (PT) using:
TZ=America/Los_Angeles date "+%A %H:%M PT"
Classify the current window:
| Condition | Status |
|---|---|
| Weekday, 5:00 AM – 11:00 AM PT | Peak hours active — cache warming strongly recommended |
| Weekday, 4:47 AM – 4:59 AM PT | Peak approaching — pre-warming recommended |
| Weekend or outside peak | Off-peak — cache warming optional, still saves on idle sessions longer than the TTL |
Report the status to the user in one line before proceeding.
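The classification above can be sketched as a function over the output of the `date` command. This is a hedged illustration, not part of the skill: `classify_window` is a hypothetical name, the input is assumed to look like `Monday 06:15 PT` with English day names, and the end of each window is assumed to be exclusive.

```python
WEEKDAYS = {"Monday", "Tuesday", "Wednesday", "Thursday", "Friday"}


def classify_window(date_output: str) -> str:
    """Classify a string such as 'Monday 06:15 PT' into a peak-hours status."""
    day, hhmm = date_output.split()[:2]
    hour, minute = (int(part) for part in hhmm.split(":"))
    minutes = hour * 60 + minute  # minutes since midnight, Pacific Time

    if day not in WEEKDAYS:
        return "off-peak"
    if 5 * 60 <= minutes < 11 * 60:       # 5:00 AM up to, not including, 11:00 AM
        return "peak"
    if 4 * 60 + 47 <= minutes < 5 * 60:   # 4:47 AM to 4:59 AM
        return "approaching"
    return "off-peak"
```

For example, `classify_window("Tuesday 04:50 PT")` falls in the pre-warming window, while the same time on a Saturday is off-peak.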
Determine which TTL tier is active. Ask the user:
Are you on a Max-tier plan (which enables 1-hour cache TTL in Claude Code), or the default (5-minute cache TTL)?
Use the answer to set the ping interval:
| TTL Tier | Cache Duration | Ping Interval | Safety Margin |
|---|---|---|---|
| Extended (1h) | 60 minutes | 55 minutes | 5 minutes |
| Default (5min) | 5 minutes | 4 minutes | 1 minute |
If the user is unsure, default to the 5-minute TTL (4-minute ping interval) — it's safe for all plans and the overhead is minimal.
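The interval rule is simply the cache duration minus the safety margin, with a fallback to the default tier. A minimal sketch, assuming hypothetical tier labels `"extended"` and `"default"` and a hypothetical helper name:

```python
from typing import Optional

# (cache duration, safety margin) in minutes, taken from the table above.
PING_PLAN = {"extended": (60, 5), "default": (5, 1)}


def ping_interval_minutes(tier: Optional[str]) -> int:
    """Return the ping interval; unknown or missing tiers fall back to the default."""
    ttl, margin = PING_PLAN.get(tier or "default", PING_PLAN["default"])
    return ttl - margin
```

Passing `None` or any unrecognized answer yields the 4-minute interval, which matches the fallback for an unsure user.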
Invoke the /loop skill to schedule the recurring ping at the determined interval:
For 1-hour TTL (Max-tier):
Skill tool: skill="loop", args="55m Cache ping. Reply with only: ok"
For 5-minute TTL (default):
Skill tool: skill="loop", args="4m Cache ping. Reply with only: ok"
Each ping triggers one lightweight API call that renews the cached prompt prefix. The response is minimal (just "ok"), so output token consumption per ping is negligible; the prefix itself is billed at the cache-read rate.
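Why pinging just inside the TTL keeps the cache warm can be shown with a toy timeline. This sketch models the expiry rule only as described here (each call resets the inactivity timer); `cache_misses` is a hypothetical helper, and a call landing exactly at the TTL is conservatively counted as a miss.

```python
def cache_misses(call_times_min: list[float], ttl_min: float) -> int:
    """Count calls that have to rewrite the cache, given call times in minutes."""
    misses = 0
    last_call = None
    for t in sorted(call_times_min):
        # The first call always writes; later calls miss if the gap reached the TTL.
        if last_call is None or t - last_call >= ttl_min:
            misses += 1
        last_call = t  # every call, hit or miss, resets the inactivity timer
    return misses
```

With a 5-minute TTL, calls at minutes 0, 4, 8 and 12 produce one write (the initial one), whereas calls at minutes 0 and 10 produce two.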
After starting the loop, report:
| Timezone | Peak Start | Peak End |
|---|---|---|
| PT (Pacific) | 5:00 AM | 11:00 AM |
| MT (Mountain) | 6:00 AM | 12:00 PM |
| CT (Central) | 7:00 AM | 1:00 PM |
| ET (Eastern) | 8:00 AM | 2:00 PM |
| GMT / UTC | 1:00 PM | 7:00 PM |
| CET (Central Europe) | 2:00 PM | 8:00 PM |
| IST (India) | 6:30 PM | 12:30 AM (+1) |
| JST (Japan) | 10:00 PM | 4:00 AM (+1) |
| AEST (Sydney) | 11:00 PM | 5:00 AM (+1) |
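The rows above can be reproduced with fixed UTC offsets. This sketch assumes the table uses standard-time offsets (PT as UTC-8, so GMT starts at 1:00 PM); during daylight saving time the converted values shift accordingly. The function name and the reference date are illustrative.

```python
from datetime import datetime, timedelta, timezone

PST = timezone(timedelta(hours=-8))  # assumption: Pacific standard time, UTC-8


def peak_window_in(offset_hours: float) -> tuple[str, str]:
    """Convert the 5:00-11:00 PT peak window to a zone at the given UTC offset."""
    tz = timezone(timedelta(hours=offset_hours))
    # Any weekday works as the reference date; only the clock times are reported.
    start = datetime(2025, 1, 6, 5, 0, tzinfo=PST).astimezone(tz)
    end = datetime(2025, 1, 6, 11, 0, tzinfo=PST).astimezone(tz)
    return start.strftime("%H:%M"), end.strftime("%H:%M")
```

For instance, `peak_window_in(5.5)` gives the IST row in 24-hour form, with the end time falling on the following day.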