Recommend sharding, caching strategies, and read-replication patterns for Cloudflare architectures. Use this skill when preparing for growth, hitting limits, or optimizing for high traffic.
/plugin marketplace add littlebearapps/cloudflare-engineer
/plugin install cloudflare-engineer@littlebearapps-cloudflare
This skill inherits all available tools. When active, it can use any tool Claude has access to.
Strategies for scaling Cloudflare architectures beyond default limits while maintaining cost efficiency.
| Bottleneck | Symptom | Solution |
|---|---|---|
| D1 read latency | >50ms queries | Add KV cache layer |
| D1 write throughput | Queue backlog | Batch writes, add queue buffer |
| D1 storage | Approaching 10GB | Archive to R2, partition tables |
| KV read latency | Cache misses | Key prefixing, predictable keys |
| KV write rate | 1 write/sec/key limit | Shard keys, batch writes |
| R2 throughput | Slow uploads | Presigned URLs, multipart |
| Worker memory | 128MB limit | Streaming, chunked processing |
| Worker CPU | 30s timeout | Queues, Workflows, DO |
| Subrequests | 1000/request limit | Service Bindings RPC |
| Queue throughput | Consumer lag | Increase concurrency, batch size |
Request → Edge Cache → KV Cache → D1 → Origin
          (Tiered)     (Global)   (Primary)
async function getWithCache<T>(
kv: KVNamespace,
db: D1Database,
key: string,
query: () => Promise<T>,
ttl: number = 3600
): Promise<T> {
// Try cache first
const cached = await kv.get(key, 'json');
if (cached !== null) {
return cached as T;
}
// Cache miss - fetch from D1
const fresh = await query();
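// Write to cache. Awaited here so the write is not dropped when the
// request ends; pass the promise to ctx.waitUntil() to make it non-blocking.
await kv.put(key, JSON.stringify(fresh), { expirationTtl: ttl });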
return fresh;
}
// Pattern 1: TTL-based (simple, eventual consistency)
await kv.put(key, value, { expirationTtl: 300 }); // 5 min

// Pattern 2: Version-based (no deletes needed, more complex)
// Note: KV is eventually consistent, so a version bump can take time to reach every location
const version = (await kv.get('cache:version')) ?? '0';
const versionedKey = `data:${id}:v${version}`;
// Invalidate by incrementing version
await kv.put('cache:version', String(Number(version) + 1));

// Pattern 3: Tag-based (flexible, requires cleanup)
await kv.put(`user:${userId}:profile`, data);
await kv.put(`user:${userId}:settings`, settings);
// Invalidate all user data; list() is paginated, so follow the cursor
let cursor: string | undefined;
do {
  const page = await kv.list({ prefix: `user:${userId}:`, cursor });
  await Promise.all(page.keys.map((k) => kv.delete(k.name)));
  cursor = page.list_complete ? undefined : page.cursor;
} while (cursor);
Use the Cache API in a Worker for static-like responses:
// Cache API for fine-grained control
const cache = caches.default;
app.get('/api/products/:id', async (c) => {
const cacheKey = new Request(c.req.url);
// Check cache
const cached = await cache.match(cacheKey);
if (cached) {
return cached;
}
// Fetch fresh data
const product = await getProduct(c.env.DB, c.req.param('id'));
const response = c.json(product);
response.headers.set('Cache-Control', 's-maxage=300');
// Store in cache
c.executionCtx.waitUntil(cache.put(cacheKey, response.clone()));
return response;
});
When hitting the KV limit of 1 write per second per key:
// Problem: High-frequency counter
await kv.put('page:views', views); // Limited to 1/sec
// Solution: Shard across multiple keys
const SHARD_COUNT = 10;
async function incrementCounter(kv: KVNamespace, key: string) {
  const shard = Math.floor(Math.random() * SHARD_COUNT);
  const shardKey = `${key}:shard:${shard}`;
  // Read-modify-write is not atomic in KV: concurrent writers on the same
  // shard can lose increments, so treat the total as approximate
  const current = Number(await kv.get(shardKey)) || 0;
  await kv.put(shardKey, String(current + 1));
}

async function getCounter(kv: KVNamespace, key: string): Promise<number> {
  // Read all shards in parallel and sum them
  const values = await Promise.all(
    Array.from({ length: SHARD_COUNT }, (_, i) => kv.get(`${key}:shard:${i}`))
  );
  return values.reduce((total, value) => total + (Number(value) || 0), 0);
}
For high-volume time-series data:
-- Partition by month
CREATE TABLE events_2025_01 (
id TEXT PRIMARY KEY,
timestamp TEXT NOT NULL,
data TEXT
);
CREATE TABLE events_2025_02 (
id TEXT PRIMARY KEY,
timestamp TEXT NOT NULL,
data TEXT
);
// Query router in code (TypeScript)
function getEventsTable(date: Date): string {
  // Use UTC so the partition does not depend on the runtime's time zone
  const year = date.getUTCFullYear();
  const month = String(date.getUTCMonth() + 1).padStart(2, '0');
  return `events_${year}_${month}`;
}
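Queries that span several months need the full list of partition tables. The following is a minimal sketch, assuming the `events_YYYY_MM` naming shown above and UTC month boundaries. `tablesForRange` and `buildRangeQuery` are hypothetical helpers, not part of any Cloudflare API.

```typescript
// Hypothetical helper: list every monthly partition between two dates (inclusive).
function tablesForRange(start: Date, end: Date): string[] {
  const tables: string[] = [];
  let year = start.getUTCFullYear();
  let month = start.getUTCMonth(); // 0-based
  const endYear = end.getUTCFullYear();
  const endMonth = end.getUTCMonth();

  while (year < endYear || (year === endYear && month <= endMonth)) {
    tables.push(`events_${year}_${String(month + 1).padStart(2, '0')}`);
    month += 1;
    if (month === 12) {
      month = 0;
      year += 1;
    }
  }
  return tables;
}

// Hypothetical helper: one SELECT per partition, combined with UNION ALL.
// Table names come from tablesForRange, never from user input; the
// timestamp bounds stay as bind parameters (?1, ?2).
function buildRangeQuery(tables: string[]): string {
  return tables
    .map((t) => `SELECT id, timestamp, data FROM ${t} WHERE timestamp BETWEEN ?1 AND ?2`)
    .join(' UNION ALL ');
}
```

The resulting string could then be passed to `db.prepare(...).bind(from, to)`, provided every listed partition table actually exists.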
For multi-tenant applications:
// Tenant-specific D1 databases
interface Bindings {
  DB_DEFAULT: D1Database; // Fallback for tenants without a dedicated database
  DB_TENANT_A: D1Database;
  DB_TENANT_B: D1Database;
  // Or use Hyperdrive for external Postgres
}

function getDbForTenant(env: Bindings, tenantId: string): D1Database {
  const dbMapping: Record<string, D1Database> = {
    'tenant-a': env.DB_TENANT_A,
    'tenant-b': env.DB_TENANT_B,
  };
  return dbMapping[tenantId] ?? env.DB_DEFAULT;
}
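A static mapping stops scaling once tenants outnumber bindings. One option is to hash the tenant ID onto a fixed pool of databases. This is a sketch only: `shardIndexForTenant` and the FNV-1a hash are illustrative choices, not something this skill prescribes. It assumes the pool size is fixed up front, because changing the shard count later remaps tenants to different databases.

```typescript
// FNV-1a 32-bit hash: small, deterministic, no dependencies.
function fnv1a(input: string): number {
  let hash = 0x811c9dc5;
  for (let i = 0; i < input.length; i++) {
    hash ^= input.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193) >>> 0;
  }
  return hash >>> 0;
}

// Hypothetical helper: map a tenant ID to a stable shard index in [0, shardCount).
function shardIndexForTenant(tenantId: string, shardCount: number): number {
  if (!Number.isInteger(shardCount) || shardCount <= 0) {
    throw new RangeError('shardCount must be a positive integer');
  }
  return fnv1a(tenantId) % shardCount;
}
```

With the bindings above, a pool could be built as `[env.DB_TENANT_A, env.DB_TENANT_B]` and indexed with `shardIndexForTenant(tenantId, 2)`.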
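D1 read replication is opt-in and, once enabled, is used through the Sessions API. Independently of replication, Smart Placement can reduce latency between the Worker and the database: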
// Enable Smart Placement in wrangler.jsonc
{
"placement": { "mode": "smart" }
}
// Worker runs near data, reducing latency
For global coordination with regional caching:
// Durable Object for region-local state
export class RegionalCache {
private state: DurableObjectState;
private cache: Map<string, { value: unknown; expires: number }>;
constructor(state: DurableObjectState) {
this.state = state;
this.cache = new Map();
}
async fetch(request: Request): Promise<Response> {
const url = new URL(request.url);
const key = url.searchParams.get('key');
if (request.method === 'GET' && key) {
const cached = this.cache.get(key);
if (cached && cached.expires > Date.now()) {
return Response.json({ value: cached.value, source: 'cache' });
}
return Response.json({ value: null, source: 'miss' });
}
if (request.method === 'PUT' && key) {
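// request.json() is untyped, so state the expected body shape explicitly
const { value, ttl } = (await request.json()) as { value: unknown; ttl: number };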
this.cache.set(key, {
value,
expires: Date.now() + (ttl * 1000),
});
return Response.json({ success: true });
}
return Response.json({ error: 'Invalid request' }, { status: 400 });
}
}
For data that can tolerate staleness:
interface CacheEntry<T> {
data: T;
cachedAt: number;
staleAfter: number;
expireAfter: number;
}
async function getWithStaleWhileRevalidate<T>(
  kv: KVNamespace,
  ctx: ExecutionContext,
  key: string,
  fetcher: () => Promise<T>,
  options: {
    staleAfter: number; // ms until the entry is stale: serve it, revalidate in background
    expireAfter: number; // ms until the entry is expired: force fresh fetch
  }
): Promise<T> {
  const cached = await kv.get<CacheEntry<T>>(key, 'json');
  const now = Date.now();
  if (cached) {
    // Fresh - return immediately
    if (now < cached.staleAfter) {
      return cached.data;
    }
    // Stale but not expired - return stale, revalidate async
    if (now < cached.expireAfter) {
      // Background revalidation: do not await the fetch here; waitUntil
      // keeps the Worker alive until the cache write finishes
      ctx.waitUntil(
        buildCacheEntry(fetcher, options).then((entry) =>
          kv.put(key, JSON.stringify(entry))
        )
      );
      return cached.data;
    }
  }
  // Expired or missing - must fetch fresh
  const entry = await buildCacheEntry(fetcher, options);
  await kv.put(key, JSON.stringify(entry));
  return entry.data;
}
async function buildCacheEntry<T>(
fetcher: () => Promise<T>,
options: { staleAfter: number; expireAfter: number }
): Promise<CacheEntry<T>> {
const now = Date.now();
return {
data: await fetcher(),
cachedAt: now,
staleAfter: now + options.staleAfter,
expireAfter: now + options.expireAfter,
};
}
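The three states that `getWithStaleWhileRevalidate` distinguishes can be seen in isolation. This is a small sketch: `classifyEntry` is a hypothetical helper that mirrors the comparisons above. It assumes, as in `buildCacheEntry`, that the stored `staleAfter` and `expireAfter` are absolute millisecond timestamps.

```typescript
type Freshness = 'fresh' | 'stale' | 'expired';

// Hypothetical helper mirroring the checks in getWithStaleWhileRevalidate.
function classifyEntry(
  entry: { staleAfter: number; expireAfter: number } | null,
  now: number
): Freshness {
  if (entry === null) return 'expired'; // a miss is handled like an expired entry
  if (now < entry.staleAfter) return 'fresh';
  if (now < entry.expireAfter) return 'stale';
  return 'expired';
}

// Example: cached at t=0, stale after 60 s, expired after 300 s.
const exampleEntry = { staleAfter: 60_000, expireAfter: 300_000 };
console.log(classifyEntry(exampleEntry, 30_000)); // fresh
console.log(classifyEntry(exampleEntry, 120_000)); // stale
console.log(classifyEntry(exampleEntry, 300_000)); // expired
```

Only the `stale` state triggers background revalidation. The other two either return the cached value as-is or block on a fresh fetch.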
{
"queues": {
"consumers": [
{
"queue": "events",
"max_batch_size": 100, // Max messages per invocation
"max_concurrency": 20, // Parallel consumer instances
"max_retries": 1,
"dead_letter_queue": "events-dlq"
}
]
}
}
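To size `max_batch_size` and `max_concurrency` against consumer lag, a rough capacity estimate helps. The sketch below is back-of-the-envelope arithmetic, not a Cloudflare formula. It assumes every invocation receives a full batch and takes a fixed time to process it. `estimateThroughput` and `concurrencyNeeded` are hypothetical names.

```typescript
// Rough upper bound on messages processed per second across all consumers.
function estimateThroughput(
  maxBatchSize: number,
  maxConcurrency: number,
  secondsPerBatch: number
): number {
  return (maxBatchSize * maxConcurrency) / secondsPerBatch;
}

// Smallest concurrency that keeps up with a given arrival rate.
function concurrencyNeeded(
  arrivalsPerSecond: number,
  maxBatchSize: number,
  secondsPerBatch: number
): number {
  const perConsumer = maxBatchSize / secondsPerBatch; // messages/s for one consumer
  return Math.ceil(arrivalsPerSecond / perConsumer);
}
```

With the configuration above and an assumed 2 seconds per batch, `estimateThroughput(100, 20, 2)` gives 1000 messages per second. If messages arrive faster than that, the backlog grows until concurrency or batch size is raised.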
// Assumed message shape; adjust to match what producers actually send
interface EventMessage {
  type: string;
  payload: unknown;
}

export default {
  async queue(batch: MessageBatch<EventMessage>, env: Bindings) {
    // Group messages for efficient D1 batching
    const byType = new Map<string, unknown[]>();
    for (const msg of batch.messages) {
      const type = msg.body.type;
      if (!byType.has(type)) byType.set(type, []);
      byType.get(type)!.push(msg.body.payload);
    }
    // Process each type as a batch
    for (const [type, payloads] of byType) {
      await processBatch(type, payloads, env);
    }
    // Ack all at once (if processBatch throws, the whole batch is retried)
    batch.ackAll();
  },
};
async function processBatch(
type: string,
payloads: unknown[],
env: Bindings
) {
// Batch insert to D1 (≤1000 at a time)
const BATCH_SIZE = 1000;
for (let i = 0; i < payloads.length; i += BATCH_SIZE) {
const chunk = payloads.slice(i, i + BATCH_SIZE);
await insertBatch(env.DB, type, chunk);
}
}
// Problem: Loading entire file into memory
const data = await r2.get(key);
const json = await data.json(); // May exceed 128MB
// Solution: Stream processing
app.get('/export/:key', async (c) => {
const object = await c.env.R2.get(c.req.param('key'));
if (!object) return c.json({ error: 'Not found' }, 404);
return new Response(object.body, {
headers: {
'Content-Type': object.httpMetadata?.contentType ?? 'application/octet-stream',
'Content-Length': String(object.size),
},
});
});
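// For files >50MB, process in chunks with checkpointing
export class ChunkedProcessor {
  constructor(
    private state: DurableObjectState,
    private env: { R2: R2Bucket }
  ) {}

  async processFile(r2Key: string, chunkSize: number = 1024 * 1024) {
    // Get checkpoint
    const offset = (await this.state.storage.get<number>('offset')) ?? 0;
    // head() returns the total size, so no range is requested past the end
    const head = await this.env.R2.head(r2Key);
    if (!head || offset >= head.size) {
      // Object missing or processing complete
      await this.state.storage.delete(['offset', 'r2Key']);
      return { complete: true };
    }
    const object = await this.env.R2.get(r2Key, {
      range: { offset, length: Math.min(chunkSize, head.size - offset) },
    });
    if (!object) return { complete: true };
    // Process chunk
    await this.processChunk(await object.arrayBuffer());
    // Save checkpoint and remember the key for the alarm handler
    await this.state.storage.put({ offset: offset + chunkSize, r2Key });
    // Schedule next chunk via alarm
    await this.state.storage.setAlarm(Date.now() + 100);
    return { complete: false, offset: offset + chunkSize };
  }

  // Alarm handler: resume from the saved checkpoint
  async alarm() {
    const r2Key = await this.state.storage.get<string>('r2Key');
    if (r2Key) await this.processFile(r2Key);
  }

  // Placeholder: application-specific chunk handling goes here
  private async processChunk(chunk: ArrayBuffer): Promise<void> {}
}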
| Scaling Strategy | Additional Cost | When to Use |
|---|---|---|
| KV caching | $0.50/M reads | D1 read heavy |
| Key sharding | More KV reads | >1 write/sec/key |
| Time partitioning | None (same D1) | >10GB data |
| Tiered Cache | None (CDN) | Cacheable responses |
| DO coordination | CPU time | Global state |
| Queue scaling | Per message | High throughput |
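The KV caching row can be turned into a quick estimate. This sketch assumes the $0.50 per million reads figure from the table and ignores any included allowance as well as write and storage charges. `estimateKvReadCost` and `kvReadsAfterEdgeCache` are hypothetical helpers.

```typescript
// Price taken from the table above; check current pricing before relying on it.
const KV_READ_PRICE_PER_MILLION_USD = 0.5;

// Monthly KV read cost in USD for a given number of reads.
function estimateKvReadCost(readsPerMonth: number): number {
  return (readsPerMonth / 1_000_000) * KV_READ_PRICE_PER_MILLION_USD;
}

// Requests that still reach KV after the edge cache absorbs its share.
function kvReadsAfterEdgeCache(requestsPerMonth: number, edgeHitRatio: number): number {
  return Math.round(requestsPerMonth * (1 - edgeHitRatio));
}
```

For example, 100 million requests per month with an 80% edge hit ratio leaves 20 million KV reads, or about $10 per month under these assumptions.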
| Anti-pattern | Problem | Solution |
|---|---|---|
| Cache everything | KV costs add up | Cache hot data only |
| Shard too early | Complexity without benefit | Monitor first |
| Ignore TTLs | Stale data | Set appropriate TTLs |
| Skip DLQ | Lost messages | Always add DLQ |
| Over-replicate | Cost multiplication | Right-size replication |