From harness-claude
Designs and optimizes CDN architectures: tiered caching, origin shielding, edge compute, cache hit ratio optimization, multi-CDN strategies, geographic routing. For high TTFB, origin overload, low cache hits, global apps.
npx claudepluginhub intense-visions/harness-engineering --plugin harness-claude
This skill uses the workspace's default tool permissions.
> Design and optimize Content Delivery Network architecture — tiered caching, origin shielding, edge compute patterns, cache hit ratio optimization, multi-CDN strategies, and geographic routing for globally distributed applications.
Understand CDN architecture. A CDN consists of Points of Presence (PoPs) distributed globally. Each PoP contains edge servers that cache content. When a user requests a resource, DNS (typically anycast) routes them to the nearest PoP.
User Request Flow:
User → DNS (anycast) → Nearest PoP Edge → [Cache HIT]  → Response
                                        → [Cache MISS] → Shield PoP → Origin → Response
Configure tiered caching. Most CDNs support multi-tier caching: edge PoPs serve users directly, while a smaller shield tier sits between the edges and your origin.
Enable origin shielding to collapse multiple edge misses into a single origin request:
Without shielding:  Edge-A miss → Origin
                    Edge-B miss → Origin
                    Edge-C miss → Origin               (3 origin requests)

With shielding:     Edge-A miss → Shield → Origin
                    Edge-B miss → Shield → (cache hit)
                    Edge-C miss → Shield → (cache hit) (1 origin request)
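The collapse the shield performs is essentially request coalescing: concurrent misses for the same cache key share a single in-flight origin request. A minimal sketch of the idea (`fetchOrigin` is a stand-in for the real origin call, not a CDN API):

```javascript
// Sketch of shield-style request coalescing: concurrent misses for the
// same cache key share one in-flight origin request.
const inflight = new Map();

async function shieldFetch(key, fetchOrigin) {
  // If a fetch for this key is already in flight, reuse its promise
  if (inflight.has(key)) return inflight.get(key);
  const promise = fetchOrigin(key).finally(() => inflight.delete(key));
  inflight.set(key, promise);
  return promise;
}
```

Real shields add a cache in front of this, so later requests hit the stored response rather than the in-flight promise.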
Optimize cache hit ratio. Target >95% cache hit ratio for static assets, >80% for dynamic content. Strategies: normalize cache keys, set content-appropriate TTLs, and use the Vary header correctly (only vary on headers that actually change the response).

Implement edge compute for dynamic content. Edge compute runs application logic at the CDN edge, eliminating origin roundtrips for common operations:
// Cloudflare Worker example: A/B testing at the edge
export default {
  async fetch(request) {
    // Assign the bucket from an existing cookie; default to 'A'
    const bucket = request.headers.get('cookie')?.includes('ab=B') ? 'B' : 'A';
    // Rewrite the path so each bucket maps to its own cached variant
    const url = new URL(request.url);
    url.pathname = `/${bucket}${url.pathname}`;
    return fetch(url);
  },
};
Design cache keys carefully. The cache key determines what constitutes a unique cacheable response. Poor cache key design causes either cache pollution (too many unique entries) or incorrect responses (too few entries).
Good cache key: scheme + host + path + normalized-query
Bad cache key:  scheme + host + path + all-headers + cookies
                (creates per-user cache entries, ~0% hit rate)
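Normalization is what turns the "bad" key into the "good" one: keep only the query parameters that actually change the response and sort them so parameter order doesn't fragment the cache. A sketch, where the allowlist contents are illustrative:

```javascript
// Build a normalized cache key: strip non-functional query parameters
// (analytics tags, etc.) and sort the rest for a stable key.
const FUNCTIONAL_PARAMS = new Set(['page', 'sort', 'lang']); // illustrative allowlist

function cacheKey(rawUrl) {
  const url = new URL(rawUrl);
  const kept = [...url.searchParams.entries()]
    .filter(([name]) => FUNCTIONAL_PARAMS.has(name))
    .sort(([a], [b]) => a.localeCompare(b));
  const query = kept.map(([k, v]) => `${k}=${v}`).join('&');
  return `${url.protocol}//${url.host}${url.pathname}${query ? '?' + query : ''}`;
}
```

With this, `?utm_source=twitter&page=2` and `?page=2` map to the same cache entry.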
Configure multi-CDN for redundancy. DNS-based routing directs traffic to the best-performing CDN using health checks or real-user measurements, and fails over automatically when one provider degrades.
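The steering decision a managed DNS service makes can be sketched as: pick the healthy provider with the lowest measured latency, dropping any provider that fails health checks. Field names here are illustrative, not any vendor's API:

```javascript
// Sketch of multi-CDN steering: choose the healthy provider with the
// lowest measured latency; a failed health check removes it from rotation.
function pickCdn(providers) {
  const healthy = providers.filter((p) => p.healthy);
  if (healthy.length === 0) return null; // all providers down: surface an error upstream
  return healthy.reduce((best, p) => (p.latencyMs < best.latencyMs ? p : best));
}
```

Production steering is usually per-region (latency measured from the user's resolver location), but the selection rule is the same.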
Monitor CDN performance. Track these metrics: cache hit ratio (edge and shield separately), TTFB by region, origin request rate, 4xx/5xx error rates, and purge propagation time.
CDNs use anycast — the same IP address is advertised from every PoP via BGP. The internet's routing infrastructure directs each user to the geographically nearest PoP. Anycast provides automatic failover: if a PoP goes offline, BGP reconverges and traffic routes to the next nearest PoP. Typical failover time: 10-30 seconds.
After a deploy or cache purge, hit rates temporarily drop. Warming strategies: pre-fetch the most-requested URLs immediately after the purge, prefer soft purges (mark entries stale rather than deleting them) where the CDN supports them, and use stale-while-revalidate so users are served from cache while entries refresh in the background.
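Warming can be as simple as replaying the top-N URLs through the edge so the first real users hit warm caches. A sketch, sequential for clarity, assuming a global `fetch` as in Node 18+ (the `fetchFn` parameter exists only to make the sketch testable):

```javascript
// Sketch of a cache-warming pass: request the most-popular URLs so the
// edge repopulates before real traffic arrives.
async function warmCache(urls, fetchFn = fetch) {
  const results = [];
  for (const url of urls) {
    // A plain GET is enough; the edge caches the response on the way through
    const res = await fetchFn(url);
    results.push({ url, status: res.status });
  }
  return results;
}
```

In practice you would rate-limit and parallelize this, and source the URL list from recent access logs.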
Netflix serves 125 million hours of video daily through their Open Connect CDN. Instead of using a traditional CDN, Netflix deploys custom cache appliances directly inside ISP networks. Each appliance stores the most popular content for that region (determined by machine learning prediction). This approach reduces internet transit traffic by 70% and achieves sub-10ms latency for video streams. The key insight: for high-bandwidth content, pushing cache servers into the last-mile network eliminates backbone traversal entirely.
The Guardian improved First Contentful Paint by 1.2 seconds by moving from a single-origin architecture to a CDN-first design with edge-side includes (ESI). Static page shells are cached at the edge with a 60-second TTL. Dynamic components (personalized recommendations, live scores) are fetched client-side or via ESI fragments with separate TTLs. The result: 95% of page views are served entirely from CDN edge with <50ms TTFB, compared to the previous 300-800ms origin response time. Origin requests dropped by 85%.
Caching personalized content without proper Vary headers. If your CDN caches a page with user-specific content (name, cart, recommendations) without varying on the correct identifier, it serves user A's data to user B. Personalized content must either use Vary headers correctly, be excluded from caching, or be fetched client-side after the cached shell loads.
Not configuring origin shielding. Without shielding, a cache expiry causes every edge PoP to simultaneously request from origin. With 200 PoPs, this creates a thundering herd of 200 concurrent origin requests for the same content. Origin shielding collapses these to 1 request.
Short TTLs globally instead of tiered TTLs. Setting Cache-Control: max-age=60 on all content means even static assets (CSS, JS, images) are revalidated every minute. Use content-type-specific TTLs: static assets with content hashes get max-age=31536000, HTML gets max-age=60, stale-while-revalidate=3600.
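The tiered-TTL policy above can live in a small piece of origin middleware that picks Cache-Control by content type. A sketch using the values from the rule of thumb (the extension list is illustrative):

```javascript
// Sketch: choose Cache-Control per content type, mirroring the tiered-TTL
// rule of thumb (immutable hashed static assets vs. short-lived HTML).
function cacheControlFor(path) {
  if (/\.(css|js|png|jpg|woff2)$/.test(path)) {
    // Content-hashed static assets: cache for a year, mark immutable
    return 'public, max-age=31536000, immutable';
  }
  if (path.endsWith('.html') || !path.includes('.')) {
    // HTML: short TTL, serve stale while revalidating in the background
    return 'public, max-age=60, stale-while-revalidate=3600';
  }
  return 'public, max-age=300';
}
```

The `immutable` directive only makes sense when filenames contain content hashes, so a changed asset gets a new URL rather than an invalidation.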
Ignoring CDN cache key design. Default cache keys often include all query parameters, cookies, and sometimes headers. A URL with analytics parameters (?utm_source=twitter&utm_medium=social) creates a different cache entry than the base URL, reducing hit rates. Strip non-functional query parameters from cache keys.
Using a single CDN region for global traffic. Deploying origin servers in a single region (e.g., us-east-1) and relying solely on CDN edge caching creates a single point of failure and high shield-to-origin latency for distant PoPs. Deploy origin replicas in at least two regions and configure the CDN shield tier to route to the nearest origin.
Purging entire cache zones instead of targeted invalidation. When a single asset changes, purging the entire cache (sometimes called a "zone purge") drops hit rates to zero temporarily and creates a thundering herd against the origin. Use surrogate keys or cache tags to purge only the specific resources that changed. Fastly, Cloudflare, and Akamai all support tag-based purging.
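The mechanics of tag-based purging can be sketched with a toy cache: each entry carries a set of tags, and purging a tag evicts only the entries that carry it. This is a model of the behavior, not any CDN's API:

```javascript
// Sketch: tag cache entries with surrogate keys so one changed resource
// purges only the pages that embed it, not the whole zone.
class TaggedCache {
  constructor() {
    this.entries = new Map(); // url -> { body, tags }
  }
  set(url, body, tags) {
    this.entries.set(url, { body, tags: new Set(tags) });
  }
  get(url) {
    return this.entries.get(url)?.body;
  }
  purgeTag(tag) {
    // Drop every entry carrying the tag; everything else stays warm
    for (const [url, entry] of this.entries) {
      if (entry.tags.has(tag)) this.entries.delete(url);
    }
  }
}
```

At the origin, you attach the tags by emitting them on responses (Fastly reads a space-separated `Surrogate-Key` header, for example), and the CDN maintains the tag-to-entry index for you.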
Edge compute is ideal for latency-sensitive, low-complexity operations: URL rewrites, A/B test bucket assignment, geolocation-based routing, bot detection, and request/response header manipulation. Avoid running complex business logic at the edge — database access from edge functions introduces unpredictable latency depending on the PoP-to-database distance, and debugging distributed edge deployments is significantly harder than centralized origin debugging. A good rule of thumb: if the operation needs a database query, keep it at the origin. If it can run with only the request context and a key-value store, it belongs at the edge.
Cache invalidation is often the hardest part of CDN management. Three primary approaches: TTL-based expiration (content expires and revalidates naturally), purge-on-publish (explicitly invalidate URLs when content changes), and tag-based purging (invalidate groups of related resources via surrogate keys).
For most applications, combining TTL-based expiration with stale-while-revalidate provides the best balance: Cache-Control: max-age=300, stale-while-revalidate=3600 serves cached content for 5 minutes, then serves stale for up to an hour while revalidating in the background.
When using purge-on-publish, always implement rate limiting on purge API calls. A misconfigured deploy pipeline that purges thousands of URLs in a tight loop can overwhelm the CDN's purge infrastructure and cause cascading cache misses across all PoPs.
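The rate limit can be sketched as a token bucket in front of the purge API: up to N purges go out per interval, and anything beyond that is queued for the next refill instead of flooding the CDN. Names and the refill wiring are illustrative:

```javascript
// Sketch of a token-bucket rate limiter for purge API calls: a runaway
// deploy loop gets queued instead of flooding the CDN's purge endpoint.
class PurgeLimiter {
  constructor(maxPerInterval) {
    this.max = maxPerInterval;
    this.tokens = maxPerInterval;
    this.queue = [];
  }
  refill() {
    // Call once per interval (e.g. from setInterval) to release queued purges
    this.tokens = this.max;
    while (this.tokens > 0 && this.queue.length > 0) {
      this.tokens--;
      this.queue.shift()();
    }
  }
  purge(url, sendPurge) {
    if (this.tokens > 0) {
      this.tokens--;
      sendPurge(url);
    } else {
      this.queue.push(() => sendPurge(url));
    }
  }
}
```

A useful side effect of queuing (rather than dropping) is that deduplicating the queue also collapses repeated purges of the same URL into one API call.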