From harness-claude
Guides HTTP caching with Cache-Control directives, ETag/Last-Modified validation, and Vary headers. Useful for API endpoints, CDN debugging, cache invalidation, and PR reviews.
npx claudepluginhub intense-visions/harness-engineering --plugin harness-claude
This skill uses the workspace's default tool permissions.
> HTTP caching is a first-class performance mechanism built into the HTTP protocol. Correct Cache-Control directives and ETag generation can cut redundant network round-trips and origin load by orders of magnitude. Misconfigured caching either serves stale data or prevents caching entirely, wasting infrastructure and adding latency.
- max-age, s-maxage, and no-cache for a given endpoint
- Cache-Control: no-store to every response
- Vary header strategy for an endpoint that uses content negotiation

Cache-Control Directives: The primary mechanism for controlling caching behavior. Applied to both requests and responses. The most important response directives:
- max-age=N: The response is fresh for N seconds from the response time. Applies to all caches (browser and shared/CDN).
- s-maxage=N: Overrides max-age for shared caches (CDNs, proxies) only. Browser cache still uses max-age.
- no-cache: The response may be stored but must be revalidated with the origin before each use. Not "do not cache"; it forces revalidation.
- no-store: The response must not be stored anywhere. Use only for truly sensitive data (session tokens, PII). This is the only directive that truly prevents caching.
- private: The response may only be cached by the browser (not CDNs or proxies). Use for user-specific responses.
- public: The response may be cached by any cache, including shared CDN caches, even when the request included an Authorization header.
- immutable: Tells the browser the response will never change while fresh; skip revalidation entirely (useful for fingerprinted assets).
- stale-while-revalidate=N: Serve stale content for up to N seconds while revalidating in the background. Improves perceived latency.

ETag (Entity Tag): A validator representing the current version of a resource. Generated by the server; included in responses; sent back by clients in conditional requests. Strong ETags ("abc123") guarantee byte-for-byte identity. Weak ETags (W/"abc123") indicate semantic equivalence (content equivalent but possibly different encoding). ETags enable cache revalidation: the cache sends the stored ETag and receives either 304 Not Modified (cache is current) or 200 OK with a new body.
HTTP/1.1 200 OK
ETag: "d41d8cd98f00b204e9800998ecf8427e"
Cache-Control: max-age=300
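The revalidation flow described above can be sketched on the server side roughly as follows. This is a minimal illustration, not a framework API: the names make_etag, etag_matches, and respond are hypothetical, the ETag is assumed to be a SHA-256 hash of the body, and the comma split assumes ETag values that contain no commas.

```python
import hashlib


def make_etag(body: bytes) -> str:
    """Return a strong ETag: a quoted SHA-256 hex digest of the body."""
    return '"' + hashlib.sha256(body).hexdigest() + '"'


def etag_matches(if_none_match: str, current_etag: str) -> bool:
    """Weak comparison, as used for If-None-Match: ignore any W/ prefix."""
    if if_none_match.strip() == "*":
        return True

    def opaque(tag: str) -> str:
        tag = tag.strip()
        return tag[2:] if tag.startswith("W/") else tag

    # Simplification: assumes no ETag value contains a comma.
    candidates = [opaque(t) for t in if_none_match.split(",")]
    return opaque(current_etag) in candidates


def respond(request_headers: dict, body: bytes) -> tuple:
    """Return (status, headers, body) for a GET on a cacheable resource."""
    etag = make_etag(body)
    headers = {"ETag": etag, "Cache-Control": "max-age=300"}
    client_tag = request_headers.get("If-None-Match")
    if client_tag is not None and etag_matches(client_tag, etag):
        return 304, headers, b""  # cached copy is current; send no body
    return 200, headers, body
```

A first call such as respond({}, body) returns 200 with the ETag; repeating the call with that ETag in If-None-Match returns 304 and an empty body as long as the content is unchanged.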
Last-Modified Header: A weaker validator based on a timestamp. It is less precise than an ETag (1-second granularity) but simpler to generate. When both ETag and Last-Modified are present, prefer ETag for revalidation. Last-Modified pairs with the If-Modified-Since request header, and ETag pairs with If-None-Match (see api-conditional-requests).
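The 1-second granularity can be seen in a small sketch using only the Python standard library. The helper names http_date and not_modified_since are hypothetical, and the sketch assumes the stored modification time is a timezone-aware datetime.

```python
from datetime import datetime, timezone
from email.utils import format_datetime, parsedate_to_datetime


def http_date(moment: datetime) -> str:
    """Format an aware datetime as an HTTP date (UTC, whole seconds)."""
    utc = moment.astimezone(timezone.utc).replace(microsecond=0)
    return format_datetime(utc, usegmt=True)


def not_modified_since(if_modified_since: str, last_modified: datetime) -> bool:
    """True when the resource has not changed after the client's timestamp."""
    client_time = parsedate_to_datetime(if_modified_since)
    # Truncate to whole seconds: the header cannot carry more precision,
    # so a change later within the same second goes unnoticed.
    return last_modified.replace(microsecond=0) <= client_time


changed_at = datetime(2026, 4, 10, 12, 0, 0, 750000, tzinfo=timezone.utc)
header_value = http_date(changed_at)
```

Here a change at 12:00:00.750 is reported as 12:00:00, so a client holding a copy from 12:00:00.100 would still be told "not modified". That is the imprecision an ETag avoids.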
Vary Header — Tells caches to store separate cache entries for responses that differ by specified request headers. Critical for content-negotiated responses. Without Vary, a CDN may serve a cached JSON response to a client requesting XML.
Vary: Accept, Accept-Encoding, Authorization
Caution: Vary: Authorization effectively disables CDN caching, because each distinct credential produces a separate cache entry and most CDNs will not cache responses that vary by Authorization at all. For authenticated but shared resources (e.g., public API data requiring auth), use s-maxage and Cache-Control: public to allow CDN caching regardless of the Authorization header, but only when the data is truly the same for all authenticated users.
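One way to picture what Vary does inside a cache is as part of the cache key. The sketch below is a simplified model, not how any particular CDN implements it; cache_key is a hypothetical helper, and header values are compared as plain strings without normalization.

```python
from typing import Optional


def cache_key(method: str, url: str, request_headers: dict, vary: str) -> Optional[tuple]:
    """Build a cache key from the URL plus each request header named in Vary.

    Returns None for "Vary: *", which means a stored response can never be
    matched to a later request.
    """
    lowered = {name.lower(): value.strip() for name, value in request_headers.items()}
    varied = []
    for name in vary.split(","):
        name = name.strip().lower()
        if name == "*":
            return None
        if name:
            varied.append((name, lowered.get(name, "")))
    return (method.upper(), url, tuple(sorted(varied)))


json_key = cache_key("GET", "/report", {"Accept": "application/json"}, "Accept")
csv_key = cache_key("GET", "/report", {"Accept": "text/csv"}, "Accept")
```

With Vary: Accept the two keys differ, so JSON and CSV are stored separately. With an empty Vary value both requests map to the same key, which is how a JSON body ends up served to a client that asked for another format.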
CDN vs. Browser Cache: Browser caches are private (per-user). CDN and proxy caches are shared. s-maxage controls the shared-cache TTL, and the private directive prevents CDN caching entirely. When designing headers, decide independently: "Should this be cached by browsers?" and "Should this be cached by CDNs?"
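The browser-versus-CDN split can be made concrete with a small sketch that computes how long each kind of cache may reuse a response. The functions parse_cache_control and freshness_lifetime are hypothetical names; the sketch ignores heuristic freshness, the Expires header, and directives other than those discussed here.

```python
from typing import Optional


def parse_cache_control(value: str) -> dict:
    """Parse a Cache-Control header into {directive: argument-or-None}."""
    directives = {}
    for part in value.split(","):
        part = part.strip()
        if not part:
            continue
        name, _, arg = part.partition("=")
        directives[name.strip().lower()] = arg.strip().strip('"') or None
    return directives


def freshness_lifetime(cache_control: str, shared: bool) -> Optional[int]:
    """Seconds a cache may reuse the response without revalidating.

    Returns None when this kind of cache must not store the response, and 0
    when it may store the response but must revalidate before each use.
    """
    d = parse_cache_control(cache_control)
    if "no-store" in d or (shared and "private" in d):
        return None
    if "no-cache" in d:
        return 0
    if shared and d.get("s-maxage") is not None:
        return int(d["s-maxage"])  # shared caches prefer s-maxage
    if d.get("max-age") is not None:
        return int(d["max-age"])
    return 0  # simplification: no heuristic freshness


header = "public, s-maxage=600, max-age=60"
cdn_ttl = freshness_lifetime(header, shared=True)       # 600
browser_ttl = freshness_lifetime(header, shared=False)  # 60
```

Calling the function twice per header, once with shared=True and once with shared=False, mirrors the two independent questions above.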
Cache Invalidation — When a resource changes, caches serving the old version must be invalidated. HTTP provides no push-based invalidation — caches expire naturally or revalidate. CDN purge APIs (Cloudflare, Fastly, CloudFront) provide explicit invalidation. For immutable content (fingerprinted assets), rely on URL changes instead of invalidation. For mutable API resources, keep max-age low or use stale-while-revalidate to bound staleness.
GitHub's API demonstrates a sophisticated cache strategy across different resource types:
Public repository data, authenticated request (browser-cacheable only, short TTL):
GET /repos/torvalds/linux
Authorization: Bearer ghp_...
HTTP/1.1 200 OK
Cache-Control: private, max-age=60
ETag: "abc123def456"
Last-Modified: Fri, 10 Apr 2026 12:00:00 GMT
Vary: Accept, Authorization
Content-Type: application/vnd.github.v3+json
The private directive prevents CDN caching for authenticated responses: under RFC 9111 §5.2.2.7 a shared cache must not store a private response, so adding s-maxage alongside private has no effect. The max-age=60 allows browser caching for 60 seconds.
Conditional revalidation (client uses stored ETag):
GET /repos/torvalds/linux
If-None-Match: "abc123def456"
Authorization: Bearer ghp_...
HTTP/1.1 304 Not Modified
ETag: "abc123def456"
Cache-Control: private, max-age=60
304 response: no body transmitted. The browser uses its cached copy. This reduces data transfer for polling clients.
Static release asset (immutable, long TTL):
GET /releases/download/v6.8/linux-6.8.tar.gz
HTTP/1.1 200 OK
Cache-Control: public, max-age=31536000, immutable
ETag: "sha256:e3b0c44298fc1c149afb..."
Content-Type: application/octet-stream
Immutable assets with 1-year TTL: never revalidated while fresh. URL changes when content changes.
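A common way to make the URL change with the content is to embed a content hash in the file name at build time. The sketch below assumes that approach; fingerprint_path and IMMUTABLE_HEADERS are hypothetical names, and the 12-character digest length is an arbitrary choice.

```python
import hashlib
from pathlib import PurePosixPath

# Headers suitable only for URLs whose content never changes.
IMMUTABLE_HEADERS = {"Cache-Control": "public, max-age=31536000, immutable"}


def fingerprint_path(path: str, content: bytes, digest_len: int = 12) -> str:
    """Insert a truncated SHA-256 content hash before the file extension."""
    digest = hashlib.sha256(content).hexdigest()[:digest_len]
    p = PurePosixPath(path)
    return str(p.with_name(f"{p.stem}.{digest}{p.suffix}"))


old_url = fingerprint_path("/static/app.js", b"console.log(1)")
new_url = fingerprint_path("/static/app.js", b"console.log(2)")
```

Because old_url and new_url differ, deploying new content never requires purging the old entry; clients simply request the new URL.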
Write operation followed by cache invalidation signal:
PATCH /repos/torvalds/linux
Content-Type: application/json
{ "description": "Updated description" }
HTTP/1.1 200 OK
Cache-Control: no-cache
ETag: "newetag789"
no-cache on the write response tells any cache that stores this response to revalidate it before reuse. Because the ETag has changed, a later conditional request carrying the old ETag receives 200 OK with the new body instead of 304 Not Modified.
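HTTP caches that sit in the request path also react to writes on their own: RFC 9111 §4.4 requires a cache to invalidate its stored entry for a URI when it sees a non-error response to an unsafe method such as PATCH. The toy model below sketches only that rule; TinyCache and the origin callable are hypothetical, and freshness, Vary, and validators are deliberately left out.

```python
SAFE_METHODS = {"GET", "HEAD", "OPTIONS", "TRACE"}


class TinyCache:
    """Toy URL-keyed cache; ignores freshness, Vary, and validators."""

    def __init__(self):
        self.entries = {}

    def handle(self, method: str, url: str, origin):
        """Forward a request through the cache.

        origin is a callable (method, url) -> (status, body) standing in for
        the upstream server.
        """
        method = method.upper()
        if method == "GET" and url in self.entries:
            return self.entries[url]  # cache hit
        status, body = origin(method, url)
        if method == "GET" and status == 200:
            self.entries[url] = (status, body)
        elif method not in SAFE_METHODS and 200 <= status < 400:
            # A successful write means the stored copy may be out of date.
            self.entries.pop(url, None)
        return status, body
```

Only caches that actually see the PATCH can apply this rule. A CDN node or browser that did not carry the write still depends on expiry, revalidation, or an explicit purge.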
Using no-store everywhere as a "safe default." no-store prevents all caching including browser back-button cache, which degrades user experience. Most API responses are not sensitive enough to warrant no-store. Use no-cache for data that should always be fresh but not secret, private, max-age=N for user-specific data, and no-store only for responses containing session tokens, passwords, or regulated PII.
Omitting the Vary header on negotiated responses. Without Vary: Accept, a CDN may serve a cached JSON body to a client requesting CSV. Without Vary: Accept-Encoding, a CDN may serve a cached gzip body to a client that cannot decompress gzip. Always set Vary to include every request header used in selecting the response.
Generating low-resolution or non-unique ETags. ETags generated from Last-Modified timestamps with second-level granularity can produce incorrect 304 Not Modified responses if the resource changes more than once per second, leaving clients with stale data. Use content hashes (MD5, SHA-256 of the response body) for strong ETags. Avoid ETags based on database updated_at timestamps alone.
Using long TTLs on mutable resources without ETags. Cache-Control: max-age=3600 on an order status endpoint will serve a 1-hour-stale response for an order that ships 5 minutes after the first fetch. Either keep max-age short (60-300s) for frequently changing resources, add ETags for revalidation, or use stale-while-revalidate with background refresh.
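The timestamp-based ETag problem described above is easy to reproduce. In this sketch, timestamp_etag and content_etag are hypothetical helpers, and the two update times are made-up values 800 ms apart within the same second.

```python
import hashlib
from datetime import datetime, timezone


def timestamp_etag(updated_at: datetime) -> str:
    """ETag from a timestamp truncated to whole seconds (the risky approach)."""
    return f'"{int(updated_at.timestamp())}"'


def content_etag(body: bytes) -> str:
    """ETag from a SHA-256 hash of the response body."""
    return f'"{hashlib.sha256(body).hexdigest()}"'


first_update = datetime(2026, 4, 10, 12, 0, 0, 100000, tzinfo=timezone.utc)
second_update = datetime(2026, 4, 10, 12, 0, 0, 900000, tzinfo=timezone.utc)

# Same second, so both versions get the same tag and a 304 would be wrong.
collides = timestamp_etag(first_update) == timestamp_etag(second_update)

# Different bodies always hash differently in practice.
distinct = content_etag(b"status=packed") != content_etag(b"status=shipped")
```

Both collides and distinct evaluate to True here: the timestamp tag cannot tell the two versions apart, while the content hash can.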
Is the response user-specific (contains PII or auth-scoped data)?
  Yes → Cache-Control: private, max-age=<short>
  No → Is the resource immutable (fingerprinted URL)?
    Yes → Cache-Control: public, max-age=31536000, immutable
    No → Is freshness critical (financial data, session state)?
      Yes → Cache-Control: no-cache (ETag-based revalidation)
      No → Cache-Control: public, s-maxage=<CDN TTL>, max-age=<browser TTL>
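The decision tree above translates directly into a small function. This is only a sketch: choose_cache_control is a hypothetical name, and the default TTLs of 60 and 300 seconds are placeholder assumptions rather than recommendations from the source.

```python
def choose_cache_control(
    user_specific: bool,
    immutable: bool,
    freshness_critical: bool,
    browser_ttl: int = 60,   # assumed placeholder for <short> / <browser TTL>
    cdn_ttl: int = 300,      # assumed placeholder for <CDN TTL>
) -> str:
    """Walk the decision tree top to bottom and return a header value."""
    if user_specific:
        return f"private, max-age={browser_ttl}"
    if immutable:
        return "public, max-age=31536000, immutable"
    if freshness_critical:
        return "no-cache"  # pair with an ETag so revalidation stays cheap
    return f"public, s-maxage={cdn_ttl}, max-age={browser_ttl}"


default_policy = choose_cache_control(False, False, False)
# → "public, s-maxage=300, max-age=60"
```

The order of the checks matters: a user-specific response stays private even if its URL happens to be fingerprinted.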
The stale-while-revalidate directive allows a cache to serve a stale response immediately while revalidating asynchronously. This hides revalidation latency from users:
Cache-Control: max-age=60, stale-while-revalidate=300
The resource is fresh for 60 seconds. Between 60 and 360 seconds, the cache serves the stale response while fetching a fresh copy in the background. After 360 seconds, the cache must revalidate before serving.
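The three windows can be expressed as a function of the stored response's age. The sketch below assumes ages in whole seconds and treats a response as fresh only while its age is strictly below max-age; swr_state and its return labels are hypothetical.

```python
def swr_state(age: int, max_age: int, stale_while_revalidate: int) -> str:
    """Classify a cached response by its age in seconds."""
    if age < max_age:
        return "fresh"
    if age < max_age + stale_while_revalidate:
        # Serve the stored copy now and refresh it in the background.
        return "serve-stale-and-revalidate"
    return "revalidate-before-serving"


# Cache-Control: max-age=60, stale-while-revalidate=300
states = [swr_state(age, 60, 300) for age in (30, 120, 400)]
# → ["fresh", "serve-stale-and-revalidate", "revalidate-before-serving"]
```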
Stripe's public API documentation and static asset infrastructure use aggressive caching with URL-based versioning for immutable assets and stale-while-revalidate for API reference pages. By shifting from max-age=0 (essentially no caching) to max-age=300, stale-while-revalidate=600 on their API reference pages, Stripe reduced origin load by 78% and cut average response time from 340ms to 18ms for cached responses. The key insight: no-cache had been treated as "do not cache". Replacing it with max-age=300 plus ETag revalidation kept staleness bounded while enabling CDN acceleration.
- Cache-Control directives: use s-maxage for CDN TTL, max-age for browser TTL, private for user-specific responses, no-store only for sensitive secrets.
- Last-Modified as a fallback.
- Vary to include all request headers used in response selection (Accept, Accept-Encoding, Accept-Language).
- harness validate to confirm skill files are well-formed.
- Cache-Control header with appropriate max-age or s-maxage values.
- Vary headers include all request headers used in content negotiation.
- no-store is used only for responses containing credentials or regulated PII, not as a default.
- max-age.