Why 100 cache misses don't equal 100 database hits.
A Content Delivery Network (CDN) has thousands of "Edge" nodes spread across the world (e.g. one in Tokyo, one in London). If a viral video is requested in Tokyo, the Tokyo Edge node fetches it from your Origin Server. But what if 100 different Edge nodes all get a cache miss simultaneously? To protect your Origin, CDNs use a "Tiered" hierarchy, placing a massive "Regional Edge Cache" between the local nodes and your server.
When multiple Edge nodes experience a cache miss, they forward the request to the Regional Tier. The Regional Tier recognizes they are all asking for the same file, "collapses" them into a single request, and only asks your Origin Server ONCE.
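# Inside the Regional Tier cache node
import threading
from concurrent.futures import Future

cache = {}               # url -> data; stand-in for the node's local cache
inflight_requests = {}   # url -> Future for a fetch already in progress
lock = threading.Lock()  # makes "check, then register" a single atomic step
def get_file(url):
    with lock:
        # 1. Check local cache
        if url in cache:
            return cache[url]
        # 2. REQUEST COLLAPSING
        # If we are already fetching this URL from Origin, just wait!
        future = inflight_requests.get(url)
        is_first = future is None
        if is_first:
            # 3. We are the first! Register a future for others to wait on.
            future = inflight_requests[url] = Future()
    if not is_first:
        return future.result()  # Blocks until the first request finishes
    try:
        data = fetch_from_origin(url)  # Origin client, defined elsewhere
        with lock:
            cache[url] = data
        future.set_result(data)  # Wake up all waiting Edge nodes
        return data
    except Exception as exc:
        future.set_exception(exc)  # Waiters get the error instead of hanging
        raise
    finally:
        with lock:
            del inflight_requests[url]  # Cleanup, even if the fetch failed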
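The collapsing logic above can be exercised end to end. The following is a minimal sketch in asyncio style, not a description of how any particular CDN implements it. The class name `CollapsingCache`, the `slow_origin` stand-in, and the URL `/viral.mp4` are all hypothetical; the origin is simulated with a short sleep so that every simulated Edge request arrives while the first fetch is still in flight.

```python
import asyncio


class CollapsingCache:
    """Toy Regional Tier: concurrent misses for one URL share a single fetch."""

    def __init__(self, fetch_from_origin):
        self._fetch = fetch_from_origin  # async callable: url -> data
        self._cache = {}                 # url -> data
        self._inflight = {}              # url -> asyncio.Task currently fetching

    async def get_file(self, url):
        # 1. Check local cache
        if url in self._cache:
            return self._cache[url]
        # 2. Join a fetch that is already running, or 3. start one.
        task = self._inflight.get(url)
        if task is None:
            task = asyncio.create_task(self._fetch_and_store(url))
            self._inflight[url] = task
        return await task  # every caller receives the same result

    async def _fetch_and_store(self, url):
        try:
            data = await self._fetch(url)
            self._cache[url] = data
            return data
        finally:
            # Cleanup runs on success and on failure, so a failed fetch
            # does not block later retries.
            del self._inflight[url]


async def run_demo(edge_count=100):
    origin_hits = 0

    async def slow_origin(url):
        # Hypothetical origin: counts calls and simulates network delay.
        nonlocal origin_hits
        origin_hits += 1
        await asyncio.sleep(0.05)
        return f"bytes of {url}"

    tier = CollapsingCache(slow_origin)
    # Simulate `edge_count` Edge nodes missing on the same file at once.
    results = await asyncio.gather(
        *(tier.get_file("/viral.mp4") for _ in range(edge_count))
    )
    return origin_hits, results


if __name__ == "__main__":
    hits, results = asyncio.run(run_demo())
    print(hits, len(results))  # 1 100
```

Because the event loop only switches between requests at an `await`, the check-then-register step needs no lock here; a threaded server would need one.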
Adding a Regional Tier increases latency on the very first cache miss, because the request makes an extra hop (User -> Edge -> Regional -> Origin). In exchange, it drastically reduces the load, and therefore the cost, on the Origin server, keeping it online during traffic spikes.
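To put rough numbers on this trade-off, here is a small back-of-the-envelope model. The function name `first_miss_cost` and every latency figure are hypothetical, chosen only for illustration; real hop times depend on where the tiers sit, and the model assumes a single Regional Tier that collapses all simultaneous misses.

```python
def first_miss_cost(edge_misses, edge_to_origin_ms,
                    edge_to_regional_ms=None, regional_to_origin_ms=None):
    """Return (origin_fetches, first_miss_latency_ms) for one cold object.

    Without a regional hop, every Edge miss goes straight to Origin.
    With one, simultaneous misses collapse into a single Origin fetch,
    at the price of an extra network hop on that first request.
    """
    if edge_to_regional_ms is None:
        return edge_misses, edge_to_origin_ms
    return 1, edge_to_regional_ms + regional_to_origin_ms


# Hypothetical numbers, for illustration only.
flat = first_miss_cost(edge_misses=100, edge_to_origin_ms=80)
tiered = first_miss_cost(edge_misses=100, edge_to_origin_ms=80,
                         edge_to_regional_ms=15, regional_to_origin_ms=75)
print(flat)    # (100, 80)
print(tiered)  # (1, 90)
```

Under these assumed figures the first miss is 10 ms slower, while the Origin serves 1 request instead of 100.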
If your Origin responds with Vary: User-Agent, the CDN cannot collapse requests that carry different User-Agent values. It must treat an iPhone request and a Chrome request as completely different files, bypassing the hierarchy protections.
Query strings count toward the cache key, so /image.png?v=1 and /image.png?v=2 are cached separately. Ensure you strip irrelevant query parameters (like analytics UTM tags) before the cache key is computed.
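Both pitfalls come down to how the cache key is built. The sketch below is one possible way to compute it, not any specific CDN's algorithm. The helper names `normalize_url` and `cache_key` and the deny-list of tracking parameters are assumptions; sorting the remaining parameters also assumes your Origin does not care about their order.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Hypothetical deny-list; which parameters are irrelevant depends on your app.
TRACKING_PREFIXES = ("utm_",)
TRACKING_PARAMS = {"fbclid", "gclid"}


def normalize_url(url):
    """Drop tracking parameters and sort the rest so equivalent URLs match."""
    parts = urlsplit(url)
    kept = [
        (name, value)
        for name, value in parse_qsl(parts.query, keep_blank_values=True)
        if not name.startswith(TRACKING_PREFIXES) and name not in TRACKING_PARAMS
    ]
    kept.sort()
    return urlunsplit((parts.scheme, parts.netloc, parts.path, urlencode(kept), ""))


def cache_key(url, request_headers, vary=()):
    """Combine the normalized URL with the request headers named by Vary."""
    varied = tuple((name, request_headers.get(name, "")) for name in sorted(vary))
    return (normalize_url(url), varied)


iphone = {"User-Agent": "iPhone"}
chrome = {"User-Agent": "Chrome"}

# UTM tags no longer split the cache; the version parameter still does.
print(normalize_url("/image.png?v=1&utm_source=twitter"))  # /image.png?v=1
print(normalize_url("/image.png?v=1") == normalize_url("/image.png?v=2"))  # False

# Without Vary both clients share one entry; with it they cannot.
print(cache_key("/image.png?v=1", iphone) == cache_key("/image.png?v=1", chrome))  # True
print(cache_key("/image.png?v=1", iphone, vary=("User-Agent",))
      == cache_key("/image.png?v=1", chrome, vary=("User-Agent",)))  # False
```

Two requests can only be collapsed, or served from the same cached copy, when their keys are equal, which is why every extra header or parameter in the key weakens the protection the tier provides.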