CDN Cache Hierarchy

Why 100 cache misses don't have to mean 100 requests to your Origin.

The idea

A Content Delivery Network (CDN) can have thousands of "Edge" nodes spread across the world (e.g. one in Tokyo, one in London). If a viral video is requested in Tokyo and the Tokyo Edge node does not have it cached, that node fetches it from your Origin Server. But what if 100 different Edge nodes all get a cache miss at the same time? Without protection, your Origin would receive 100 requests for the same file. To shield the Origin, CDNs use a "Tiered" hierarchy, placing a large "Regional Edge Cache" between the local Edge nodes and your server.
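The effect of the extra tier can be sketched in a few lines of Python. This is a toy model, not how any particular CDN is built: the `Origin` and `CacheNode` classes and the `origin_hits` helper are hypothetical names invented for illustration, and the Edge nodes miss one after another rather than truly simultaneously.

```python
class Origin:
    """Stand-in for the Origin Server; counts how often it is asked."""

    def __init__(self):
        self.hits = 0

    def fetch(self, url):
        self.hits += 1
        return f"content of {url}"


class CacheNode:
    """A cache that falls back to its parent (another tier or the Origin) on a miss."""

    def __init__(self, parent):
        self.parent = parent
        self.store = {}

    def fetch(self, url):
        if url not in self.store:  # cache miss: ask the next tier up
            self.store[url] = self.parent.fetch(url)
        return self.store[url]


def origin_hits(num_edges, tiered):
    """Let every Edge node miss once and report how many requests reach the Origin."""
    origin = Origin()
    upstream = CacheNode(origin) if tiered else origin
    edges = [CacheNode(upstream) for _ in range(num_edges)]
    for edge in edges:
        edge.fetch("/video.mp4")
    return origin.hits


print(origin_hits(100, tiered=False))  # 100
print(origin_hits(100, tiered=True))   # 1
```

Because the misses here are sequential, the Regional cache is already warm by the time the second Edge node asks. Misses that arrive at the same instant need the extra request-collapsing step.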

Step 1: Three local Edge nodes get a cache miss for a new video.

How it works (Request Collapsing)

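When multiple Edge nodes experience a cache miss for the same file, they forward their requests to the Regional Tier. If the Regional Tier does not have the file either, it recognizes that the requests are all for the same URL, "collapses" them into a single request, and asks your Origin Server only ONCE. Every waiting Edge node then receives a copy of that one response.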

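# Inside the Regional Tier Cache Node
import threading
from concurrent.futures import Future

cache = {}               # url -> data (stand-in for the node's real cache)
inflight_requests = {}   # url -> Future for a fetch already in progress
lock = threading.Lock()  # guards cache and inflight_requests

def get_file(url):
    with lock:
        # 1. Check local cache
        if url in cache:
            return cache[url]

        # 2. REQUEST COLLAPSING
        # If we are already fetching this URL from Origin, just wait!
        future = inflight_requests.get(url)
        is_first = future is None
        if is_first:
            # 3. We are the first! Register a Future before releasing the lock.
            future = Future()
            inflight_requests[url] = future

    if not is_first:
        return future.result()  # Blocks until the first request finishes

    try:
        data = fetch_from_origin(url)  # Assumed helper: one request to Origin
        with lock:
            cache[url] = data
        future.set_result(data)        # Wake up all waiting Edge nodes
    except Exception as exc:
        future.set_exception(exc)      # Waiters must not hang if Origin fails
        raise
    finally:
        with lock:
            del inflight_requests[url]  # Cleanup, on success or failure

    return data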
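To watch the collapsing happen, here is a self-contained sketch that wraps the same idea in a class and fires 100 concurrent requests at it from threads. The names `CollapsingCache`, `slow_origin`, and `run_demo` are hypothetical, the Origin is simulated with a short sleep, and a real Regional node would handle network connections rather than threads.

```python
import threading
import time
from concurrent.futures import Future


class CollapsingCache:
    """Toy Regional Tier: at most one Origin fetch per URL at a time."""

    def __init__(self, fetch_from_origin):
        self._fetch = fetch_from_origin
        self._cache = {}
        self._inflight = {}
        self._lock = threading.Lock()

    def get_file(self, url):
        with self._lock:
            if url in self._cache:
                return self._cache[url]
            future = self._inflight.get(url)
            is_first = future is None
            if is_first:
                future = Future()
                self._inflight[url] = future
        if not is_first:
            return future.result()  # wait for the request already in flight
        try:
            data = self._fetch(url)
        except Exception as exc:
            with self._lock:
                del self._inflight[url]
            future.set_exception(exc)  # propagate the failure to waiters
            raise
        with self._lock:
            self._cache[url] = data
            del self._inflight[url]
        future.set_result(data)  # wake up everyone who was waiting
        return data


def run_demo(num_edges=100):
    origin_calls = []
    results = []

    def slow_origin(url):
        origin_calls.append(url)
        time.sleep(0.05)  # simulated Origin latency
        return f"content of {url}"

    regional = CollapsingCache(slow_origin)

    def edge_request():
        results.append(regional.get_file("/video.mp4"))

    threads = [threading.Thread(target=edge_request) for _ in range(num_edges)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return len(origin_calls), len(results)


print(run_demo())  # (1, 100)
```

All 100 callers get the file, but the simulated Origin is asked only once. Requests that arrive after the fetch completes are served straight from the cache.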

Cost

Adding a Regional Tier increases latency whenever a request misses at both tiers, because it takes an extra hop (User -> Edge -> Regional -> Origin instead of User -> Edge -> Origin). In exchange, it drastically reduces the load on the Origin server, which helps keep it online during traffic spikes.
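A quick back-of-the-envelope calculation makes the trade-off concrete. The round-trip times below are made-up numbers chosen only for illustration, and `miss_latency_ms` is a hypothetical helper. Real values depend on where the nodes and the Origin sit.

```python
# Hypothetical round-trip times in milliseconds (illustrative only).
EDGE_TO_ORIGIN_MS = 120
EDGE_TO_REGIONAL_MS = 20
REGIONAL_TO_ORIGIN_MS = 110


def miss_latency_ms(tiered, regional_hit=False):
    """Upstream latency an Edge node pays after its own cache misses."""
    if not tiered:
        return EDGE_TO_ORIGIN_MS
    if regional_hit:
        return EDGE_TO_REGIONAL_MS
    return EDGE_TO_REGIONAL_MS + REGIONAL_TO_ORIGIN_MS


print(miss_latency_ms(tiered=False))                    # 120
print(miss_latency_ms(tiered=True))                     # 130
print(miss_latency_ms(tiered=True, regional_hit=True))  # 20
```

Under these assumed numbers, the first request for a file pays a small penalty for the extra hop, while every later Edge miss that the Regional Tier can answer is served faster than a trip to the Origin would be.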

Watch out for