When a single expired cache item takes down your whole backend.
If you cache the homepage of a busy news site for 60 seconds, everything runs fast. But once those 60 seconds are up, the cached copy expires. If 5,000 users request the homepage at that exact moment, every one of them is a cache "miss", and all 5,000 requests are forwarded to the database to rebuild the same page. The database instantly melts. This is the Thundering Herd problem (also known as a cache stampede).
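To see the scale of the problem, here is a small, deterministic sketch of a naive cache. It is a toy model, not a benchmark: the function name `count_db_hits`, the `fetch_seconds` parameter, and the 0.5-second rebuild time are all assumptions made for illustration. Every request that arrives after expiry, and before the first rebuild has finished, is counted as a database hit.

```python
def count_db_hits(arrival_times, ttl=60.0, fetch_seconds=0.5):
    """Count database fetches for a naive cache that was filled at t=0.

    Hypothetical model: a miss starts a DB fetch that takes `fetch_seconds`,
    and the cache only becomes fresh again once a fetch has completed.
    """
    expires_at = ttl
    pending = []  # completion times of DB fetches still in flight
    db_hits = 0
    for t in sorted(arrival_times):
        # Apply any fetches that finished before this request arrived.
        done = [c for c in pending if c <= t]
        if done:
            expires_at = max(done) + ttl
            pending = [c for c in pending if c > t]
        # Naive behaviour: every request that sees an expired entry hits the DB.
        if t >= expires_at:
            db_hits += 1
            pending.append(t + fetch_seconds)
    return db_hits
```

In this model, `count_db_hits([60.0] * 5000)` counts 5,000 database hits, while the same 5,000 requests arriving at second 30 count none.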
To fix this, we allow the cache to serve slightly stale data while it updates in the background. The very first request after expiration triggers a background fetch, but still receives the stale copy. Every other request that arrives before the refresh finishes also receives the stale copy. No one is blocked, and the database gets exactly ONE request.
# The HTTP Header solution: Stale-While-Revalidate
# Cache-Control: max-age=60, stale-while-revalidate=120
def get_homepage(request_time):
    item = cache.get("homepage")
    # 0. Nothing cached yet (assuming cache.get returns None on a miss)? Must block and wait.
    if item is None:
        return fetch_from_database_synchronously()
    # 1. Fresh? Serve it. (0 - 60s)
    if request_time < item.expires_at:
        return item.data
    # 2. Stale, but within the revalidate window? (60s - 180s)
    if request_time < item.expires_at + 120:
        # Only the first request to see the stale item starts the refresh.
        if not item.is_updating_in_background:
            item.is_updating_in_background = True
            fire_async_background_task(update_cache)  # DB gets 1 request!
        # Immediately return the STALE data to the user! No waiting!
        return item.data
    # 3. Completely expired (> 180s). Must block and wait.
    return fetch_from_database_synchronously()
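The pseudocode above leaves the cache, the background task runner, and the updating flag undefined. Below is a minimal runnable sketch of the same idea, under stated assumptions: `SwrCache`, `Entry`, `fetch`, and `spawn` are hypothetical names invented here, the clock is passed in as `now` so the behaviour is easy to test, and a `threading.Lock` makes the check-and-set of the refresh flag atomic so that only one caller can start the refresh.

```python
import threading
from dataclasses import dataclass

MAX_AGE = 60        # seconds the entry is considered fresh
STALE_WINDOW = 120  # extra seconds during which stale data may be served


@dataclass
class Entry:
    data: str
    expires_at: float
    refreshing: bool = False


def run_in_thread(task):
    """Default background runner (an assumption): one daemon thread per refresh."""
    threading.Thread(target=task, daemon=True).start()


class SwrCache:
    def __init__(self, fetch, spawn=run_in_thread):
        self._fetch = fetch  # hypothetical: loads fresh data from the database
        self._spawn = spawn  # hypothetical: schedules a callable in the background
        self._lock = threading.Lock()
        self._entry = None

    def _store(self, now):
        data = self._fetch()  # the slow call runs outside the lock
        with self._lock:
            self._entry = Entry(data, now + MAX_AGE)
        return data

    def get(self, now):
        start_refresh = False
        with self._lock:
            entry = self._entry
            usable = entry is not None and now < entry.expires_at + STALE_WINDOW
            if usable and now >= entry.expires_at and not entry.refreshing:
                entry.refreshing = True  # only one caller can flip this flag
                start_refresh = True
        if not usable:
            return self._store(now)  # cold miss or too stale: block on the DB
        if start_refresh:
            self._spawn(lambda: self._store(now))
        return entry.data  # fresh or stale, returned without waiting
```

Two gaps are left open on purpose to keep the sketch short: the blocking path (cold miss or fully expired entry) can still stampede if many requests arrive together, and a failed background fetch would leave `refreshing` set until the entry ages out of the stale window.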
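The trade-off: users might see data that is out of date. On a busy site the refresh usually lands within a few seconds of expiry, but with the header above a response can be up to 180 seconds old in the worst case (60 seconds of max-age plus the 120-second stale window). You are trading strict consistency (everyone sees the absolute latest data instantly) for extreme availability and low latency.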