Read-through cache

Keep a hot copy of slow data nearby, so most reads never have to touch the database.

The idea

A database read is slow — disk seeks, network hops, query parsing. If the same few keys get read over and over, paying that cost every single time is wasteful. So we put a small, fast store (memory) in front of the database.

On every read we check the cache first. A hit returns instantly and never wakes the database. A miss takes the slow trip to the database, then fills the cache on the way back — so the next read of that key is a hit. As popular keys warm up, the slow path fires less and less, and average latency falls toward the cache's speed.

See it work

[Interactive demo: a CLIENT sends a stream of reads to the CACHE (~1ms), which falls back to the DATABASE (~50ms) on a miss.]

How it works

In a read-through cache the cache sits inline: the application asks the cache for a key, and the cache library (or loader) is responsible for fetching from the database on a miss and storing the result. The app never talks to the database directly — it just calls cache.get(key).

Contrast that with cache-aside (the more common DIY pattern): the application checks the cache itself, and on a miss it queries the database and writes the value back into the cache by hand. Same hit/miss/fill cycle — the difference is just who owns the fill logic.

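# read-through get(key): the cache owns the miss path
# `cache`, `db`, and `ttl` stand in for your cache client, your database
# client, and the entry lifetime; they are not defined in this snippet.
def get(key):
    value = cache.get(key)
    if value is not None:        # HIT: fast, DB untouched
        return value
    value = db.get(key)          # MISS: slow trip to the database
    cache.set(key, value, ttl)   # fill the cache for next time
    return value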
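
The snippet above leaves the cache and database as placeholders. A minimal runnable sketch of the same cycle follows, assuming an in-process dictionary as the cache and a plain function as the loader. The names `ReadThroughCache`, `load_from_db`, `fake_db`, and the `clock` parameter are hypothetical, chosen only for illustration, and the sketch ignores eviction and thread safety.

```python
import time
from typing import Any, Callable, Dict, Hashable, Tuple


class ReadThroughCache:
    """In-memory read-through cache: the cache calls the loader on a miss."""

    def __init__(self, loader: Callable[[Hashable], Any],
                 ttl_seconds: float = 60.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._loader = loader      # invoked only on a miss, e.g. a DB query
        self._ttl = ttl_seconds
        self._clock = clock        # injectable so expiry can be tested
        # key -> (value, expires_at)
        self._entries: Dict[Hashable, Tuple[Any, float]] = {}
        self.hits = 0
        self.misses = 0

    def get(self, key: Hashable) -> Any:
        entry = self._entries.get(key)
        if entry is not None and entry[1] > self._clock():
            self.hits += 1         # HIT: the loader is not called
            return entry[0]
        self.misses += 1           # MISS: key absent or entry expired
        value = self._loader(key)
        self._entries[key] = (value, self._clock() + self._ttl)
        return value


# Stand-in for the database, plus a log of every slow-path call.
fake_db = {"user:1": "Ada", "user:2": "Grace"}
db_calls = []


def load_from_db(key):
    db_calls.append(key)
    return fake_db[key]


cache = ReadThroughCache(load_from_db, ttl_seconds=30)
first = cache.get("user:1")    # miss: the loader runs and fills the cache
second = cache.get("user:1")   # hit: served from memory
print(first, second, cache.hits, cache.misses, db_calls)
```

The caller only ever touches `cache.get`; the loader is wired in once at construction, which is the property that separates read-through from cache-aside. Storing `(value, expires_at)` tuples also lets a legitimately cached `None` count as a hit, which the `is not None` check in the shorter snippet cannot do.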

Writes need a plan too. Write-through updates the cache and the database together on every write, so a value written this way is never stale in the cache (but every write pays the database cost). Write-back (also called write-behind) updates the cache immediately and flushes to the database asynchronously: writes are fast, but buffered data can be lost if the cache crashes before the flush. A TTL on each entry caps how stale a value can get: after it expires, the next read misses and re-fetches fresh data.
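
To make the two write policies concrete, here is a small sketch in which a plain dictionary stands in for the database. The class names `WriteThroughStore` and `WriteBehindStore` and the explicit `flush()` call are assumptions for illustration; a real write-behind cache would flush on a timer or from a background worker.

```python
class WriteThroughStore:
    """Every write goes to the database and the cache together."""

    def __init__(self, db: dict) -> None:
        self.db = db
        self.cache: dict = {}

    def put(self, key, value) -> None:
        self.db[key] = value       # pay the database cost on every write
        self.cache[key] = value    # cache and database now agree


class WriteBehindStore:
    """Writes land in the cache first and reach the database on flush()."""

    def __init__(self, db: dict) -> None:
        self.db = db
        self.cache: dict = {}
        self.dirty: dict = {}      # buffered writes the database has not seen

    def put(self, key, value) -> None:
        self.cache[key] = value    # fast: no database work here
        self.dirty[key] = value

    def flush(self) -> None:
        self.db.update(self.dirty)  # in practice this runs asynchronously
        self.dirty.clear()


through_db: dict = {}
through = WriteThroughStore(through_db)
through.put("k", 1)

behind_db: dict = {}
behind = WriteBehindStore(behind_db)
behind.put("k", 1)
lost_on_crash = dict(behind.dirty)   # what a crash right now would lose
behind.flush()

print(through_db, lost_on_crash, behind_db)
```

Between `put` and `flush` the write-behind database does not yet contain the write; that window is the crash risk described above.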

Cost / signals

Signal | Value | What it tells you
Hit latency | ~1ms | Served from memory; the database is never touched.
Miss latency | ~51ms | Cache probe + the slow DB fetch (~1 + ~50) + a tiny fill.
Hit rate | hits / reads | The key health signal. Higher means more reads stay fast.
Effective avg latency | h·1 + (1−h)·50 | In ms, weighted by hit rate h, with the miss rounded to 50ms; drops as h climbs.
Cold start | low h at first | An empty cache misses on every new key until it warms up.

With a 1ms hit and a 50ms miss, a 90% hit rate gives 0.9·1 + 0.1·50 = 5.9ms average — versus 50ms with no cache. The win is almost entirely about how often you hit.
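
The same arithmetic as a small helper, in case you want to try other hit rates. The function name `effective_latency_ms` is hypothetical; the defaults are the 1ms hit and 50ms miss used above.

```python
def effective_latency_ms(hit_rate: float,
                         hit_ms: float = 1.0,
                         miss_ms: float = 50.0) -> float:
    """Average latency per read, weighted by the hit rate."""
    if not 0.0 <= hit_rate <= 1.0:
        raise ValueError("hit_rate must be between 0 and 1")
    return hit_rate * hit_ms + (1.0 - hit_rate) * miss_ms


for h in (0.0, 0.5, 0.9, 0.99):
    print(f"h={h:.2f} -> {effective_latency_ms(h):.2f} ms")
```

Moving from a 90% to a 99% hit rate cuts the average from 5.9ms to about 1.5ms, which is why the last few points of hit rate matter so much.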

Watch out for

Worked example

Start with an empty cache and run the read stream A B A C A. Hit = 1ms, miss = 50ms.

Read | Result | Latency | Hit rate so far | Avg latency so far
A | miss → fill A | 50ms | 0 / 1 = 0% | 50.0ms
B | miss → fill B | 50ms | 0 / 2 = 0% | 50.0ms
A | hit | 1ms | 1 / 3 = 33% | 33.7ms
C | miss → fill C | 50ms | 1 / 4 = 25% | 37.8ms
A | hit | 1ms | 2 / 5 = 40% | 30.4ms

Three misses and two hits. Total time = 50+50+1+50+1 = 152ms, so the average is 152 / 5 = 30.4ms. Notice the average falling each time A repeats — every hit is a 50ms trip we got to skip. Keep replaying A and the average keeps sliding toward 1ms.
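
The walk-through can be replayed with a short simulation. This is a sketch under the same simplifications as the example: a fixed 1ms hit, a fixed 50ms miss, and no eviction or expiry. The function name `simulate` is hypothetical.

```python
def simulate(reads, hit_ms=1, miss_ms=50):
    """Replay a read stream against an initially empty cache.

    Returns one row per read:
    (key, result, latency_ms, hit_rate_so_far, avg_latency_so_far).
    """
    cached = set()     # keys currently in the cache (nothing is evicted)
    total_ms = 0
    hits = 0
    rows = []
    for count, key in enumerate(reads, start=1):
        if key in cached:
            hits += 1
            result, latency = "hit", hit_ms
        else:
            cached.add(key)            # fill on the way back from the DB
            result, latency = "miss", miss_ms
        total_ms += latency
        rows.append((key, result, latency, hits / count, total_ms / count))
    return rows


rows = simulate("ABACA")
for key, result, latency, hit_rate, avg in rows:
    print(f"{key}  {result:<4}  {latency:>2}ms  "
          f"hit rate {hit_rate:4.0%}  avg {avg:.1f}ms")
```

Passing a longer stream such as `"ABACA" + "A" * 50` shows the running average continuing to slide toward the hit latency.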

Check yourself

Out of 5 reads you got 3 hits and 2 misses (hit = 1ms, miss = 50ms). Roughly what's the average latency per read?

A single very hot key expires and, in the same instant, thousands of in-flight reads all miss. What's the failure mode, and the fix?