Read-through cache

Keep a hot copy of slow data nearby, so most reads never have to touch the database.

The idea

A database read is slow — disk seeks, network hops, query parsing. If the same few keys get read over and over, paying that cost every single time is wasteful. So we put a small, fast store (memory) in front of the database.

On every read we check the cache first. A hit returns instantly and never wakes the database. A miss takes the slow trip to the database, then fills the cache on the way back — so the next read of that key is a hit. As popular keys warm up, the slow path fires less and less, and average latency falls toward the cache's speed.

How it works

In a read-through cache the cache sits inline: the application asks the cache for a key, and the cache library (or loader) is responsible for fetching from the database on a miss and storing the result. The app never talks to the database directly — it just calls cache.get(key).

Contrast that with cache-aside (the more common DIY pattern): the application checks the cache itself, and on a miss it queries the database and writes the value back into the cache by hand. Same hit/miss/fill cycle — the difference is just who owns the fill logic.

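# read-through get(key): the cache owns the miss path
# `cache` and `db` stand in for your cache client and database client.
TTL_SECONDS = 60                 # illustrative TTL; tune per workload

def get(key):
    value = cache.get(key)
    if value is not None:        # HIT: fast, DB untouched
        return value
    value = db.get(key)          # MISS: slow trip to the database
    if value is not None:        # a missing key is not cached here
        cache.set(key, value, TTL_SECONDS)   # fill the cache for next time
    return value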
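
To make the ownership point concrete, here is a minimal in-memory sketch of a read-through cache in which the caller only ever calls get(). The class name ReadThroughCache, the loader callback, and the dict standing in for the database are illustrative assumptions, not a specific library's API.

```python
import time
from typing import Any, Callable, Dict, Hashable, Tuple


class ReadThroughCache:
    """In-memory read-through cache: callers only ever call get()."""

    def __init__(self, loader: Callable[[Hashable], Any], ttl_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._loader = loader      # hypothetical callback that reads the database
        self._ttl = ttl_seconds
        self._clock = clock        # injectable so expiry can be tested
        # key -> (value, expires_at)
        self._entries: Dict[Hashable, Tuple[Any, float]] = {}
        self.hits = 0
        self.misses = 0

    def get(self, key: Hashable) -> Any:
        entry = self._entries.get(key)
        if entry is not None and entry[1] > self._clock():
            self.hits += 1         # HIT: the loader is not called
            return entry[0]
        self.misses += 1           # MISS (absent or expired): load, then fill
        value = self._loader(key)
        self._entries[key] = (value, self._clock() + self._ttl)
        return value


# Usage: a dict stands in for the slow database.
fake_db = {"user:1": "Ada", "user:2": "Grace"}
db_calls = []

def load_from_db(key):
    db_calls.append(key)           # record every trip to the "database"
    return fake_db.get(key)

cache = ReadThroughCache(load_from_db, ttl_seconds=60)
first = cache.get("user:1")        # miss: calls the loader
second = cache.get("user:1")       # hit: served from memory
```

Because each entry is stored as a (value, expiry) pair, a key the loader reports as missing is also remembered until its TTL runs out. Whether that is desirable depends on the workload.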

Writes need a plan too. Write-through updates the cache and the database together on every write, so a successful write never leaves the cache behind the database (but every write pays the database cost, and writes that bypass the cache can still make it stale). Write-back (write-behind) updates the cache immediately and flushes to the database asynchronously: fast writes, but buffered data is lost if the cache crashes before the flush. A TTL on each entry caps how stale a value can get: after it expires, the next read misses and re-fetches fresh data.
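
The two write strategies can be sketched side by side. This is a simplified illustration: the class names are made up, a dict stands in for the database, and flush() is called by hand where a real write-behind cache would flush from a background task.

```python
from typing import Any, Dict, Hashable


class WriteThroughCache:
    """Every write goes to the database and the cache before put() returns."""

    def __init__(self, db: Dict[Hashable, Any]) -> None:
        self.db = db                       # a dict stands in for the database
        self.cache: Dict[Hashable, Any] = {}

    def put(self, key: Hashable, value: Any) -> None:
        self.db[key] = value               # pay the (slow) database write now
        self.cache[key] = value            # cache matches once put() returns


class WriteBehindCache:
    """Writes land in the cache at once; the database catches up on flush()."""

    def __init__(self, db: Dict[Hashable, Any]) -> None:
        self.db = db
        self.cache: Dict[Hashable, Any] = {}
        self.dirty: Dict[Hashable, Any] = {}   # key -> latest unflushed value

    def put(self, key: Hashable, value: Any) -> None:
        self.cache[key] = value
        self.dirty[key] = value            # buffered; lost if the process dies here

    def flush(self) -> int:
        """Write buffered entries to the database; return how many were flushed."""
        count = len(self.dirty)
        self.db.update(self.dirty)
        self.dirty.clear()
        return count


db_a: Dict[Hashable, Any] = {}
wt = WriteThroughCache(db_a)
wt.put("k", 1)                             # database and cache both updated

db_b: Dict[Hashable, Any] = {}
wb = WriteBehindCache(db_b)
wb.put("k", 1)
wb.put("k", 2)                             # overwrites the buffered value
in_db_before_flush = "k" in db_b           # database has not seen the write yet
flushed = wb.flush()                       # one key written, with the latest value
```

Note that the write-behind buffer collapses repeated writes to the same key, so the database sees only the latest value per flush.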

Cost / signals

Signal	Value	What it tells you
Hit latency	`~1ms`	Served from memory; the database is never touched.
Miss latency	`~51ms`	Cache probe + the slow DB fetch (~1 + ~50) + a tiny fill.
Hit rate	`hits / reads`	The key health signal. Higher means more reads stay fast.
Effective avg latency	`h·1 + (1−h)·50`	Weighted by hit rate `h`, with the miss rounded to 50ms; drops as `h` climbs.
Cold start	low `h` at first	An empty cache misses on every new key until it warms up.

With a 1ms hit and a 50ms miss, a 90% hit rate gives 0.9·1 + 0.1·50 = 5.9ms average — versus 50ms with no cache. The win is almost entirely about how often you hit.
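
The same arithmetic as a small helper, for trying other hit rates. The function name and the default latencies (1ms hit, 50ms miss, taken from the figures above) are illustrative.

```python
def effective_latency_ms(hit_rate: float, hit_ms: float = 1.0,
                         miss_ms: float = 50.0) -> float:
    """Average read latency weighted by hit rate: h*hit + (1-h)*miss."""
    if not 0.0 <= hit_rate <= 1.0:
        raise ValueError("hit_rate must be between 0 and 1")
    return hit_rate * hit_ms + (1.0 - hit_rate) * miss_ms


for h in (0.0, 0.5, 0.9, 0.99):
    print(f"h={h:.2f} -> {effective_latency_ms(h):.2f} ms")
# h=0.00 -> 50.00 ms
# h=0.50 -> 25.50 ms
# h=0.90 -> 5.90 ms
# h=0.99 -> 1.49 ms
```

Going from 90% to 99% hits cuts the average by roughly another factor of four, which is why the last few points of hit rate matter so much.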

Watch out for

Stale reads. When the underlying row changes, the cached copy is now wrong. You need a TTL so it eventually expires, or explicit invalidation that deletes/updates the cache entry on every write.
Cache stampede (thundering herd). When a hot key expires, many concurrent reads all miss at once and hammer the database together. Coalesce them with single-flight / request locking so only one request refills while the rest wait.
Cold start. Right after a deploy or cache flush, the cache is empty, so hit rate — and latency — are at their worst. Consider warming popular keys in advance.
Cache penetration. Reads for keys that don't exist miss every time and always fall through to the database. Cache the negative result (a short-TTL "not found" marker) so repeated lookups of a missing key don't keep hitting the database.
Write-path inconsistency. If writes update the database but forget the cache, readers keep seeing the old value. This is exactly why write-through or invalidation on write matters.
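
As one way to picture the single-flight fix for a stampede, the sketch below uses a per-key lock so that concurrent misses on the same key trigger a single load. It is a simplified, thread-based illustration: the class name SingleFlightCache and the slow_load function are made up, TTLs are left out, and the lock table is never pruned.

```python
import threading
import time
from typing import Any, Callable, Dict, Hashable

_MISSING = object()   # sentinel, so a cached None still counts as a hit


class SingleFlightCache:
    """Cache whose misses are coalesced: one loader call per key at a time."""

    def __init__(self, loader: Callable[[Hashable], Any]) -> None:
        self._loader = loader                    # hypothetical database fetch
        self._values: Dict[Hashable, Any] = {}
        self._key_locks: Dict[Hashable, threading.Lock] = {}
        self._guard = threading.Lock()           # protects _key_locks

    def _lock_for(self, key: Hashable) -> threading.Lock:
        with self._guard:
            return self._key_locks.setdefault(key, threading.Lock())

    def get(self, key: Hashable) -> Any:
        value = self._values.get(key, _MISSING)
        if value is not _MISSING:                # fast path: hit, no per-key lock
            return value
        with self._lock_for(key):                # one thread per key past this point
            value = self._values.get(key, _MISSING)
            if value is not _MISSING:            # filled while we were waiting
                return value
            value = self._loader(key)            # the only trip to the database
            self._values[key] = value
            return value


calls = []

def slow_load(key):
    calls.append(key)                            # count trips to the "database"
    time.sleep(0.05)                             # simulate a slow query
    return f"value-for-{key}"

cache = SingleFlightCache(slow_load)
results = []
threads = [threading.Thread(target=lambda: results.append(cache.get("hot")))
           for _ in range(20)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

All twenty readers get the value, but the loader runs once; the rest either wait on the key's lock or find the value already filled. Storing whatever the loader returns, including None, also acts as a crude negative cache, though a real one would give "not found" entries a short TTL.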

Worked example

Start with an empty cache and run the read stream A B A C A. Hit = 1ms, miss = 50ms.

Read	Result	Latency	Hit rate so far	Avg latency so far
`A`	miss → fill A	50ms	0 / 1 = 0%	50.0ms
`B`	miss → fill B	50ms	0 / 2 = 0%	50.0ms
`A`	hit	1ms	1 / 3 = 33%	33.7ms
`C`	miss → fill C	50ms	1 / 4 = 25%	37.8ms
`A`	hit	1ms	2 / 5 = 40%	30.4ms

Three misses and two hits. Total time = 50+50+1+50+1 = 152ms, so the average is 152 / 5 = 30.4ms. Notice the average falling each time A repeats — every hit is a 50ms trip we got to skip. Keep replaying A and the average keeps sliding toward 1ms.
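
The table can be reproduced with a short simulation. This sketch assumes an unbounded cache with no expiry and the same 1ms hit and 50ms miss; the function name replay is illustrative.

```python
from typing import Iterable, List, Tuple

HIT_MS = 1
MISS_MS = 50

Row = Tuple[str, str, int, float, float]


def replay(reads: Iterable[str], hit_ms: int = HIT_MS,
           miss_ms: int = MISS_MS) -> List[Row]:
    """Return one row per read: (key, result, latency, hit rate, running avg)."""
    cached = set()
    hits = 0
    total_ms = 0
    rows: List[Row] = []
    for n, key in enumerate(reads, start=1):
        if key in cached:
            result, latency = "hit", hit_ms
            hits += 1
        else:
            result, latency = "miss", miss_ms
            cached.add(key)                  # fill on the way back
        total_ms += latency
        rows.append((key, result, latency, hits / n, total_ms / n))
    return rows


rows = replay("ABACA")
for key, result, latency, hit_rate, avg in rows:
    print(f"{key}  {result:<4}  {latency:>2}ms  {hit_rate:>4.0%}  {avg:.1f}ms")
```

Passing a longer stream, such as "ABACA" followed by many more reads of A, shows the running average continuing to fall toward the hit latency.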

Check yourself

Out of 5 reads you got 3 hits and 2 misses (hit = 1ms, miss = 50ms). Roughly what's the average latency per read?

A single very hot key expires and, in the same instant, thousands of in-flight reads all miss. What's the failure mode, and the fix?