Cache eviction

A small shelf with room for only a few things — when something new arrives, something old has to leave. The policy is the rule for choosing who goes.

The idea

A cache is fast memory with a fixed capacity. It can hold, say, four entries. As long as a key you ask for is already there, you get a hit — instant. When you ask for a key that isn't there, that's a miss: you fetch it from the slow store and put it in the cache so next time is fast.

But the cache is small. When it's full and a miss brings in a new key, one existing entry must be evicted to make room. The eviction policy is the rule for picking the victim. LRU evicts the entry that hasn't been touched for the longest; LFU evicts the entry that has been touched the fewest times.

See it work

Choose a policy, then play through the access stream.

How it works

LRU (least recently used). Keep entries in recency order. Every get or put moves that key to the "most recently used" end. To evict, drop the entry at the "least recently used" end. The classic structure is a hash map + doubly linked list: the map gives O(1) lookup, and the list lets you unlink a node and re-stitch it at the head in O(1). Python's OrderedDict packages exactly this.

from collections import OrderedDict

class LRUCache:
    def __init__(self, capacity):
        self.cap = capacity
        self.data = OrderedDict()       # oldest .. newest

    def get(self, key):
        if key not in self.data:
            return -1
        self.data.move_to_end(key)      # mark most-recently-used
        return self.data[key]

    def put(self, key, value):
        if key in self.data:
            self.data.move_to_end(key)
        self.data[key] = value
        if len(self.data) > self.cap:
            self.data.popitem(last=False)  # evict least-recently-used

LFU (least frequently used). Track a hit count per key and evict the smallest count (break ties by recency). The naive version is O(n) to find the minimum, but the textbook O(1) design keeps frequency buckets: a map from each frequency to a doubly linked list of keys at that frequency, plus a running min_freq. A hit moves the key from bucket f to bucket f+1; eviction pops from the min_freq bucket — all O(1), at the cost of more bookkeeping.

Cost

Operation	Complexity	Note
`get` (LRU)	`O(1)`	Map lookup + move to MRU end
`put` (LRU)	`O(1)`	Insert + evict LRU node, both constant
Space (LRU)	`O(capacity)`	One map entry + one list node per key
`get` / `put` (LFU)	`O(1)`	With the frequency-bucket trick; naive is `O(n)` per evict
Space (LFU)	`O(capacity)`	Same keys, plus per-frequency bucket lists + a count per key

The metric that actually matters in production is hit rate — hits ÷ total accesses. A 1% drop in hit rate can multiply load on the backing store. The policy you pick is a bet about which past accesses predict future ones.

Watch out for

LRU thrashing on scans. A one-time sequential scan of cold keys evicts the entire hot working set, even though those scanned keys are never touched again. This is why scan-resistant designs exist — LRU-K (evict by the K-th-from-last access), 2Q, and ARC, which balance recency against frequency.
LFU cache pollution. A key that was hammered last week keeps a huge count and squats in the cache long after it went cold, blocking fresher keys. Real LFU needs aging / decay (e.g. halve all counts periodically, or window the counts) so old popularity fades.
Treating get as read-only. If a cache hit doesn't bump recency (or frequency), you don't actually have LRU (or LFU) — you've built FIFO by accident. The access itself must update the metadata.
Unbounded metadata. Per-key timestamps, counters, and list nodes are real memory. Track them only for resident keys; if you keep stats for evicted keys "just in case," the bookkeeping grows without bound and defeats the point of a fixed cache.
Cache stampede. When one very popular key expires, every request misses at once and slams the backing store together. Separate problem from eviction, but worth a guard (single-flight / request coalescing, or a soft TTL that one request refreshes while others serve stale).

Worked example

Capacity 4, stream A B C D A B E C ..., starting empty.

A B C D — four misses, each fills an empty slot. Cache is now full: [A B C D].
A — hit. No eviction; A becomes most-recently-used. Recency oldest→newest is now B C D A.
B — hit. B jumps to newest. Recency order: C D A B.
E — miss, cache full. Under LRU the least-recently-used entry is C, so we evict C and insert E. Cache: [E D A B].
C — miss again (we just evicted it). LRU now drops D.

Now contrast that E access under LFU. Counts after A B C D A B are A:2, B:2, C:1, D:1. E misses with the cache full, and LFU evicts the least frequent — a tie between C and D at count 1, broken by recency, so it evicts C here too. But had the stream warmed C instead of A, LFU would protect C and evict D while LRU would still drop whoever was touched longest ago. The policies diverge whenever frequency and recency disagree.

Check yourself

The cache holds [A, B, C, D]. Its recency order, oldest → newest, is C, A, D, B. A new key E arrives and misses. Under LRU, which entry is evicted?

Same cache, but now counts are A:5, B:1, C:1, D:1 (A was historically very hot, now cold). E misses. Under LFU with no decay, which entry is safe from eviction this round?