Question
Design a cache-invalidation system for a read-heavy service (100:1 read/write ratio) backed by a primary DB with a large distributed cache (Redis) in front, serving 1M reads/sec. The hard requirement: when a row changes in the DB, every cached copy of derived data that depends on it, anywhere in the fleet, must be invalidated quickly and reliably, so users never see permanently stale data after an update. Cache entries are derived or aggregated, so one DB row affects many cache keys. How do you propagate invalidations correctly without causing thundering-herd cache stampedes on hot keys?
Clarify scale and constraints first. Propose a clean component breakdown, then go deep on the hard parts — data model, bottlenecks, consistency, failure modes — and name the trade-offs you are making.