Code Room
System design · Hard · sd-g312
Subject: Load balancing
Level: Senior–Staff
~45 min
Common in: Databases & SQL · Networking & APIs · Algorithms & data structures interviews
Industries: Technology

Question

Design the load-balancing layer for a distributed cache fleet (say, 500 cache servers). You want request affinity, so that the same key lands on the same server and the cache hit rate is maximized, but you must avoid hot servers when a few keys go viral: plain consistent hashing pins a viral key to a single node and overloads it. Design a scheme that keeps affinity for the long tail, spreads load for hot keys, and handles servers joining or leaving with minimal key remapping. Explain how you detect and react to a node approaching saturation in real time.

What a strong answer looks like

Clarify scale and constraints first. Propose a clean component breakdown, then go deep on the hard parts — data model, bottlenecks, consistency, failure modes — and name the trade-offs you are making.
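As one illustration of going deep on the hot-key problem, a candidate might sketch consistent hashing with bounded loads: each key still maps to its usual position on the ring, but a request spills clockwise to the next server once the preferred one is at its cap. The sketch below is a minimal, single-process illustration under stated assumptions, not a reference design. The class name `BoundedLoadRing`, the `vnodes` and `load_factor` parameters, and the use of in-flight request counts as the load signal are all hypothetical choices made for this example.

```python
import bisect
import hashlib
import math


def _hash(value: str) -> int:
    """Stable 64-bit hash (the built-in hash() is salted per process)."""
    return int.from_bytes(hashlib.sha256(value.encode()).digest()[:8], "big")


class BoundedLoadRing:
    """Hypothetical sketch: consistent-hash ring with a per-server load cap."""

    def __init__(self, servers, vnodes=100, load_factor=1.25):
        # load_factor (assumed >= 1) bounds any server at roughly
        # load_factor times the fleet-wide average in-flight load.
        self.load_factor = load_factor
        self.load = {s: 0 for s in servers}  # in-flight requests per server
        # Each server owns `vnodes` points on the ring to smooth the key split.
        self._ring = sorted(
            (_hash(f"{s}#{i}"), s) for s in servers for i in range(vnodes)
        )
        self._points = [point for point, _ in self._ring]

    def _capacity(self) -> int:
        # Cap = ceil(load_factor * (in-flight total + this request) / servers).
        total = sum(self.load.values()) + 1
        return math.ceil(self.load_factor * total / len(self.load))

    def acquire(self, key: str) -> str:
        """Pick a server for `key` and count the request as in flight."""
        cap = self._capacity()
        start = bisect.bisect_right(self._points, _hash(key))
        # Walk clockwise from the key's position; take the first server
        # that is still below the cap. Cold keys stop at the first step.
        for step in range(len(self._ring)):
            server = self._ring[(start + step) % len(self._ring)][1]
            if self.load[server] < cap:
                self.load[server] += 1
                return server
        # Not reachable when load_factor >= 1: some server is below the cap.
        raise RuntimeError("no server below capacity")

    def release(self, server: str) -> None:
        """Mark one request on `server` as finished."""
        self.load[server] -= 1
```

With a lightly loaded fleet, repeated lookups for the same key return the same server, which preserves affinity for the long tail. When one key dominates traffic, its requests overflow onto the next servers clockwise, so the hot key ends up cached on a small, stable set of neighbours rather than on one saturated node. In a real fleet the load signal would have to come from something shared or reported (for example per-server in-flight counts or latency), which this sketch does not model.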
