System design · Hard · sd-g646

Subject: Load balancing
Level: Senior–Staff
~45 min
Common in Databases & SQL · Networking & APIs · Algorithms & data structures interviews
Industries: Technology

Question

Design a layer-7 (HTTP/2) load balancer that distributes 500k requests/sec across a backend pool of 2,000 stateful cache servers where requests for the same key should prefer the same backend to maximize cache hit rate. Targets: p99 added latency under 3ms, even load spread (no backend over ~1.2x the mean), and when backends are added/removed only a small fraction of keys should remap. Backends fail constantly at this scale, so detect a dead backend within a couple of seconds and stop sending it traffic without a stampede onto its neighbors. Walk through the balancing algorithm, the health-check model, and the main trade-off.
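The stated targets can be turned into per-backend numbers before any design work. The figures below are a back-of-the-envelope sketch that assumes uniform key popularity, an ideal hash function, and a consistent-hashing-style scheme; none of these assumptions is given in the question, and hot keys would skew every line.

```latex
\begin{align*}
\text{mean load per backend} &= \frac{500{,}000\ \text{req/s}}{2{,}000} = 250\ \text{req/s} \\
\text{load ceiling } (1.2 \times \text{mean}) &= 1.2 \times 250 = 300\ \text{req/s} \\
\text{keys remapped when one backend leaves} &\approx \frac{1}{2{,}000} = 0.05\% \\
\text{one neighbor absorbing a dead backend} &: 250 + 250 = 500\ \text{req/s} = 2 \times \text{mean}
\end{align*}
```

The last line is why the stampede constraint matters: if a dead backend's keys all fall onto a single successor, that successor lands at roughly twice the mean, well past the 1.2x ceiling, so the displaced keys need to be spread across many backends.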

What a strong answer looks like

Clarify scale and constraints first. Propose a clean component breakdown, then go deep on the hard parts — data model, bottlenecks, consistency, failure modes — and name the trade-offs you are making.
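One possible shape for the balancing core is consistent hashing with virtual nodes plus a per-backend load bound. The sketch below is illustrative only, not a reference answer: the class `BoundedLoadRing`, its methods (`pick`, `release`, `mark_down`, `mark_up`), and the parameters `vnodes` and `load_factor` are hypothetical names chosen for this example, and the health checker that would call `mark_down` is assumed to exist elsewhere.

```python
import bisect
import hashlib
import math
from collections import Counter


def _hash64(value: str) -> int:
    """Stable 64-bit hash; Python's built-in hash() is salted per process."""
    digest = hashlib.blake2b(value.encode("utf-8"), digest_size=8).digest()
    return int.from_bytes(digest, "big")


class BoundedLoadRing:
    """Hypothetical sketch: hash ring with virtual nodes and a load cap."""

    def __init__(self, backends, vnodes=100, load_factor=1.2):
        self.load_factor = load_factor      # assumed cap: 1.2x the mean load
        self.healthy = set(backends)
        self.in_flight = Counter()          # open requests per backend
        self.total_in_flight = 0
        # Each backend owns `vnodes` points, so its keys scatter over many
        # different successors instead of one neighbor.
        points = sorted(
            (_hash64(f"{b}#{i}"), b) for b in backends for i in range(vnodes)
        )
        self._points = points
        self._hashes = [h for h, _ in points]

    def mark_down(self, backend):
        """Called by an (assumed) health checker; ring points stay in place."""
        self.healthy.discard(backend)

    def mark_up(self, backend):
        self.healthy.add(backend)

    def _cap(self):
        # Cap is computed as if the incoming request were already admitted.
        total = self.total_in_flight + 1
        return math.ceil(self.load_factor * total / len(self.healthy))

    def pick(self, key: str) -> str:
        """Walk clockwise from the key's position to the first backend that
        is healthy and below the load cap."""
        if not self.healthy:
            raise RuntimeError("no healthy backends")
        cap = self._cap()
        start = bisect.bisect_left(self._hashes, _hash64(key))
        n = len(self._points)
        for step in range(n):
            backend = self._points[(start + step) % n][1]
            if backend in self.healthy and self.in_flight[backend] < cap:
                self.in_flight[backend] += 1
                self.total_in_flight += 1
                return backend
        raise RuntimeError("no backend below the load cap")

    def release(self, backend: str) -> None:
        """Call when the request completes."""
        self.in_flight[backend] -= 1
        self.total_in_flight -= 1
```

A caller would wrap each proxied request in `pick(key)` and `release(backend)`. Because a failed backend is only masked out rather than removed from the ring, keys owned by other backends keep their placement, and the failed backend's keys spill to whichever backends follow its many ring points. The trade-off to name is that the load cap deliberately sends some requests away from their preferred backend, giving up a little cache hit rate for a bounded load spread.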

