Code Room
System design (Medium)
Question
Design the load-balancing policy for a stateless API tier of ~80 servers where request cost is highly variable (some calls take 2 ms, some take 2 s) and servers occasionally degrade (GC pauses, a slow disk) without fully failing. Round-robin currently keeps sending traffic to limping servers and overloads them with expensive requests. Design the balancing algorithm, the health checking, and how you detect and eject a 'gray-failing' server.
What a strong answer looks like
Clarify scale and constraints first. Propose a clean component breakdown, then go deep on the hard parts — data model, bottlenecks, consistency, failure modes — and name the trade-offs you are making.
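As one illustration of going deep on a hard part, here is a minimal sketch of a possible balancing policy, not a reference answer. It assumes power-of-two-choices selection scored by in-flight count times smoothed latency, plus ejection of servers whose smoothed latency is far above the fleet median. All names (`Server`, `Balancer`, `eject_factor`, `max_ejected_frac`) and the default values are hypothetical. It also ignores per-endpoint cost: a real design would likely compare latency within a request class so that one 2 s call does not make a healthy server look slow.

```python
import random
import statistics
from dataclasses import dataclass


@dataclass
class Server:
    name: str
    in_flight: int = 0      # requests currently outstanding on this server
    ewma_ms: float = 0.0    # smoothed observed latency; 0.0 means no sample yet
    ejected: bool = False


class Balancer:
    """Hypothetical sketch: two random choices + latency-outlier ejection."""

    def __init__(self, servers, alpha=0.2, eject_factor=3.0,
                 max_ejected_frac=0.1, rng=None):
        self.servers = list(servers)
        self.alpha = alpha                        # EWMA smoothing weight (assumed)
        self.eject_factor = eject_factor          # "slow" = this many times the median
        self.max_ejected_frac = max_ejected_frac  # cap so we never eject the fleet
        self.rng = rng or random.Random()

    def _load(self, s):
        # Rough expected wait: (queue depth + this request) * smoothed latency.
        return (s.in_flight + 1) * max(s.ewma_ms, 1.0)

    def pick(self):
        healthy = [s for s in self.servers if not s.ejected]
        if not healthy:
            raise RuntimeError("no healthy servers")
        # Sample two candidates and take the less loaded one.
        candidates = self.rng.sample(healthy, min(2, len(healthy)))
        chosen = min(candidates, key=self._load)
        chosen.in_flight += 1
        return chosen

    def report(self, server, latency_ms):
        # Called when a request completes; updates passive health signals.
        server.in_flight -= 1
        if server.ewma_ms == 0.0:
            server.ewma_ms = latency_ms
        else:
            server.ewma_ms = (self.alpha * latency_ms
                              + (1 - self.alpha) * server.ewma_ms)

    def eject_outliers(self):
        # Compare each server with the fleet median, not a fixed threshold.
        healthy = [s for s in self.servers if not s.ejected and s.ewma_ms > 0.0]
        if len(healthy) < 3:
            return []
        median = statistics.median(s.ewma_ms for s in healthy)
        already = sum(s.ejected for s in self.servers)
        budget = int(len(self.servers) * self.max_ejected_frac) - already
        slow = sorted((s for s in healthy
                       if s.ewma_ms > self.eject_factor * median),
                      key=lambda s: s.ewma_ms, reverse=True)
        ejected = slow[:max(budget, 0)]
        for s in ejected:
            s.ejected = True
        return ejected
```

The trade-offs worth naming in an answer: the ejection cap trades detection speed for protection against a fleet-wide slowdown being misread as many individual failures, and the sketch leaves out re-admission, which would typically probe an ejected server with a trickle of traffic before restoring it.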