Code Room
System design · Medium
Question
Design dynamic service discovery + config distribution for a microservices fleet of ~3,000 instances across regions. When an instance starts it must register so callers can find it; when it dies or fails health checks it must be removed quickly so traffic stops; and operators must be able to push a config change (e.g., a feature flag or a traffic weight) and have all instances converge within seconds. How do you build the coordination layer and avoid it becoming a single point of failure or a thundering-herd hotspot?
What a strong answer looks like
Clarify scale and constraints first. Propose a clean component breakdown, then go deep on the hard parts — data model, bottlenecks, consistency, failure modes — and name the trade-offs you are making.
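As one way to make the data model and failure modes concrete, here is a minimal sketch, not a reference answer. It assumes a lease-based registry: instances renew a TTL lease with heartbeats and disappear when the lease lapses; config carries a version number so clients fetch only when something changed; and clients retry with jittered backoff so a restart or outage does not trigger a thundering herd. The `Registry` class, its method names, the 10-second TTL, and the 30-second backoff cap are all hypothetical, and the in-memory dict stands in for whatever replicated store a real design would use.

```python
import random
from dataclasses import dataclass, field


@dataclass
class Registry:
    """In-memory stand-in for the coordination store (hypothetical)."""
    ttl: float = 10.0                            # lease length in seconds (assumed value)
    leases: dict = field(default_factory=dict)   # instance_id -> (address, expiry time)
    config_version: int = 0
    config: dict = field(default_factory=dict)

    def heartbeat(self, instance_id: str, address: str, now: float) -> None:
        """Register or renew a lease; a missed renewal lets the entry expire."""
        self.leases[instance_id] = (address, now + self.ttl)

    def live(self, now: float) -> list:
        """Return addresses with an unexpired lease, dropping stale entries."""
        self.leases = {i: (a, e) for i, (a, e) in self.leases.items() if e > now}
        return sorted(a for a, _ in self.leases.values())

    def push_config(self, changes: dict) -> int:
        """Apply an operator change and bump the version clients compare against."""
        self.config.update(changes)
        self.config_version += 1
        return self.config_version

    def fetch_config(self, known_version: int):
        """Return (version, config) if newer than known_version, else None."""
        if known_version >= self.config_version:
            return None
        return self.config_version, dict(self.config)


def jittered_delay(base: float, attempt: int, cap: float = 30.0, rng=random) -> float:
    """Full-jitter exponential backoff: spreads retries so clients do not sync up."""
    return rng.uniform(0, min(cap, base * 2 ** attempt))


if __name__ == "__main__":
    reg = Registry(ttl=10.0)
    reg.heartbeat("a", "10.0.0.1:80", now=0.0)
    reg.heartbeat("b", "10.0.0.2:80", now=0.0)
    reg.heartbeat("a", "10.0.0.1:80", now=8.0)   # only "a" keeps renewing
    print(reg.live(now=12.0))                    # "b" lapsed at t=10
    print(reg.push_config({"new_checkout": True}))
    print(reg.fetch_config(known_version=0))
```

The sketch passes time in explicitly so expiry is easy to reason about. In a real answer the trade-off to name is the TTL itself: a shorter lease removes dead instances faster but multiplies heartbeat load across ~3,000 instances, and the single in-memory store here is exactly the single point of failure the question asks you to design away.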