Code Room
System design · Medium · sd-g460
Subject: Multi region
Level: Mid–Senior · ~40 min
Common in: Databases & SQL · Reliability & on-call · Distributed systems interviews
Industries: Technology, Software development

Question

Design the region strategy for a real-time messaging/presence service where each connected client holds a long-lived WebSocket to a gateway in its nearest region, and the gateway holds in-memory session state (subscriptions, presence, unacked messages). Two requirements collide: clients want the nearest gateway for low latency, but when a gateway or a whole region fails, clients must reconnect and resume their session (recent messages, presence) with minimal loss and within a couple of seconds, without every gateway synchronously replicating every keystroke globally. Design connection routing, where session state lives, and the reconnect/failover path. Be explicit about which state can be lost and which must survive.

What a strong answer looks like

Clarify scale and constraints first. Propose a clean component breakdown, then go deep on the hard parts (data model, bottlenecks, consistency, failure modes) and name the trade-offs you are making.
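
One way to make the "lost vs. must survive" split concrete is to assign each kind of session state a durability tier and to resume from a client-supplied sequence number. The sketch below is illustrative only, not a reference answer: the tier names, the `STATE_POLICY` mapping, the `MessageLog` class, and the `resume` function are all hypothetical, and the particular assignment (presence rebuilt, subscriptions kept in-region, unacked messages replicated asynchronously across regions) is one assumption among several defensible ones.

```python
from dataclasses import dataclass, field
from enum import Enum


class Durability(Enum):
    REBUILD = "rebuild on reconnect"     # may be lost; client or heartbeats restore it
    REGIONAL = "replicated in-region"    # survives a gateway crash, not a region loss
    CROSS_REGION = "async cross-region"  # survives a region loss, with bounded lag


# Assumed classification of the three kinds of state named in the question.
STATE_POLICY = {
    "presence": Durability.REBUILD,
    "subscriptions": Durability.REGIONAL,
    "unacked_messages": Durability.CROSS_REGION,
}


@dataclass
class MessageLog:
    """In-memory stand-in for a durable, ordered per-conversation log."""
    entries: list = field(default_factory=list)  # list of (seq, payload) tuples

    def append(self, payload: str) -> int:
        # Sequence numbers start at 1 and increase by one per message.
        seq = len(self.entries) + 1
        self.entries.append((seq, payload))
        return seq

    def since(self, last_acked_seq: int) -> list:
        # Everything strictly after the last sequence number the client acked.
        return [(s, p) for s, p in self.entries if s > last_acked_seq]


def resume(log: MessageLog, last_acked_seq: int) -> list:
    """Messages a new gateway would replay to a client that just reconnected."""
    return log.since(last_acked_seq)


if __name__ == "__main__":
    log = MessageLog()
    for text in ("a", "b", "c"):
        log.append(text)
    # The client acked seq 1 before its gateway failed, so 2 and 3 are replayed.
    print(resume(log, last_acked_seq=1))
```

Because the client carries its own last acked sequence number, the gateway it reconnects to needs no memory of the old connection. That keeps the reconnect path independent of the failed gateway, which is the property the question asks for.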
