Code Room
System design | Hard | sd-g208
Subject: Failover | Level: Senior–Staff | ~45 min | Common in Reliability & on-call interviews | Industries: Technology, Software development

Question

Design automatic failover for the primary database of an e-commerce order service with async cross-AZ replicas and one async cross-region replica. The business wants RPO near zero for orders (no lost paid orders) and RTO under 60 seconds, but cannot afford the latency of synchronous cross-region replication on every write. Reconcile these goals: how do you do failover, and what do you tell the business is actually achievable?

What a strong answer looks like

Clarify scale and constraints first. Propose a clean component breakdown, then go deep on the hard parts — data model, bottlenecks, consistency, failure modes — and name the trade-offs you are making.
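
One way to make the trade-off concrete is to put numbers on it. The sketch below is only an illustration under stated assumptions, not a reference design. The replica names, log positions, and per-phase timings are all hypothetical, and `pick_candidate` and `rto_seconds` are made-up helpers. It picks the most caught-up replica as the promotion candidate, reports how much of the primary's log that replica never received (the writes at risk under asynchronous replication), and sums the failover phases against the 60-second RTO target.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Replica:
    name: str
    same_region: bool   # True if the replica is in the primary's region
    replayed_lsn: int   # last log position applied (hypothetical unit)


def pick_candidate(replicas, primary_last_known_lsn):
    """Return the most caught-up replica and the log gap it is missing.

    Ties on log position are broken in favor of a same-region replica.
    """
    best = max(replicas, key=lambda r: (r.replayed_lsn, r.same_region))
    unreplicated = max(0, primary_last_known_lsn - best.replayed_lsn)
    return best, unreplicated


def rto_seconds(detect, fence, promote, reroute):
    """Sum the failover phases: detection, fencing, promotion, client reroute."""
    return detect + fence + promote + reroute


# Illustrative values only; none of these come from a real system.
replicas = [
    Replica("az-b", True, 10_000),
    Replica("az-c", True, 9_950),
    Replica("region-2", False, 9_200),
]
candidate, gap = pick_candidate(replicas, primary_last_known_lsn=10_040)
budget = rto_seconds(detect=15, fence=5, promote=20, reroute=10)

print(candidate.name, gap, budget)  # az-b 40 50
```

With these assumed numbers the timings fit inside the 60-second RTO, but the gap is non-zero. That is the point to raise with the business: with purely asynchronous replicas, RPO is bounded by replication lag rather than guaranteed to be zero, so any writes in that gap need a separate recovery or reconciliation path.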
