Code Room
System design | Hard | sd-g170
Subject: Anomaly detection | Level: Senior–Staff | ~45 min | Common in: Distributed systems interviews | Industries: Technology

Question

Design a system that, when a top-line metric (e.g. checkout success rate) drops, automatically localizes which slice is responsible: a specific region, app version, device type, payment provider, or a combination of these. The metric has 200 dimensions and millions of dimension-value combinations. Engineers currently slice dashboards by hand for 30 minutes during incidents. The system should surface the responsible slice within ~1 minute of the drop. Design the detection trigger, the dimensional search, and how you avoid spurious explanations.

What a strong answer looks like

Clarify scale and constraints first. Propose a clean component breakdown, then go deep on the hard parts — data model, bottlenecks, consistency, failure modes — and name the trade-offs you are making.
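As one illustration of going deep on the dimensional search, here is a minimal sketch, not a reference solution. It assumes per-slice (successes, total) counts are already aggregated for a baseline window and the current window. It scores each single-dimension slice by the fraction of the top-line drop that would disappear if that slice had kept its baseline success rate. The function name rank_slices, the min_total volume guard, and the count layout are all hypothetical. The sketch ignores traffic-mix shifts and multi-dimension combinations, which a full answer would need to address.

```python
from typing import Dict, List, Tuple

SliceKey = Tuple[str, str]                # (dimension, value), e.g. ("region", "eu")
Counts = Dict[SliceKey, Tuple[int, int]]  # slice -> (successes, total)


def rank_slices(
    baseline: Counts,
    current: Counts,
    baseline_overall: Tuple[int, int],
    current_overall: Tuple[int, int],
    min_total: int = 100,  # hypothetical guard against low-volume, spurious slices
) -> List[Tuple[SliceKey, float]]:
    """Rank slices by the fraction of the top-line rate drop they explain."""
    base_rate = baseline_overall[0] / baseline_overall[1]
    cur_rate = current_overall[0] / current_overall[1]
    drop = base_rate - cur_rate
    if drop <= 0:
        return []  # no drop, nothing to localize

    ranked = []
    for key, (cur_ok, cur_n) in current.items():
        base_ok, base_n = baseline.get(key, (0, 0))
        # Skip slices with too little traffic in either window.
        if cur_n < min_total or base_n < min_total:
            continue
        # Successes this slice would have had at its baseline rate.
        expected_ok = cur_n * base_ok / base_n
        # Top-line rate recovered if the slice had behaved as in the baseline.
        recovered = (expected_ok - cur_ok) / current_overall[1]
        ranked.append((key, recovered / drop))

    ranked.sort(key=lambda item: item[1], reverse=True)
    return ranked
```

A score near 1.0 means the slice alone accounts for almost all of the drop. Scores spread evenly across the values of a dimension (for example, every region explaining a proportional share) suggest that dimension is not the cause.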
