Code Room
System designHardsd-g593
Subject Model drift monitoringLevel Senior–Staff~50 minCommon in ML systems · Reliability & on-call interviewsIndustries Technology

Question

Design a model-monitoring and drift-detection system for an organization running hundreds of production models, where labels for many models arrive late or never. It must detect feature drift, prediction drift, and performance degradation, distinguish a real model problem from an upstream data outage, and alert the owning team with enough context to act, all without adding meaningful latency to the serving path. Cover what you log, the detection methods, and how you avoid alert fatigue.

What a strong answer looks like

Clarify scale and constraints first. Propose a clean component breakdown, then go deep on the hard parts — data model, bottlenecks, consistency, failure modes — and name the trade-offs you are making.

Narrate your design
Loading whiteboard…
Run or narrate your approach, then ask the coach.