Code Room
System design · Medium · sd-g737
Subject: Model drift · Level: Mid–Senior · ~35 min · Common in ML systems interviews · Industries: Technology

Question

Design an automated retraining system that keeps ~80 production models fresh as data drifts, without a human babysitting each one. The system should decide when to retrain a model (on a schedule, on drift signal, or on a data-volume trigger), retrain reproducibly, validate the candidate against the incumbent before promotion, and roll back if the new model underperforms. Models have different cadences (some daily, some monthly) and different label-availability delays. The goal is fresh models with a strong safety gate, so a bad automated retrain can never silently ship and degrade production.
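As a starting point for discussion, the trigger and gate logic described in the question could be sketched roughly as below. This is a minimal illustration, not a reference design: `ModelPolicy`, its fields, `retrain_reason`, `label_cutoff`, and `should_promote` are all hypothetical names, the drift score is assumed to be a single number where higher means more drift, and the evaluation metric is assumed to be "higher is better".

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional


@dataclass
class ModelPolicy:
    """Per-model retraining policy. All fields are illustrative assumptions."""
    max_age: timedelta        # schedule trigger: retrain once the model is this old
    drift_threshold: float    # drift trigger: retrain when the drift score reaches this
    min_new_rows: int         # data-volume trigger: retrain after this many new labeled rows
    label_delay: timedelta    # how long labels take to arrive for this model
    max_metric_drop: float    # promotion gate: tolerated drop versus the incumbent


def retrain_reason(
    policy: ModelPolicy,
    last_trained: datetime,
    now: datetime,
    drift_score: float,
    new_labeled_rows: int,
) -> Optional[str]:
    """Return which trigger fired, or None if no retrain is needed."""
    if now - last_trained >= policy.max_age:
        return "schedule"
    if drift_score >= policy.drift_threshold:
        return "drift"
    if new_labeled_rows >= policy.min_new_rows:
        return "volume"
    return None


def label_cutoff(policy: ModelPolicy, now: datetime) -> datetime:
    """Latest event time whose labels are assumed to have arrived."""
    return now - policy.label_delay


def should_promote(
    policy: ModelPolicy, incumbent_metric: float, candidate_metric: float
) -> bool:
    """Promote only if the candidate is within tolerance of the incumbent."""
    return candidate_metric >= incumbent_metric - policy.max_metric_drop
```

Keeping the policy per model lets daily and monthly models share one scheduler, and using `label_cutoff` to bound the training and evaluation windows avoids scoring a candidate on rows whose labels have not yet landed. A real gate would likely add more than one metric, a shadow or canary stage, and an automatic rollback path.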

What a strong answer looks like

Clarify scale and constraints first. Propose a clean component breakdown, then go deep on the hard parts — data model, bottlenecks, consistency, failure modes — and name the trade-offs you are making.
