Code Room
System design | Hard | sd-g585
Subject: Training pipeline | Level: Senior–Staff | ~55 min | Common in ML systems interviews | Industries: Technology

Question

Design an end-to-end training pipeline for a click-through-rate model that retrains daily on ~5B labeled impressions, produces a reproducible, validated model artifact, and feeds a daily-refreshed online predictor. Latency to production (data ready → model live) should be under 4 hours, and a bad model must never reach 100% of traffic. Cover data ingestion and labeling, orchestration, validation gates, and how offline training stays consistent with online serving.

What a strong answer looks like

Clarify scale and constraints first. Propose a clean component breakdown, then go deep on the hard parts — data model, bottlenecks, consistency, failure modes — and name the trade-offs you are making.
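
As one way to "clarify scale first", a quick back-of-envelope calculation turns the stated numbers into throughput targets. The sketch below takes only the 5B impressions and the 4-hour limit from the question; the 2-hour training share and the 1 KB row size are illustrative assumptions, not part of the prompt, and should be replaced with whatever the interviewer confirms.

```python
def required_throughput(rows: int, budget_hours: float) -> float:
    """Rows per second needed to process `rows` within `budget_hours`."""
    return rows / (budget_hours * 3600)


ROWS_PER_DAY = 5_000_000_000   # from the question: ~5B labeled impressions
END_TO_END_HOURS = 4.0         # from the question: data ready -> model live
TRAIN_BUDGET_HOURS = 2.0       # ASSUMPTION: share of the 4 hours left for training
BYTES_PER_ROW = 1_000          # ASSUMPTION: ~1 KB per serialized training example

# Lower bound: the rate needed if the whole 4 hours were spent reading data.
floor_rps = required_throughput(ROWS_PER_DAY, END_TO_END_HOURS)

# More realistic target: training must fit in its share of the budget,
# leaving the rest for ingestion, validation, and staged rollout.
train_rps = required_throughput(ROWS_PER_DAY, TRAIN_BUDGET_HOURS)

# Daily data volume under the assumed row size, in terabytes.
daily_tb = ROWS_PER_DAY * BYTES_PER_ROW / 1e12

print(f"floor: {floor_rps:,.0f} rows/s")
print(f"training target: {train_rps:,.0f} rows/s")
print(f"daily volume: {daily_tb:.1f} TB")
```

Numbers like these help justify later choices, for example whether a single pass over the data is affordable or whether training has to be sharded across workers.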
