On-callMediumoc-g661

Subject Model serving ab variant regressionLevel Mid–Senior~30 minCommon in ML systems interviewsIndustries Technology

Question

You're running a 50/50 online A/B between ranking model v-control and v-treatment on the homepage feed. At 11:00 the experiment guardrail dashboard escalates: the treatment arm's downstream 'session revenue per user' is down 9% versus control and the gap is statistically significant and widening over two hours, while top-line CTR on the feed is actually slightly UP in treatment. Both arms return 200s, latency is equal, no errors. Dashboards: treatment surfaces more clickbait-style low-value items high in the ranking; control's revenue-weighted positions are unchanged. How do you triage and decide?

What a strong answer looks like

Stop the bleeding first (mitigate), then form hypotheses from real signals. Separate root cause from symptom, communicate status as you go, and close with what prevents a repeat.

Learn the concepts

Diagram & narrate the incident

Loading whiteboard…

Run or narrate your approach, then ask the coach.