System design · Hard · sd-g370

Subject: A/B testing, ML
Level: Senior–Staff
~45 min
Common in: Distributed systems interviews
Industries: Technology, Software development

Question

Design a contextual-bandit system that chooses, per user impression, which of many promotional offers to show. A fixed A/B test wastes traffic on losing offers, and offers churn weekly, so you can never 'finish' an experiment. The reward (whether the user converted) is delayed by hours to days. Walk through the explore/exploit setup, how you serve a policy at low latency, and how delayed and biased rewards complicate learning compared to a clean A/B test.

What a strong answer looks like

Clarify scale and constraints first. Propose a clean component breakdown, then go deep on the hard parts — data model, bottlenecks, consistency, failure modes — and name the trade-offs you are making.
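
One way to make the explore/exploit and delayed-reward parts of an answer concrete is a small sketch like the one below. It is not a reference solution: it assumes an epsilon-greedy policy over a coarse, discrete context (a hypothetical `segment` string), a smoothed conversion rate per (segment, offer) pair, and an in-memory `pending` map standing in for whatever store would join impressions to late conversions. All class, function, and parameter names are illustrative. The logged propensity is what later allows an inverse-propensity-scored (IPS) estimate of a different policy, which is the usual correction for the selection bias a clean A/B test does not have.

```python
import random
from dataclasses import dataclass


@dataclass
class OfferStats:
    shows: float = 0.0
    conversions: float = 0.0

    def rate(self) -> float:
        # Smoothed conversion rate (Beta(1, 1) prior), so new offers start at 0.5.
        return (self.conversions + 1.0) / (self.shows + 2.0)


class EpsilonGreedyBandit:
    """Hypothetical epsilon-greedy policy keyed by (segment, offer)."""

    def __init__(self, offers, epsilon=0.1, seed=0):
        self.offers = list(offers)   # assumed to contain no duplicates
        self.epsilon = epsilon
        self.rng = random.Random(seed)
        self.stats = {}              # (segment, offer) -> OfferStats
        self.pending = {}            # impression_id -> (segment, offer, propensity)

    def _greedy(self, segment):
        # Ties resolve to the earliest offer in the list.
        return max(
            self.offers,
            key=lambda o: self.stats.get((segment, o), OfferStats()).rate(),
        )

    def choose(self, impression_id, segment):
        greedy = self._greedy(segment)
        if self.rng.random() < self.epsilon:
            offer = self.rng.choice(self.offers)   # explore uniformly
        else:
            offer = greedy                         # exploit
        # Marginal probability of showing this offer under the current policy.
        propensity = self.epsilon / len(self.offers)
        if offer == greedy:
            propensity += 1.0 - self.epsilon
        # Remember the decision until the (possibly late) outcome arrives.
        self.pending[impression_id] = (segment, offer, propensity)
        return offer, propensity

    def record_outcome(self, impression_id, converted):
        # The caller decides when a non-conversion is final, for example
        # after an attribution window has expired.
        entry = self.pending.pop(impression_id, None)
        if entry is None:
            return False                           # unknown or duplicate event
        segment, offer, _ = entry
        s = self.stats.setdefault((segment, offer), OfferStats())
        s.shows += 1.0
        s.conversions += 1.0 if converted else 0.0
        return True


def ips_value(logs, target_policy):
    """IPS estimate of target_policy's mean reward.

    logs: iterable of (segment, offer, propensity, reward) tuples.
    target_policy: function mapping a segment to an offer.
    """
    logs = list(logs)
    total = 0.0
    for segment, offer, propensity, reward in logs:
        if target_policy(segment) == offer:
            total += reward / propensity
    return total / len(logs)
```

In this sketch, `choose` would sit on the serving path and `record_outcome` would be driven by an offline job that joins conversion events to impressions. Counting an impression only once its outcome is known avoids treating not-yet-converted users as failures, at the cost of slower learning for new offers.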
