Code Room
System design · Hard
Question
Design an experimentation system to compare two search-ranking models quickly and cheaply, where a classic A/B test is too slow because the effect size per query is tiny and you'd need huge traffic to reach significance. The team ships ranker candidates several times a week and wants a fast, low-variance 'is B better than A' signal. Walk through how you'd use interleaving instead of (or before) a full A/B test, how you measure a winner, and how you guard against the failure modes that make interleaving lie.
What a strong answer looks like
Clarify scale and constraints first. Propose a clean component breakdown, then go deep on the hard parts — data model, bottlenecks, consistency, failure modes — and name the trade-offs you are making.
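As a starting point for the interleaving part of the question, here is a minimal sketch of team-draft interleaving (one common interleaving method, not the only one) with per-query click credit and an exact two-sided sign test to decide a winner. All function names are hypothetical, and the click-credit rule and the choice of a sign test are assumptions made for illustration. It does not address the failure modes the question asks about, such as position bias or bot traffic.

```python
import random
from math import comb


def _next_unused(ranking, used):
    """Return the highest-ranked doc in `ranking` not already shown, else None."""
    for doc in ranking:
        if doc not in used:
            return doc
    return None


def team_draft_interleave(ranking_a, ranking_b, k, rng):
    """Build one interleaved list of up to k docs.

    Returns (interleaved, team_of), where team_of maps each shown doc
    to the ranker ("A" or "B") that contributed it.
    """
    rankings = {"A": ranking_a, "B": ranking_b}
    picks = {"A": 0, "B": 0}
    interleaved, team_of = [], {}
    while len(interleaved) < k:
        # The team with fewer picks goes first; a coin flip breaks ties.
        if picks["A"] != picks["B"]:
            first = "A" if picks["A"] < picks["B"] else "B"
        else:
            first = rng.choice(["A", "B"])
        second = "B" if first == "A" else "A"
        for team in (first, second):
            doc = _next_unused(rankings[team], team_of)
            if doc is not None:
                interleaved.append(doc)
                team_of[doc] = team
                picks[team] += 1
                break
        else:
            break  # both rankings are exhausted
    return interleaved, team_of


def query_preference(team_of, clicked_docs):
    """Credit each click to the contributing team; return 'A', 'B', or 'tie'."""
    a = sum(1 for d in clicked_docs if team_of.get(d) == "A")
    b = sum(1 for d in clicked_docs if team_of.get(d) == "B")
    if a > b:
        return "A"
    if b > a:
        return "B"
    return "tie"


def sign_test_p_value(wins_a, wins_b):
    """Exact two-sided sign test on per-query wins (ties already dropped)."""
    n = wins_a + wins_b
    if n == 0:
        return 1.0
    k = min(wins_a, wins_b)
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed only to make the demo repeatable
    shown, team_of = team_draft_interleave(
        ["d1", "d2", "d3", "d4"], ["d3", "d1", "d5", "d6"], k=4, rng=rng
    )
    print(shown, team_of)
    print(query_preference(team_of, clicked_docs=shown[:1]))
    print(sign_test_p_value(wins_a=60, wins_b=40))
```

Because each user sees results from both rankers in the same session, every query acts as its own paired comparison, which is where the variance reduction relative to a split-traffic A/B test comes from. In a real system the per-query outcomes would be aggregated over many sessions before running the test, and a winner would typically still be confirmed with a full A/B test on the metrics that matter.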