System design · Hard · sd-g353

Subject: Feature store · Level: Senior–Staff · ~50 min · Common in ML systems · Code quality & review interviews · Industries: Technology, Software development

Question

Design the offline training-data generation layer of a feature store that must guarantee point-in-time correctness for a payment-fraud model. The training set is built from millions of labeled transactions, each with a decision timestamp; for every row you must join in feature values (e.g. 'merchant chargeback rate in trailing 30d', 'cardholder distinct-device count in trailing 7d') exactly as they would have been at that decision instant — never leaking a value computed after the label event. The same feature definitions also serve online at <10ms. Walk through how you store feature history, how the point-in-time join works at scale, and how you keep the offline and online code paths from drifting apart.
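To make "point-in-time correctness" concrete, here is a minimal single-machine sketch of the lookup the question describes. It is an illustration under stated assumptions, not a reference design: the function names (`build_history`, `point_in_time_lookup`), the integer timestamps, and the sample values are all hypothetical, and treating a feature value stamped exactly at the decision instant as visible is an assumption a real design would need to settle explicitly.

```python
from bisect import bisect_right
from collections import defaultdict


def build_history(feature_rows):
    """Index (entity_id, event_ts, value) rows as per-entity sorted timelines."""
    grouped = defaultdict(list)
    for entity_id, event_ts, value in feature_rows:
        grouped[entity_id].append((event_ts, value))
    history = {}
    for entity_id, rows in grouped.items():
        rows.sort(key=lambda row: row[0])
        # Parallel lists so bisect can search timestamps directly.
        history[entity_id] = ([ts for ts, _ in rows], [v for _, v in rows])
    return history


def point_in_time_lookup(history, entity_id, decision_ts):
    """Return the latest value with event_ts <= decision_ts, else None."""
    if entity_id not in history:
        return None
    timestamps, values = history[entity_id]
    # bisect_right gives the count of timestamps <= decision_ts
    # (inclusive boundary is an assumption of this sketch).
    idx = bisect_right(timestamps, decision_ts)
    return values[idx - 1] if idx > 0 else None


# Hypothetical feature history: merchant chargeback rate snapshots.
feature_rows = [
    ("m1", 100, 0.01),
    ("m1", 200, 0.05),
    ("m1", 300, 0.20),  # computed after tx1's decision; must not leak
    ("m2", 150, 0.02),
]
# Hypothetical labeled transactions: (tx_id, merchant_id, decision_ts).
labels = [("tx1", "m1", 250), ("tx2", "m1", 50), ("tx3", "m2", 150)]

history = build_history(feature_rows)
training_rows = [
    (tx_id, point_in_time_lookup(history, merchant_id, decision_ts))
    for tx_id, merchant_id, decision_ts in labels
]
print(training_rows)
```

Here `tx1` gets 0.05 rather than the later 0.20, and `tx2` gets `None` because no feature value existed yet at its decision time. At the scale in the question the same as-of semantics would typically be expressed as a partitioned, sorted join in a distributed engine rather than an in-memory index.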

What a strong answer looks like

Clarify scale and constraints first. Propose a clean component breakdown, then go deep on the hard parts — data model, bottlenecks, consistency, failure modes — and name the trade-offs you are making.
