Code Room
System design · Hard · sd-g283
Subject: Exactly-once delivery · Level: Senior–Staff · ~55 min · Common in: Distributed systems interviews · Industries: Technology

Question

Design a stateful stream-processing job that computes real-time per-merchant fraud-risk aggregates (e.g., count and sum of transactions per merchant over sliding 5-minute and 1-hour windows) from a transaction stream of 1M events/sec, emitting an alert when a window crosses a threshold. The aggregates must be correct under failures: a job restart must neither double-count nor lose events. The job must also handle late and out-of-order events (e.g., a transaction event arriving 20 s after its event time). How do you compute exactly-once windowed aggregates that survive crashes and late data?

What a strong answer looks like

Clarify scale and constraints first. Propose a clean component breakdown, then go deep on the hard parts — data model, bottlenecks, consistency, failure modes — and name the trade-offs you are making.
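
One way to make the "hard parts" concrete is a toy, single-process sketch of the state such a job keeps. It is not a reference answer, and every name and constant in it is a hypothetical chosen for illustration: `WindowedAggregator`, the 60 s pane width, and the 30 s allowed lateness (picked only so that it exceeds the 20 s delay in the question). It assumes events carry a unique id and an event timestamp, that the source is replayable by offset, and that the aggregate state and the source offset are checkpointed together as one atomic snapshot. State eviction, partitioning by merchant, and alert emission are left out.

```python
import copy

PANE_S = 60              # assumed pane width; sliding windows are sums of panes
ALLOWED_LATENESS_S = 30  # assumed bound, chosen to exceed the 20 s in the question


class WindowedAggregator:
    def __init__(self):
        self.panes = {}        # (merchant, pane_start) -> (count, total_cents)
        self.seen = set()      # event ids already applied (guards against producer retries)
        self.max_event_ts = 0  # highest event time observed so far
        self.next_offset = 0   # source offset to resume from after a restore

    @property
    def watermark(self):
        # Events with an event time below this are treated as too late.
        return self.max_event_ts - ALLOWED_LATENESS_S

    def process(self, offset, event_id, merchant, event_ts, cents):
        """Apply one event; return False if it is a duplicate or too late."""
        self.next_offset = offset + 1
        if event_id in self.seen or event_ts < self.watermark:
            return False
        self.seen.add(event_id)
        self.max_event_ts = max(self.max_event_ts, event_ts)
        key = (merchant, event_ts - event_ts % PANE_S)
        count, total = self.panes.get(key, (0, 0))
        self.panes[key] = (count + 1, total + cents)
        return True

    def window(self, merchant, end_ts, length_s):
        """(count, sum) over [end_ts - length_s, end_ts); end_ts must be pane-aligned."""
        count = total = 0
        for start in range(end_ts - length_s, end_ts, PANE_S):
            c, t = self.panes.get((merchant, start), (0, 0))
            count, total = count + c, total + t
        return count, total

    def snapshot(self):
        # Aggregates, dedup set, and source offset are captured together.
        return copy.deepcopy(self.__dict__)

    def restore(self, snap):
        self.__dict__ = copy.deepcopy(snap)


# (offset, event_id, merchant, event_ts, amount_cents)
events = [
    (0, "e1", "m1", 10, 500),
    (1, "e2", "m1", 70, 300),
    (2, "e3", "m1", 50, 200),   # 20 s out of order, still above the watermark
    (3, "e2", "m1", 70, 300),   # producer retry: same event id
    (4, "e4", "m1", 130, 100),
    (5, "e5", "m1", 20, 900),   # 110 s behind the newest event: too late
]

agg = WindowedAggregator()
for ev in events[:3]:
    agg.process(*ev)
checkpoint = agg.snapshot()     # checkpoint taken after offset 2
for ev in events[3:5]:
    agg.process(*ev)            # this work is lost in the simulated crash

agg = WindowedAggregator()      # restart: restore state, replay from saved offset
agg.restore(checkpoint)
for ev in events[agg.next_offset:]:
    agg.process(*ev)

result = agg.window("m1", 300, 300)
print(result)  # (4, 1100)
```

Because the offset is restored together with the aggregates, replaying offsets 3 and 4 after the crash neither skips nor double-counts them, and the dedup set absorbs the retried `e2`. In a real design the too-late event would more plausibly go to a side output or correction path than be dropped silently, and both `seen` and `panes` would need eviction once the watermark passes them.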
