Code Room
System design · Hard · sd-g426
Subject: Exactly-once delivery · Level: Senior–Staff · ~40 min · Common in: Networking & APIs, Distributed systems interviews · Industries: Technology, Software development

Question

Design a stateful stream-processing job that maintains exactly-once rolling 1-hour aggregates (count, sum, distinct users) over a firehose of 200k events/sec. The aggregates are served to customers and used to bill them, so an over-count means a refund and an under-count means lost revenue. The job runs on a cluster that experiences periodic node failures, autoscaling-driven rebalances, and weekly deploys. Walk through how state, checkpointing, and the output sink combine to guarantee that every event is counted exactly once, never twice and never dropped, across all of those disruptions.

What a strong answer looks like

Clarify scale and constraints first. Propose a clean component breakdown, then go deep on the hard parts — data model, bottlenecks, consistency, failure modes — and name the trade-offs you are making.
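One way to make the state, checkpoint, and sink interplay concrete is a toy simulation. The sketch below is illustrative only and rests on several assumptions that are not part of the question: a single in-memory, replayable log stands in for the firehose; a running aggregate stands in for the rolling 1-hour window; the checkpoint stores operator state and the source offset together in one atomic step; and the sink is an idempotent upsert keyed by checkpoint epoch. The names `Aggregate`, `Checkpoint`, `Sink`, and `run` are hypothetical.

```python
import copy
from dataclasses import dataclass, field


@dataclass
class Aggregate:
    """Running aggregate; a real job would keep one per window and key."""
    count: int = 0
    total: int = 0
    users: set = field(default_factory=set)

    def add(self, user, amount):
        self.count += 1
        self.total += amount
        self.users.add(user)


@dataclass
class Checkpoint:
    """State and source offset are snapshotted together (assumed atomic)."""
    epoch: int
    offset: int  # next log offset to read after a restore
    state: Aggregate


class Sink:
    """Idempotent sink: one row per epoch, so a repeated commit is harmless."""

    def __init__(self):
        self.rows = {}

    def commit(self, epoch, agg):
        self.rows[epoch] = (agg.count, agg.total, len(agg.users))


def run(log, checkpoint_every, crash_at=None):
    """Process the log, optionally crashing once just before `crash_at`."""
    sink = Sink()
    durable = Checkpoint(epoch=0, offset=0, state=Aggregate())
    crashed = False
    while True:
        # Restore: discard in-flight work and rewind to the last checkpoint.
        state = copy.deepcopy(durable.state)
        offset, epoch = durable.offset, durable.epoch
        if epoch > 0:
            # Re-commit the restored epoch; the upsert makes this a no-op
            # if the row was already written before the failure.
            sink.commit(epoch, state)
        try:
            while offset < len(log):
                if crash_at == offset and not crashed:
                    crashed = True
                    raise RuntimeError("simulated node failure")
                user, amount = log[offset]
                state.add(user, amount)
                offset += 1
                if offset % checkpoint_every == 0 or offset == len(log):
                    epoch += 1
                    durable = Checkpoint(epoch, offset, copy.deepcopy(state))
                    sink.commit(epoch, state)
            return sink.rows
        except RuntimeError:
            continue  # the next loop iteration restores from `durable`


if __name__ == "__main__":
    events = [(f"u{i % 4}", i) for i in range(12)]
    print(run(events, checkpoint_every=5))
    print(run(events, checkpoint_every=5, crash_at=7))
```

Both runs produce the same sink rows: events 5 and 6, processed after the first checkpoint and before the crash, are rolled back with the state and then replayed from offset 5, so they count once. In a candidate answer, the points this toy skips are the interesting ones: event-time windows and late data, per-key partitioned state during rebalances, savepoints across deploys, and a transactional or two-phase-commit sink when the downstream store cannot upsert by epoch.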
