Code Room
System design · Hard · sd-g421
Subject: Batch vs stream · Level: Senior–Staff · ~40 min · Common in: Distributed systems interviews · Industries: Technology

Question

Design the real-time analytics pipeline for a mobile app that aggregates per-minute active-user and event counts from clients that frequently go offline. Events carry a client-side event_time but can arrive minutes-to-hours late (subway rides, airplane mode, flaky networks); some arrive out of order within a single device. You must publish per-minute aggregates within ~10 seconds for a live ops dashboard, but the numbers must also eventually be correct when the stragglers land. How do you handle watermarks, lateness, and the inevitable correction?

What a strong answer looks like

Clarify scale and constraints first. Propose a clean component breakdown, then go deep on the hard parts — data model, bottlenecks, consistency, failure modes — and name the trade-offs you are making.
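
One way to make the watermark, lateness, and correction handling concrete is a small in-memory sketch of an event-time aggregator. It is only an illustration under stated assumptions: `MinuteAggregator`, the constants, and the `(window_start, active_users, event_count, revision)` output tuple are hypothetical names, the watermark is a simple "max event time seen minus a fixed delay" heuristic, and the 6-hour allowed lateness is an arbitrary choice. A real design would shard this state by key, persist it, and guard the watermark against devices with skewed clocks.

```python
from dataclasses import dataclass, field

WINDOW_S = 60                   # per-minute windows, as in the question
WATERMARK_DELAY_S = 10          # assumption: first publish ~10 s after a minute closes
ALLOWED_LATENESS_S = 6 * 3600   # assumption: keep window state for 6 h of stragglers


@dataclass
class Window:
    users: set = field(default_factory=set)
    events: int = 0
    revision: int = -1          # -1 means nothing has been published yet


class MinuteAggregator:
    """Hypothetical single-process sketch; event_time is integer epoch seconds."""

    def __init__(self):
        self.windows = {}       # window start -> Window
        self.seen_ids = set()   # event ids already counted (idempotent retries)
        self.max_event_time = 0
        self.too_late = []      # events past allowed lateness, left for a batch job

    @property
    def watermark(self):
        # Heuristic: assume no on-time event is older than this.
        return self.max_event_time - WATERMARK_DELAY_S

    def _snapshot(self, start):
        win = self.windows[start]
        return (start, len(win.users), win.events, win.revision)

    def _fire_closed_windows(self):
        out = []
        for start in sorted(self.windows):
            win = self.windows[start]
            end = start + WINDOW_S
            if win.revision < 0 and end <= self.watermark:
                win.revision = 0                      # first, provisional result
                out.append(self._snapshot(start))
            if end + ALLOWED_LATENESS_S <= self.watermark:
                del self.windows[start]               # state no longer correctable here
        return out

    def ingest(self, event_id, user_id, event_time):
        """Count one event and return any aggregates (first results or corrections)."""
        if event_id in self.seen_ids:
            return []                                 # duplicate delivery
        self.seen_ids.add(event_id)

        start = event_time - event_time % WINDOW_S
        if start + WINDOW_S + ALLOWED_LATENESS_S <= self.watermark:
            self.too_late.append((event_id, user_id, event_time))
            return []

        win = self.windows.setdefault(start, Window())
        win.users.add(user_id)
        win.events += 1

        out = []
        if win.revision >= 0:
            # Window was already published: emit a correction with a higher revision.
            win.revision += 1
            out.append(self._snapshot(start))

        self.max_event_time = max(self.max_event_time, event_time)
        out.extend(self._fire_closed_windows())
        return out
```

In this sketch the dashboard would upsert on `window_start` and keep the row with the highest `revision`, so a correction simply overwrites the provisional number. Events that land in `too_late` are not dropped silently; a periodic batch recompute over the raw event log is one plausible way to fold them in, which is where the batch-versus-stream trade-off shows up.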
