Code Room
System design | Hard | sd-g019
Subject: Event streaming | Level: Senior–Staff | ~50 min | Common in: Distributed systems interviews | Industries: Technology

Question

Design a real-time streaming-aggregation system that computes per-minute and per-hour metrics (request counts, p99 latency, error rates) from a 3M-events/sec ingestion firehose, powering live dashboards and alerting. Requirements: aggregates update within a few seconds, events can arrive seconds-to-minutes late or out of order, results must be correct (not double-counted) across worker restarts, and the system must handle a sudden 5x traffic spike. Walk through windowing, lateness, and correctness.
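
To make the three topics in the question concrete, here is a minimal single-process sketch of event-time tumbling windows with a watermark, an allowed-lateness cutoff, and replay-safe counting. It is an illustration under stated assumptions, not a reference design: the names `MinuteAggregator`, `WindowState`, and `ALLOWED_LATENESS_SEC`, the two-minute lateness budget, and the choice to deduplicate by a per-event ID are all hypothetical and not given in the question. p99 latency is left out because it would need a mergeable quantile sketch rather than a counter.

```python
from collections import defaultdict
from dataclasses import dataclass, field

WINDOW_SEC = 60              # per-minute tumbling windows (from the question)
ALLOWED_LATENESS_SEC = 120   # assumption: accept events up to 2 minutes late


@dataclass
class WindowState:
    count: int = 0
    errors: int = 0
    # Assumption: every event carries a unique ID, so replays can be detected.
    seen_ids: set = field(default_factory=set)


class MinuteAggregator:
    """Event-time aggregation keyed by window start (seconds since epoch)."""

    def __init__(self):
        self.windows = defaultdict(WindowState)
        self.watermark = 0      # highest event time observed so far
        self.dropped_late = 0   # events that arrived after their window closed

    def process(self, event_id, event_ts, is_error):
        # Advance the watermark; it never moves backwards.
        self.watermark = max(self.watermark, event_ts)

        # Assign the event to a window by event time, not arrival time.
        window_start = event_ts - event_ts % WINDOW_SEC
        window_end = window_start + WINDOW_SEC

        # Lateness: once the watermark passes window end plus the lateness
        # budget, the window is final and further events are only counted
        # as dropped.
        if window_end + ALLOWED_LATENESS_SEC <= self.watermark:
            self.dropped_late += 1
            return

        # Correctness: a replayed event (e.g. after a worker restart)
        # must not be counted twice.
        state = self.windows[window_start]
        if event_id in state.seen_ids:
            return
        state.seen_ids.add(event_id)
        state.count += 1
        state.errors += int(is_error)

    def closed_windows(self):
        """Window starts whose results are final and safe to alert on."""
        return sorted(
            start for start in self.windows
            if start + WINDOW_SEC + ALLOWED_LATENESS_SEC <= self.watermark
        )


agg = MinuteAggregator()
agg.process("a", 10, False)
agg.process("b", 70, True)
agg.process("a", 10, False)   # duplicate delivery: ignored
agg.process("c", 50, False)   # out of order but within lateness: counted
agg.process("d", 400, False)  # pushes the watermark far ahead
agg.process("e", 20, False)   # too late for window 0: dropped
print(agg.windows[0].count, agg.windows[60].errors, agg.dropped_late)
print(agg.closed_windows())
```

At the scale in the question, an unbounded in-memory ID set per window would not be practical. A real design would more likely bound deduplication state (for example with per-producer sequence numbers) and checkpoint window state together with source offsets, which is worth discussing as a trade-off.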

What a strong answer looks like

Clarify scale and constraints first. Propose a clean component breakdown, then go deep on the hard parts — data model, bottlenecks, consistency, failure modes — and name the trade-offs you are making.
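
As an example of clarifying scale first, the back-of-envelope arithmetic can be sketched as below. Only the 3M events/sec base rate and the 5x spike come from the question. The average event size and the per-partition throughput are placeholder assumptions that a candidate would confirm with the interviewer, and `capacity_estimate` is a hypothetical helper name.

```python
BASE_EVENTS_PER_SEC = 3_000_000        # from the question
SPIKE_FACTOR = 5                       # from the question
AVG_EVENT_BYTES = 200                  # assumption: confirm with interviewer
PARTITION_BYTES_PER_SEC = 10_000_000   # assumption: ~10 MB/s per log partition


def capacity_estimate(base_eps, spike, event_bytes, partition_bps):
    """Peak ingest rate and the partition count needed to absorb it."""
    peak_eps = base_eps * spike
    peak_bytes_per_sec = peak_eps * event_bytes
    # Integer ceiling division: round partitions up, never down.
    partitions = (peak_bytes_per_sec + partition_bps - 1) // partition_bps
    return {
        "peak_events_per_sec": peak_eps,
        "peak_bytes_per_sec": peak_bytes_per_sec,
        "min_partitions": partitions,
    }


estimate = capacity_estimate(
    BASE_EVENTS_PER_SEC, SPIKE_FACTOR, AVG_EVENT_BYTES, PARTITION_BYTES_PER_SEC
)
print(estimate)
```

Under these assumptions the spike means 15M events/sec and about 3 GB/s of ingest, or at least 300 partitions before any headroom. The exact numbers matter less than showing which inputs drive the partitioning and autoscaling decisions.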
