Code Room
System design | Hard | sd-g517
Subject: Event streaming | Level: Senior–Staff | ~45 min | Common in: Distributed systems interviews | Industries: Technology, Software development

Question

Design the partition-assignment and rebalancing protocol for a large consumer group on a partitioned event stream. The stream has 2,000 partitions and 200 consumer instances, and autoscaling causes consumers to join or leave every few minutes. Today, every join or leave triggers a "stop the world" rebalance in which all consumers pause and reshuffle, causing latency spikes and reprocessing. Make rebalances cheap and minimize partition movement while keeping the assignment balanced.

What a strong answer looks like

Clarify scale and constraints first. Propose a clean component breakdown, then go deep on the hard parts (data model, bottlenecks, consistency, failure modes) and name the trade-offs you are making.
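As one illustration of "going deep" on the data model, here is a minimal sketch of a sticky assignment function: it keeps every partition with its current owner where possible and moves only what is needed to stay balanced. This is one possible direction under stated assumptions, not the expected answer. The names `sticky_rebalance` and `moved` are hypothetical, and the sketch ignores the coordination side of the protocol (heartbeats, generations, revoking only the moved partitions while the rest keep processing).

```python
def sticky_rebalance(current, members, partitions):
    """Return a new {partition: owner} map that stays close to `current`.

    current    -- existing {partition: owner}; owners may have left the group
    members    -- live consumer ids (assumed non-empty)
    partitions -- all partition ids
    """
    members = sorted(members)
    base, extra = divmod(len(partitions), len(members))

    # Step 1: every live owner keeps what it already has.
    owned = {c: [] for c in members}
    unassigned = []
    for p in sorted(partitions):
        owner = current.get(p)
        if owner in owned:
            owned[owner].append(p)
        else:
            unassigned.append(p)  # owner departed, or partition is new

    # Step 2: balanced quotas (base or base + 1). The larger quotas go to
    # the members that already hold the most, so fewer partitions are revoked.
    by_load = sorted(members, key=lambda c: -len(owned[c]))
    quota = {c: base + (1 if i < extra else 0) for i, c in enumerate(by_load)}

    # Step 3: revoke only the surplus above each member's quota.
    for c in members:
        while len(owned[c]) > quota[c]:
            unassigned.append(owned[c].pop())

    # Step 4: hand the freed partitions to members below quota.
    for c in members:
        while len(owned[c]) < quota[c]:
            owned[c].append(unassigned.pop())

    return {p: c for c, ps in owned.items() for p in ps}


def moved(old, new):
    """Count partitions whose owner differs between two assignments."""
    return sum(1 for p, c in new.items() if old.get(p) != c)


# Scenario from the question: 2,000 partitions, 200 consumers, one joins.
partitions = list(range(2000))
members = [f"c{i:03d}" for i in range(200)]
before = sticky_rebalance({}, members, partitions)
after = sticky_rebalance(before, members + ["c200"], partitions)
print("partitions moved on join:", moved(before, after))
```

In this scenario a single join moves 9 of the 2,000 partitions, where a full reshuffle could move nearly all of them. A strong answer would pair a function like this with an incremental protocol so that only the moved partitions pause.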
