Code Room
System design · Hard · sd-g445
Subject: Leader election · Level: Senior–Staff · ~45 min · Common in: Distributed systems interviews · Industries: Technology

Question

A metrics-ingestion pipeline has a single 'compactor' role that merges newly written time-series segments into larger files in object storage. Running two compactors at once corrupts the index (segments are lost or overwritten). The compactor leader is elected via a lease in a coordination service. Last week a 25-second network partition isolated the leader from the coordination service, and a new leader was elected and began compacting. The old leader never crashed and never saw any signal that the partition had ended; it had been paused in GC, and when it resumed mid-compaction it wrote to object storage under the stale belief that it was still leader. Two compactors ran. Design the election and fencing so that this can never corrupt the index, even with arbitrary process pauses, clock skew, and failover of the coordination service itself.
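To make the failure concrete, the incident can be replayed as a small timeline. This is a minimal sketch, not the pipeline's actual code: `Lease`, `UnfencedStore`, the node names `A` and `B`, the object key, and the timestamps are all hypothetical. It shows only that a lease check made before a pause says nothing about leadership after the pause when the store accepts every write.

```python
from dataclasses import dataclass, field


@dataclass
class Lease:
    holder: str
    expires_at: float  # seconds, as seen by the coordination service


@dataclass
class UnfencedStore:
    """Hypothetical object store that accepts any write (no fencing)."""
    index: dict = field(default_factory=dict)
    writers: list = field(default_factory=list)

    def put(self, key, value, writer):
        self.index[key] = value
        self.writers.append(writer)


def simulate_incident():
    store = UnfencedStore()
    lease = Lease(holder="A", expires_at=10.0)

    # t=5: A checks its lease locally, finds it valid, and starts compacting.
    now = 5.0
    a_believes_leader = lease.holder == "A" and now < lease.expires_at

    # t=12: A is partitioned and paused; its lease has expired, B is elected
    # and commits its own compaction result.
    now = 12.0
    assert now >= lease.expires_at
    lease = Lease(holder="B", expires_at=22.0)
    store.put("index/block-0001", "merged-by-B", writer="B")

    # t=30: A resumes mid-compaction and acts on the check it made at t=5.
    now = 30.0
    if a_believes_leader:
        store.put("index/block-0001", "merged-by-A", writer="A")
    return store
```

In this replay the stale leader's write lands last and silently overwrites the new leader's result. Any fix that relies on the old leader re-checking its own clock or lease has the same gap between the check and the write, which is why the rejection has to happen on the storage side.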

What a strong answer looks like

Clarify scale and constraints first. Propose a clean component breakdown, then go deep on the hard parts — data model, bottlenecks, consistency, failure modes — and name the trade-offs you are making.
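One way to go deep on the consistency part is to sketch fencing tokens. The example below is an illustration under stated assumptions, not a reference answer: `Coordinator`, `FencedManifest`, and both exception types are hypothetical names. It assumes the coordination service hands out a strictly increasing epoch on every election (persisted so that it never goes backward across its own failover), and that the index manifest can be updated through an atomic compare-and-set, for example a conditional write if the object store offers one, or a small metadata service in front of it.

```python
import threading


class StaleEpochError(Exception):
    """Raised when a writer presents an epoch older than one already seen."""


class VersionConflictError(Exception):
    """Raised when the manifest changed since the writer last read it."""


class Coordinator:
    """Hypothetical stand-in for the coordination service."""

    def __init__(self):
        self._lock = threading.Lock()
        self._epoch = 0

    def elect(self, node_id):
        # Every election returns a strictly larger epoch (the fencing token).
        with self._lock:
            self._epoch += 1
            return self._epoch


class FencedManifest:
    """Hypothetical index manifest that enforces fencing at commit time."""

    def __init__(self):
        self._lock = threading.Lock()
        self.highest_epoch = 0
        self.version = 0
        self.segments = ()

    def commit(self, epoch, expected_version, segments):
        # The epoch check and the version check happen atomically with the write.
        with self._lock:
            if epoch < self.highest_epoch:
                raise StaleEpochError(f"epoch {epoch} < {self.highest_epoch}")
            if expected_version != self.version:
                raise VersionConflictError(
                    f"expected v{expected_version}, found v{self.version}"
                )
            self.highest_epoch = epoch
            self.version += 1
            self.segments = tuple(segments)
            return self.version


def demo():
    coord = Coordinator()
    manifest = FencedManifest()

    epoch_a = coord.elect("A")
    version = manifest.commit(epoch_a, 0, ["seg-1"])

    # B is elected during the partition and commits with the newer epoch.
    epoch_b = coord.elect("B")
    version = manifest.commit(epoch_b, version, ["seg-1", "seg-2"])

    # A resumes from its pause and tries to commit with its old epoch.
    try:
        manifest.commit(epoch_a, version, ["stale"])
        rejected = False
    except StaleEpochError:
        rejected = True
    return rejected, manifest.segments, manifest.highest_epoch
```

The design choice worth naming is that safety no longer depends on clocks or on the old leader noticing anything: compactors write merged files under fresh, unique keys and only become visible through the manifest commit, so a stale writer can at worst leave orphaned objects for garbage collection. A new leader would likely commit once with its epoch before starting work, so that older epochs are fenced from that point on; until it does, the version check still serializes the two writers.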
