Code Room
System design, Hard
Question
Design a distributed lock service that lets thousands of stateless app servers serialize access to per-resource critical sections (e.g. "only one worker rebuilds cache key K at a time"). Targets: ~50k lock acquisitions/sec, p99 acquire latency under 10ms, and locks that auto-release if a holder crashes. Correctness must also hold when a holder stalls in a GC pause or is cut off by a network partition for longer than its lock remains valid. Walk through the API, how a lock is granted and reclaimed, and how you stop two clients from believing they both hold the same lock.
What a strong answer looks like
Clarify scale and constraints first. Propose a clean component breakdown, then go deep on the hard parts (data model, bottlenecks, consistency, failure modes) and name the trade-offs you are making.
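As one illustration of the "two clients both believe they hold the lock" problem, here is a minimal single-process sketch of leases combined with fencing tokens. It is not a full answer: there is no replication, consensus, or networking, and every name in it (LockService, FencedStore, acquire, release, write, the ttl parameter) is hypothetical and chosen only for this example. The assumption is that each grant carries a strictly increasing token and that the protected resource rejects any write whose token is older than the newest one it has seen.

```python
import itertools
import time
from dataclasses import dataclass
from typing import Callable, Dict, Optional


@dataclass
class Lease:
    holder: str
    token: int         # fencing token, strictly increasing across grants
    expires_at: float  # service-side expiry time


class LockService:
    """Hypothetical in-memory lock table; a real service would replicate this state."""

    def __init__(self, clock: Callable[[], float] = time.monotonic):
        self._clock = clock
        self._leases: Dict[str, Lease] = {}
        self._next_token = itertools.count(1)

    def acquire(self, resource: str, holder: str, ttl: float) -> Optional[Lease]:
        now = self._clock()
        current = self._leases.get(resource)
        if current is not None and current.expires_at > now:
            return None  # someone else still holds an unexpired lease
        # Expired or absent: grant a new lease with a fresh, larger token.
        lease = Lease(holder, next(self._next_token), now + ttl)
        self._leases[resource] = lease
        return lease

    def release(self, resource: str, token: int) -> bool:
        current = self._leases.get(resource)
        if current is not None and current.token == token:
            del self._leases[resource]
            return True
        return False  # stale token: the lease was already reclaimed


class FencedStore:
    """Hypothetical protected resource that enforces fencing on every write."""

    def __init__(self) -> None:
        self._highest: Dict[str, int] = {}
        self.data: Dict[str, str] = {}

    def write(self, resource: str, token: int, value: str) -> bool:
        if token < self._highest.get(resource, 0):
            return False  # writer was fenced off by a newer lease
        self._highest[resource] = token
        self.data[resource] = value
        return True
```

In this sketch, a worker that pauses past its lease still thinks it holds the lock, but its write arrives with an older token than the new holder's and is rejected by the store. The design choice worth naming in an interview is that the lease alone only bounds how long a crashed holder blocks others; it is the check at the resource that protects correctness.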