System design | Hard | sd-g031

Subject: Compaction | Level: Senior–Staff | ~45 min | Common in: Distributed systems interviews | Industries: Technology

Question

An LSM-based key-value store backing a metadata service is suffering: disk usage is 2.3x the live dataset, read p99 has crept to 40ms, and deletes don't seem to free space for days. Workload is heavy overwrites of the same keys (config/state that updates frequently) plus a steady stream of deletes (TTL expiry), uniform key distribution, dataset ~3TB. Design a compaction strategy (and the surrounding knobs) to fix space amplification, restore read latency, and make deletes reclaim space promptly. Explain the trade-offs versus the alternative strategy.
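To put numbers on the space problem before choosing a strategy, here is a rough sketch of the arithmetic. It rests on a common textbook approximation rather than anything stated in the question: under leveled compaction with every level full, the last level holds roughly the live data and each level above it is `fanout` times smaller. The fanout of 10, the level count of 5, and the function name `leveled_space_amp` are illustrative assumptions, not properties of the store being described.

```python
# Back-of-the-envelope space amplification estimate.
# Assumptions (not from the question): leveled compaction, all levels full,
# last level ~= live dataset, each upper level is `fanout` times smaller.

def leveled_space_amp(fanout: int, levels: int) -> float:
    """Total bytes across all levels divided by bytes in the last level."""
    return sum(fanout ** -i for i in range(levels))

LIVE_TB = 3.0        # live dataset size, from the question
OBSERVED_AMP = 2.3   # disk usage relative to live data, from the question

on_disk_tb = LIVE_TB * OBSERVED_AMP                  # about 6.9 TB today
target_amp = leveled_space_amp(fanout=10, levels=5)  # about 1.11 under these assumptions
target_tb = LIVE_TB * target_amp
reclaimable_tb = on_disk_tb - target_tb

print(f"on disk now:      {on_disk_tb:.2f} TB")
print(f"idealized target: {target_tb:.2f} TB (amp {target_amp:.2f}x)")
print(f"reclaimable:      {reclaimable_tb:.2f} TB")
```

A real deployment would not hit the idealized figure, since levels are rarely all full and tombstones linger until they reach the bottom level, but the gap between 2.3x and roughly 1.1x gives a sense of how much space is at stake.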

What a strong answer looks like

Clarify scale and constraints first. Propose a clean component breakdown, then go deep on the hard parts — data model, bottlenecks, consistency, failure modes — and name the trade-offs you are making.

