System design · Hard · sd-g504

Subject: Blob store · Level: Senior–Staff · ~45 min · Common in: Databases & SQL, Storage & CDN, Distributed systems interviews · Industries: Technology, Software development

Question

Design the garbage-collection (orphan-reaping) subsystem for a large blob store in which the metadata layer (a database mapping logical keys to physical blob locations) and the data layer (the blobs themselves, spread across thousands of storage nodes) can drift out of sync. Orphans accumulate in several ways: a blob is written but its metadata commit fails, a metadata entry is deleted but the blob is not yet freed, or a multipart upload is abandoned mid-flight. At 50 PB you cannot afford to leak storage, but you must also NEVER delete a blob that is still referenced. Design a safe, scalable reaper.

What a strong answer looks like

Clarify scale and constraints first. Propose a clean component breakdown, then go deep on the hard parts (data model, bottlenecks, consistency, failure modes) and name the trade-offs you are making.
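
To make the "never delete a referenced blob" constraint concrete, here is a minimal in-memory sketch of one common pattern: mark, quarantine, then sweep. It is an illustration under stated assumptions, not a reference answer. The names `Blob`, `Reaper`, `GRACE_SECONDS`, and `QUARANTINE_SECONDS` are hypothetical, the two thresholds are arbitrary placeholders, and the `blobs` dict and `referenced` set stand in for a storage-node inventory and a live metadata lookup.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Set

# Assumed thresholds, chosen only for illustration.
GRACE_SECONDS = 24 * 3600           # longer than any in-flight upload or commit retry
QUARANTINE_SECONDS = 7 * 24 * 3600  # window in which a tombstoned blob can still be rescued


@dataclass
class Blob:
    blob_id: str
    written_at: float
    tombstoned_at: Optional[float] = None  # set when the blob is marked as a suspected orphan


@dataclass
class Reaper:
    blobs: Dict[str, Blob]   # stand-in for the data-layer inventory
    referenced: Set[str]     # stand-in for a live metadata lookup
    deleted: List[str] = field(default_factory=list)

    def mark(self, now: float) -> None:
        """Tombstone unreferenced blobs that are older than the grace period."""
        for blob in self.blobs.values():
            if blob.blob_id in self.referenced:
                blob.tombstoned_at = None  # a reference exists: cancel any pending delete
            elif blob.tombstoned_at is None and now - blob.written_at >= GRACE_SECONDS:
                blob.tombstoned_at = now

    def sweep(self, now: float) -> None:
        """Delete blobs whose quarantine has expired and that are still unreferenced."""
        for blob_id, blob in list(self.blobs.items()):
            if blob.tombstoned_at is None:
                continue
            if now - blob.tombstoned_at < QUARANTINE_SECONDS:
                continue
            if blob_id in self.referenced:  # re-check metadata just before deleting
                blob.tombstoned_at = None
                continue
            del self.blobs[blob_id]
            self.deleted.append(blob_id)
```

The grace period keeps the reaper away from blobs whose metadata commit may still be in flight, and the quarantine plus the final re-check gives a late-arriving reference a chance to cancel the delete. The sketch does not address the race between that final re-check and the physical delete; in a real system that gap would need something extra, such as a fencing token or generation number, which is one of the trade-offs worth naming in an answer.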
