Question
Design the garbage collection / orphan-reaping subsystem for a large blob store where the metadata layer (a database mapping logical keys -> physical blob locations) and the data layer (the actual blobs, stored across thousands of storage nodes) can drift out of sync. Orphans accumulate in several ways: a blob is written but its metadata commit fails, a metadata entry is deleted but the blob is not yet freed, or a multipart upload is abandoned mid-flight. At 50 PB you can't afford to leak storage, but you also must NEVER delete a blob that is still referenced. Design a safe, scalable reaper.
Clarify scale and constraints first. Propose a clean component breakdown, then go deep on the hard parts — data model, bottlenecks, consistency, failure modes — and name the trade-offs you are making.