Question
Design the Bloom-filter strategy for an LSM key-value store where reads frequently look up keys that don't exist (a cache-miss path that checks the store, plus a workload with many 'does this key exist?' probes). Without filters, every negative lookup has to read SSTables from disk at every level before it can conclude the key is absent. You have a fixed RAM budget for filters across a 10 TB dataset with ~5B keys, and read p99 is dominated by how many SSTables a lookup must touch. Decide bits-per-key and where filters live, and explain the trade-off you're tuning and the failure mode if you get it wrong.
Clarify scale and constraints first. Propose a clean component breakdown, then go deep on the hard parts — data model, bottlenecks, consistency, failure modes — and name the trade-offs you are making.
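
As a starting point for the sizing part of the question, here is a minimal back-of-the-envelope sketch. It assumes a standard Bloom filter with the optimal number of hash functions, for which the false-positive rate is approximately exp(-bits_per_key * ln(2)^2). The function names and the `runs_checked` parameter (how many SSTables or sorted runs a negative lookup consults) are hypothetical and chosen for illustration only. The 5B key count comes from the question.

```python
import math

NUM_KEYS = 5_000_000_000  # ~5B keys, taken from the question


def bloom_fpr(bits_per_key: float) -> float:
    """Approximate false-positive rate of a standard Bloom filter.

    Assumes the optimal hash count k = bits_per_key * ln 2,
    which gives FPR ~= 0.5 ** k = exp(-bits_per_key * ln(2)^2).
    """
    return math.exp(-bits_per_key * math.log(2) ** 2)


def filter_ram_bytes(num_keys: int, bits_per_key: float) -> float:
    """Total filter memory if every key gets the same bits-per-key."""
    return num_keys * bits_per_key / 8


def expected_wasted_reads(runs_checked: int, bits_per_key: float) -> float:
    """Expected disk reads caused by false positives on one negative lookup.

    runs_checked is an assumed number of SSTables/sorted runs whose
    filters the lookup consults; each one errs independently with bloom_fpr.
    """
    return runs_checked * bloom_fpr(bits_per_key)


if __name__ == "__main__":
    for bpk in (6, 8, 10, 12, 16):
        ram_gb = filter_ram_bytes(NUM_KEYS, bpk) / 1e9
        print(
            f"{bpk:>2} bits/key: ~{ram_gb:.2f} GB of filters, "
            f"FPR ~{bloom_fpr(bpk):.3%}, "
            f"~{expected_wasted_reads(20, bpk):.3f} wasted reads per miss "
            f"(assuming 20 runs)"
        )
```

Under these assumptions, 10 bits per key costs about 6.25 GB of RAM across 5B keys for a false-positive rate a little under 1%. Each additional bit per key multiplies the rate by roughly 0.62, so the budget buys diminishing but still exponential reductions in wasted reads. A uniform bits-per-key is only one option. The sketch does not model per-level allocation or filters that spill out of memory.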