Code Room
System design · Hard
Question
Design near-real-time indexing for a search system in which new and updated documents must become searchable within ~5 seconds of the write, while the cluster serves 50k queries/sec over 2B documents. Writes arrive at 40k docs/sec and include frequent updates and deletes, not just appends. Rebuilding the index on every change is far too slow. How do you make a write visible within seconds without degrading query latency or blowing up memory, and how do you handle updates and deletes when inverted-index postings are append-only?
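A quick back-of-envelope pass over the numbers in the question helps frame the discussion. The sketch below uses only the stated figures, plus two values that are purely hypothetical assumptions and not part of the question: a shard count (`NUM_SHARDS`) and a refresh interval (`REFRESH_INTERVAL_SEC`). Integer division is used, so results are rough.

```python
# Figures stated in the question.
WRITES_PER_SEC = 40_000
TOTAL_DOCS = 2_000_000_000
VISIBILITY_SLA_SEC = 5

# Hypothetical assumptions for illustration only; the question does not fix these.
NUM_SHARDS = 200
REFRESH_INTERVAL_SEC = 1


def capacity_estimate(num_shards=NUM_SHARDS, refresh_interval_sec=REFRESH_INTERVAL_SEC):
    """Return rough per-cluster and per-shard numbers derived from the inputs."""
    return {
        # Writes that arrive during one visibility window.
        "docs_in_flight_within_sla": WRITES_PER_SEC * VISIBILITY_SLA_SEC,
        # Write operations per day (86,400 seconds).
        "writes_per_day": WRITES_PER_SEC * 86_400,
        # Assumes documents and writes are spread evenly across shards.
        "docs_per_shard": TOTAL_DOCS // num_shards,
        "writes_per_shard_per_sec": WRITES_PER_SEC // num_shards,
        # Size of each small segment if every shard refreshes on the interval.
        "docs_per_new_segment": WRITES_PER_SEC * refresh_interval_sec // num_shards,
        "new_segments_per_shard_per_hour": 3600 // refresh_interval_sec,
    }


if __name__ == "__main__":
    for name, value in capacity_estimate().items():
        print(f"{name}: {value:,}")
```

Two things stand out under these assumptions. Daily write volume (about 3.46B operations) exceeds the 2B-document corpus, so if the corpus size is roughly steady, most writes must be updates or deletes rather than new documents. And a short refresh interval produces thousands of tiny segments per shard per hour, which is why any answer along these lines also needs a story for merging or compaction.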
What a strong answer looks like
Clarify scale and constraints first. Propose a clean component breakdown, then go deep on the hard parts (data model, bottlenecks, consistency, failure modes) and name the trade-offs you are making.
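As one possible way to go deep on the data model, here is a minimal single-process sketch of a common pattern for append-only postings: buffer writes in memory, periodically freeze the buffer into a small immutable segment, and handle updates and deletes by tombstoning the old copy and masking it at query time. It is an illustration under stated assumptions, not a reference answer. The names `Segment`, `NearRealTimeIndex`, and `refresh` are hypothetical, tokenization is a naive whitespace split, and sharding, durability, ranking, and segment merging are all left out.

```python
from collections import defaultdict


class Segment:
    """Immutable mini-index: postings are written once and never edited."""

    def __init__(self, docs):
        # docs: list of (external_id, text) pairs frozen at refresh time.
        self.ext_ids = [ext_id for ext_id, _ in docs]
        self.postings = defaultdict(list)  # term -> list of local doc ids
        for local_id, (_, text) in enumerate(docs):
            for term in set(text.lower().split()):
                self.postings[term].append(local_id)
        self.deleted = set()  # tombstones: local ids masked at query time

    def delete(self, ext_id):
        # Linear scan for brevity; a real system would keep an id lookup.
        for local_id, candidate in enumerate(self.ext_ids):
            if candidate == ext_id:
                self.deleted.add(local_id)

    def search(self, term):
        hits = self.postings.get(term, [])
        return [self.ext_ids[i] for i in hits if i not in self.deleted]


class NearRealTimeIndex:
    def __init__(self):
        self.buffer = {}              # ext_id -> text, not yet searchable
        self.pending_deletes = set()  # ids to tombstone at the next refresh
        self.segments = []

    def upsert(self, ext_id, text):
        self.buffer[ext_id] = text

    def delete(self, ext_id):
        self.buffer.pop(ext_id, None)
        self.pending_deletes.add(ext_id)

    def refresh(self):
        """Make buffered writes searchable; run this every second or so."""
        # An update is a delete of the old copy plus an append of the new one.
        stale = self.pending_deletes | set(self.buffer)
        for segment in self.segments:
            for ext_id in stale:
                segment.delete(ext_id)
        if self.buffer:
            self.segments.append(Segment(list(self.buffer.items())))
        self.buffer = {}
        self.pending_deletes = set()

    def search(self, term):
        hits = []
        for segment in self.segments:
            hits.extend(segment.search(term))
        return hits
```

A short usage example: after `idx.upsert("a", "red fox")` and `idx.refresh()`, a search for `"red"` finds `"a"`; after `idx.upsert("a", "blue fox")` and another `idx.refresh()`, the old copy is masked and only a search for `"blue"` finds it. The trade-off to name is that tombstoned entries still occupy space and cost query time until segments are merged, so merge policy becomes part of the latency and memory budget.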