Code Room
System design · Hard · sd-g343
Subject: ML system design
Level: Senior–Staff
~50 min
Common in: ML systems · Databases & SQL interviews
Industries: Technology

Question

Design the operational lifecycle of an ANN (approximate nearest neighbor) index for a semantic-retrieval service holding 500M item embeddings, serving 20k QPS at p99 < 30ms, where ~5M items are added/changed and ~2M removed every day. The hard part isn't the query — HNSW handles that — it's keeping the index correct and fresh: HNSW graphs don't support efficient deletes, full rebuilds take hours, and a stale index silently returns retired/deleted items. Design the update and rebuild strategy.
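To make the stale-result failure mode concrete, here is a minimal sketch of one common mitigation: soft deletes (tombstones) filtered at query time, with over-fetch. It is an illustration under stated assumptions, not a prescribed answer. The names `search_live`, `ann_search`, `tombstones`, and `overfetch` are hypothetical, and a brute-force scan stands in for the real HNSW index.

```python
import math

def search_live(ann_search, tombstones, query, k, overfetch=2.0, max_rounds=3):
    """Return up to k live (id, distance) hits, skipping tombstoned ids.

    ann_search(query, n) is a hypothetical callable that returns the n nearest
    (id, distance) pairs, nearest first. tombstones is a set of deleted ids.
    """
    n = math.ceil(k * overfetch)  # ask for extra results to absorb deleted hits
    live = []
    for _ in range(max_rounds):
        hits = ann_search(query, n)
        live = [(i, d) for i, d in hits if i not in tombstones]
        if len(live) >= k or len(hits) < n:
            break  # enough live results, or the index has nothing more to give
        n *= 2  # too many hits were deleted: widen the search and retry
    return live[:k]

def make_bruteforce(vectors):
    """Toy stand-in for an ANN index: exact search over a dict of id -> vector."""
    def ann_search(query, n):
        scored = sorted((math.dist(query, v), i) for i, v in vectors.items())
        return [(i, d) for d, i in scored[:n]]
    return ann_search

vectors = {1: (0.0, 0.0), 2: (0.1, 0.0), 3: (0.2, 0.0), 4: (5.0, 5.0)}
tombstones = {1, 2}  # items deleted upstream but still present in the graph
result = search_live(make_bruteforce(vectors), tombstones, (0.0, 0.0), k=2)
print([item_id for item_id, _ in result])  # ids 1 and 2 are filtered out
```

The trade-off to name: filtering keeps results correct, but each tombstone wastes search work and memory, so the over-fetch factor has to grow as deleted items accumulate. That growth is what eventually forces a compaction or rebuild.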

What a strong answer looks like

Clarify scale and constraints first. Propose a clean component breakdown, then go deep on the hard parts — data model, bottlenecks, consistency, failure modes — and name the trade-offs you are making.
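As a sketch of what clarifying scale might look like, the back-of-envelope below turns the numbers in the question into memory and churn estimates. The item count and daily churn come from the question. The embedding dimension (768), float32 storage, HNSW `M = 32`, and the 10% rebuild threshold are assumptions chosen only for illustration, and the link-memory figure is a rough layer-0 estimate rather than an exact footprint.

```python
ITEMS = 500_000_000            # from the question
UPSERTS_PER_DAY = 5_000_000    # from the question (added or changed)
DELETES_PER_DAY = 2_000_000    # from the question

DIM = 768                      # assumption: embedding dimension
BYTES_PER_FLOAT = 4            # assumption: float32, no quantization
HNSW_M = 32                    # assumption: graph degree parameter
REBUILD_THRESHOLD = 0.10       # assumption: rebuild once 10% of nodes are stale

# Raw vector storage before any compression.
vector_bytes = ITEMS * DIM * BYTES_PER_FLOAT

# Rough graph overhead: up to 2*M neighbours per node at layer 0, 4-byte ids.
link_bytes = ITEMS * 2 * HNSW_M * 4

# Upper bound on stale nodes per day: every delete, plus every upsert
# treated as a change that leaves the old vector behind.
stale_per_day = UPSERTS_PER_DAY + DELETES_PER_DAY
stale_fraction_per_day = stale_per_day / ITEMS
days_to_threshold = REBUILD_THRESHOLD / stale_fraction_per_day
writes_per_second = stale_per_day / 86_400

print(f"vectors: {vector_bytes / 1e12:.2f} TB")                 # about 1.54 TB
print(f"links:   {link_bytes / 1e9:.0f} GB")                    # 128 GB
print(f"stale:   {stale_fraction_per_day:.1%} per day")         # 1.4% per day
print(f"rebuild: every {days_to_threshold:.1f} days")           # about 7.1 days
print(f"writes:  {writes_per_second:.0f} per second average")   # about 81
```

Under these assumptions the index does not fit on one machine uncompressed, which motivates sharding or quantization, and the write rate is modest compared with 20k QPS of reads. The weekly rebuild cadence is only as good as the assumed threshold, so a strong answer states that threshold and explains what it costs in recall and latency.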
