Code Room
System design · Hard · sd-g356
Subject: Vector database · Level: Senior–Staff · ~50 min · Common in Databases & SQL interviews · Industries: Technology, Software development

Question

Design how a vector search service rebuilds its index without downtime. You host 2B 1024-dim vectors across a sharded HNSW index serving 8k QPS at 25ms p99. Periodically you must rebuild — because the embedding model was upgraded, because deletes have fragmented the graph and recall has degraded, or because you're re-sharding. The rebuild takes hours and is resource-heavy. Walk through how you rebuild a shard while it keeps serving, how you cut over without a recall cliff or a latency spike, and how you handle writes that arrive during the multi-hour rebuild.

What a strong answer looks like

Clarify scale and constraints first. Propose a clean component breakdown, then go deep on the hard parts — data model, bottlenecks, consistency, failure modes — and name the trade-offs you are making.
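
As one way to start the scale clarification, here is a back-of-envelope sizing sketch for the numbers in the question. The question fixes only the vector count and the dimension. The float32 storage, the HNSW link count (M = 16), the 32-bit neighbor ids, and the 64 GB of usable memory per shard are illustrative assumptions, not part of the question, and a real answer should confirm them with the interviewer.

```python
# Back-of-envelope sizing for the scenario in the question.
# Only NUM_VECTORS and DIM come from the question; everything else is an assumption.

NUM_VECTORS = 2_000_000_000   # from the question: 2B vectors
DIM = 1024                    # from the question: 1024 dimensions

BYTES_PER_FLOAT = 4           # assumption: float32, no quantization
HNSW_M = 16                   # assumption: a commonly used link count per node
BYTES_PER_LINK = 4            # assumption: 32-bit neighbor ids
SHARD_CAPACITY_BYTES = 64 * 10**9  # assumption: 64 GB usable memory per shard


def raw_vector_bytes(n: int, dim: int, bytes_per_float: int) -> int:
    """Memory for the raw vectors alone."""
    return n * dim * bytes_per_float


def graph_link_bytes(n: int, m: int, bytes_per_link: int) -> int:
    """Rough layer-0 link memory, assuming up to 2*M links per node.

    Upper layers are ignored because they hold a small fraction of nodes.
    """
    return n * 2 * m * bytes_per_link


def shards_needed(total_bytes: int, capacity_bytes: int) -> int:
    """Ceiling division: the minimum shard count at the assumed capacity."""
    return -(-total_bytes // capacity_bytes)


vector_bytes = raw_vector_bytes(NUM_VECTORS, DIM, BYTES_PER_FLOAT)
link_bytes = graph_link_bytes(NUM_VECTORS, HNSW_M, BYTES_PER_LINK)
total_bytes = vector_bytes + link_bytes
shard_count = shards_needed(total_bytes, SHARD_CAPACITY_BYTES)

print(f"raw vectors: {vector_bytes / 1e12:.3f} TB")
print(f"graph links: {link_bytes / 1e12:.3f} TB")
print(f"shards at assumed capacity: {shard_count}")
# A rebuilt shard lives alongside the serving copy until cutover, so each shard
# being rebuilt needs roughly twice its steady-state footprint for that window.
print(f"peak footprint if all shards rebuilt at once: {2 * total_bytes / 1e12:.3f} TB")
```

Under these assumptions the raw vectors dominate the footprint, which is why the rebuild is resource-heavy: the old and new copies of a shard have to coexist until cutover. Rebuilding a few shards at a time, rather than all of them, bounds that temporary overhead.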
