System design · Hard · sd-g092

Subject: Vector database · Level: Senior–Staff · ~50 min · Common in: Databases & SQL, Distributed systems interviews · Industries: Technology

Question

Design a vector search service that holds 5 billion 768-dim embeddings (image embeddings for a visual-search product), serving 10k QPS of nearest-neighbor queries at p99 < 50ms with recall@10 ≥ 0.95. The corpus grows by ~50M vectors/day and old vectors are occasionally deleted. Explain the index choice, how you shard across machines so a single query stays fast, the memory/cost math that drives the design, and how you handle continuous inserts and deletes without recall collapsing.
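The memory math mentioned in the question can be started from the stated numbers alone. The sketch below is a back-of-envelope calculation, not a recommended design: the corpus size, dimensionality, and daily growth come from the question, while `PQ_BYTES_PER_VECTOR` and `RAM_PER_NODE_BYTES` are hypothetical values chosen only for illustration. It also assumes float32 storage and ignores index overhead (graph links, centroids) and replication.

```python
# Back-of-envelope sizing using the numbers stated in the question.
NUM_VECTORS = 5_000_000_000      # from the question
DIM = 768                        # from the question
DAILY_INSERTS = 50_000_000       # from the question
BYTES_PER_FLOAT32 = 4            # assumption: embeddings stored as float32

# Hypothetical values, not given in the question:
PQ_BYTES_PER_VECTOR = 96             # assumed compressed code size per vector
RAM_PER_NODE_BYTES = 256 * 10**9     # assumed usable RAM per machine


def raw_bytes(num_vectors: int, dim: int = DIM) -> int:
    """Bytes needed to hold uncompressed float32 vectors."""
    return num_vectors * dim * BYTES_PER_FLOAT32


def nodes_needed(total_bytes: int, per_node: int = RAM_PER_NODE_BYTES) -> int:
    """Minimum machine count to hold total_bytes in RAM (ceiling division)."""
    return -(-total_bytes // per_node)


raw_total = raw_bytes(NUM_VECTORS)            # whole corpus, uncompressed
raw_daily = raw_bytes(DAILY_INSERTS)          # one day of growth, uncompressed
pq_total = NUM_VECTORS * PQ_BYTES_PER_VECTOR  # whole corpus, compressed codes

print(f"raw corpus:        {raw_total / 1e12:.2f} TB")
print(f"raw daily growth:  {raw_daily / 1e9:.1f} GB/day")
print(f"compressed corpus: {pq_total / 1e9:.0f} GB")
print(f"nodes (raw, no replicas):        {nodes_needed(raw_total)}")
print(f"nodes (compressed, no replicas): {nodes_needed(pq_total)}")
```

Under these assumptions the uncompressed vectors alone are about 15 TB, which is why the question asks for the cost math to drive the index and sharding choices. The real node count would also depend on replication for the 10k QPS target, index overhead, and whether full-precision vectors are kept for re-ranking.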

What a strong answer looks like

Clarify scale and constraints first. Propose a clean component breakdown, then go deep on the hard parts — data model, bottlenecks, consistency, failure modes — and name the trade-offs you are making.

