System design | Medium | sd-g586

Subject: Embedding retrieval | Level: Mid–Senior | ~45 min | Common in ML systems · Databases & SQL interviews | Industries: Technology

Question

Design a vector-search service backing a semantic product/document retrieval system: 200M embeddings (768-dim), 10k QPS of top-50 nearest-neighbor queries with a 50ms p99 budget, frequent inserts/deletes as the corpus changes, and a requirement to filter by structured metadata (category, language, in-stock) at query time. Cover the index choice, sharding, how you handle updates and metadata filtering, and the recall/latency trade-off.
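
To make the stated numbers concrete, here is a rough back-of-envelope sizing sketch. Only the corpus size, dimensionality, and QPS come from the question; the float32 storage, the 96-byte compressed code size, the 64 GB per-node budget, and the 500 QPS per replica are illustrative assumptions, not measured figures.

```python
import math

# Given in the question
NUM_VECTORS = 200_000_000
DIM = 768
TARGET_QPS = 10_000

# Assumptions (hypothetical, for illustration only)
BYTES_PER_FLOAT32 = 4        # embeddings stored uncompressed as float32
PQ_BYTES_PER_VECTOR = 96     # compressed code size if product quantization is used
NODE_RAM_BUDGET_GB = 64      # RAM per node reserved for vector data
QPS_PER_REPLICA = 500        # throughput one shard replica can sustain within budget


def to_gb(num_bytes: int) -> float:
    """Convert bytes to decimal gigabytes."""
    return num_bytes / 1e9


# Raw vectors: 200M * 768 * 4 bytes = 614.4 GB, too large for a single node.
raw_gb = to_gb(NUM_VECTORS * DIM * BYTES_PER_FLOAT32)

# Compressed codes: 200M * 96 bytes = 19.2 GB, at the cost of some recall.
pq_gb = to_gb(NUM_VECTORS * PQ_BYTES_PER_VECTOR)

# Shards needed so each shard's vector data fits the per-node budget.
shards_raw = math.ceil(raw_gb / NODE_RAM_BUDGET_GB)
shards_pq = math.ceil(pq_gb / NODE_RAM_BUDGET_GB)

# With scatter-gather, every query touches every shard, so each shard
# must absorb the full query rate and is replicated to cover it.
replicas_per_shard = math.ceil(TARGET_QPS / QPS_PER_REPLICA)
total_nodes = shards_raw * replicas_per_shard

print(f"raw float32: {raw_gb:.1f} GB -> {shards_raw} shards")
print(f"compressed:  {pq_gb:.1f} GB -> {shards_pq} shard(s)")
print(f"replicas per shard: {replicas_per_shard}, nodes (raw): {total_nodes}")
```

Under these assumptions, the uncompressed corpus forces sharding on memory alone, while compression shifts the bottleneck from memory to query throughput. Graph-based indexes also add per-vector link overhead that this estimate leaves out.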

What a strong answer looks like

Clarify scale and constraints first. Propose a clean component breakdown, then go deep on the hard parts — data model, bottlenecks, consistency, failure modes — and name the trade-offs you are making.
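
As a way to talk through the hard parts, the toy sketch below shows the moving pieces the question asks about: hash sharding, soft deletes via tombstones, metadata filtering applied before scoring, and a scatter-gather merge of per-shard top-k results. It uses exact brute-force search in place of a real approximate index, and every name in it (`Shard`, `shard_for`, `scatter_gather`, `recall_at_k`) is hypothetical. It is a sketch for discussion, not a production design.

```python
import heapq
from dataclasses import dataclass, field


@dataclass
class Shard:
    vectors: dict = field(default_factory=dict)    # doc_id -> embedding
    metadata: dict = field(default_factory=dict)   # doc_id -> structured fields
    tombstones: set = field(default_factory=set)   # soft-deleted ids awaiting compaction

    def upsert(self, doc_id, vector, meta):
        self.vectors[doc_id] = vector
        self.metadata[doc_id] = meta
        self.tombstones.discard(doc_id)

    def delete(self, doc_id):
        # Soft delete: mark the id and skip it at query time.
        if doc_id in self.vectors:
            self.tombstones.add(doc_id)

    def search(self, query, k, filters):
        hits = []
        for doc_id, vec in self.vectors.items():
            if doc_id in self.tombstones:
                continue
            meta = self.metadata[doc_id]
            # Pre-filter: drop non-matching documents before scoring.
            if any(meta.get(key) != want for key, want in filters.items()):
                continue
            score = sum(q * v for q, v in zip(query, vec))  # inner product
            hits.append((score, doc_id))
        return heapq.nlargest(k, hits)


def shard_for(doc_id, num_shards):
    """Assumes integer ids; a real system would hash an opaque key."""
    return doc_id % num_shards


def scatter_gather(shards, query, k, filters):
    # Each shard returns its local top-k; the coordinator merges them.
    partials = [hit for shard in shards for hit in shard.search(query, k, filters)]
    return heapq.nlargest(k, partials)


def recall_at_k(approx_ids, exact_ids):
    """Fraction of the exact top-k that an approximate result recovered."""
    exact = set(exact_ids)
    return len(exact & set(approx_ids)) / len(exact) if exact else 1.0


# Tiny demo corpus spread over two shards.
shards = [Shard(), Shard()]
docs = [
    (0, [1.0, 0.0], {"category": "books", "in_stock": True}),
    (1, [0.9, 0.1], {"category": "books", "in_stock": True}),
    (2, [0.0, 1.0], {"category": "books", "in_stock": True}),
    (3, [1.0, 0.0], {"category": "toys", "in_stock": True}),
    (4, [0.8, 0.2], {"category": "books", "in_stock": False}),
]
for doc_id, vec, meta in docs:
    shards[shard_for(doc_id, len(shards))].upsert(doc_id, vec, meta)

wanted = {"category": "books", "in_stock": True}
before = scatter_gather(shards, [1.0, 0.0], 2, wanted)  # [(1.0, 0), (0.9, 1)]

shards[shard_for(0, len(shards))].delete(0)
after = scatter_gather(shards, [1.0, 0.0], 2, wanted)   # [(0.9, 1), (0.0, 2)]
```

Filtering before scoring keeps results exact but can get expensive with an approximate index when the filter is very selective; filtering after retrieval is cheaper but may return fewer than k matches. `recall_at_k` is the kind of measurement you would run against an exact baseline like this one when tuning index parameters against the latency budget.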

