Code Room
System design · Medium
Question
Design a vector-search service backing a semantic product/document retrieval system: 200M embeddings (768-dim), 10k QPS of top-50 nearest-neighbor queries with a 50ms p99 budget, frequent inserts/deletes as the corpus changes, and a requirement to filter by structured metadata (category, language, in-stock) at query time. Cover the index choice, sharding, how you handle updates and metadata filtering, and the recall/latency trade-off.
What a strong answer looks like
Clarify scale and constraints first. Propose a clean component breakdown, then go deep on the hard parts — data model, bottlenecks, consistency, failure modes — and name the trade-offs you are making.
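As one way to start on "clarify scale and constraints first", the numbers in the question can be turned into a back-of-envelope sizing. The sketch below is illustrative only. The vector count, dimension, and QPS come from the question. Everything else is an assumption: float32 storage (4 bytes per component), a hypothetical 64 GB per-node memory budget for vectors, a hypothetical 96-byte compressed code per vector, and a scatter-gather layout in which every query fans out to every shard.

```python
import math

NUM_VECTORS = 200_000_000  # from the question
DIM = 768                  # from the question
TOTAL_QPS = 10_000         # from the question


def vector_bytes(num_vectors: int, components: int, bytes_per_component: int = 4) -> int:
    """Total bytes for the stored vectors (float32 by default, an assumption)."""
    return num_vectors * components * bytes_per_component


def shards_needed(total_bytes: int, bytes_per_node: int) -> int:
    """Minimum shard count so each shard's vectors fit in one node's budget."""
    return math.ceil(total_bytes / bytes_per_node)


def qps_per_replica(total_qps: float, replicas_per_shard: int) -> float:
    """Under scatter-gather every shard sees every query, so only
    replicas (not shards) divide the query load."""
    return total_qps / replicas_per_shard


if __name__ == "__main__":
    node_budget = 64 * 10**9  # hypothetical per-node budget for vectors
    raw = vector_bytes(NUM_VECTORS, DIM)        # uncompressed float32
    coded = vector_bytes(NUM_VECTORS, 96, 1)    # hypothetical 96-byte codes
    print(f"raw float32 vectors: {raw / 1e9:.1f} GB")
    print(f"96-byte codes:       {coded / 1e9:.1f} GB")
    print(f"shards for raw data: {shards_needed(raw, node_budget)}")
    print(f"QPS per replica (4 replicas): {qps_per_replica(TOTAL_QPS, 4):.0f}")
```

Under these assumptions the raw vectors alone come to roughly 614 GB, which is why the index choice (uncompressed graph versus compressed codes) and the shard count have to be discussed together. Index overhead, metadata, and headroom for inserts are not included and would push the real numbers higher.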