System design · Hard · sd-g355

Subject: RAG LLM infra
Level: Senior–Staff
Time: ~50 min
Common in: ML systems · Databases & SQL interviews
Industries: Technology, Software development

Question

Design a RAG backend for a news-and-research assistant where freshness is the product: a document published two minutes ago must be retrievable, and answers must prefer recent, authoritative sources. Corpus is ~200M documents growing by ~500k/day, queries are ~2k QPS with a 400ms retrieval budget before the LLM call. Walk through the ingestion-to-retrievable pipeline that keeps the index fresh, how you combine lexical and dense retrieval, and how a re-ranking stage balances semantic relevance against recency and source authority without re-embedding the whole corpus on every edit.
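To make the last part of the question concrete, here is a minimal sketch of one possible shape for it: reciprocal rank fusion (RRF) to merge the lexical and dense result lists, followed by a re-rank score that mixes semantic relevance with a recency decay and a source-authority prior. Everything named here is an assumption for illustration, not part of the question: the `Candidate` fields, the RRF constant `k=60`, the six-hour half-life, and the three weights would all need tuning against real relevance judgments.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    doc_id: str
    semantic: float     # hypothetical relevance score in [0, 1]
    age_seconds: float  # query time minus publish time
    authority: float    # hypothetical per-source prior in [0, 1]


def rrf_fuse(lexical, dense, k=60):
    """Merge two ranked lists of doc ids with reciprocal rank fusion."""
    scores = {}
    for ranking in (lexical, dense):
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Sort by fused score, breaking ties by doc id for determinism.
    return sorted(scores, key=lambda d: (-scores[d], d))


def recency(age_seconds, half_life_s=6 * 3600):
    """Exponential decay: 1.0 at publish time, 0.5 after one half-life."""
    return 0.5 ** (age_seconds / half_life_s)


def rerank_score(c, w_sem=0.6, w_rec=0.25, w_auth=0.15):
    """Weighted blend; the weights are illustrative assumptions."""
    return (w_sem * c.semantic
            + w_rec * recency(c.age_seconds)
            + w_auth * c.authority)


def rerank(candidates):
    return sorted(candidates, key=rerank_score, reverse=True)
```

In this sketch recency and authority are plain metadata read at query time, so changing a source's authority or letting a document age never touches its embedding. Only a change to the document text itself would require re-embedding that one document.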

What a strong answer looks like

Clarify scale and constraints first. Propose a clean component breakdown, then go deep on the hard parts — data model, bottlenecks, consistency, failure modes — and name the trade-offs you are making.
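As an example of clarifying scale first, the numbers in the question can be turned into a quick back-of-envelope estimate. The sketch below uses only the stated figures plus two labeled assumptions that the question does not give: one 768-dimensional float32 vector per document (real systems usually chunk documents, which multiplies this), and no compression or quantization.

```python
SECONDS_PER_DAY = 86_400

# Figures stated in the question.
corpus_docs = 200_000_000
new_docs_per_day = 500_000
query_qps = 2_000

# Assumptions for illustration only (not given in the question).
embedding_dim = 768
bytes_per_float = 4
vectors_per_doc = 1

# Average ingest rate; real news traffic is bursty, so peaks will be higher.
ingest_docs_per_sec = new_docs_per_day / SECONDS_PER_DAY

# Read-to-write ratio: how many queries arrive per newly ingested document.
queries_per_new_doc = query_qps / ingest_docs_per_sec

# Raw vector storage for the current corpus, in decimal gigabytes.
vector_gb = (corpus_docs * vectors_per_doc
             * embedding_dim * bytes_per_float) / 1e9

# Corpus size after one year at the stated growth rate.
docs_after_one_year = corpus_docs + 365 * new_docs_per_day

print(f"ingest: {ingest_docs_per_sec:.1f} docs/s")
print(f"queries per new doc: {queries_per_new_doc:.1f}")
print(f"raw vectors: {vector_gb:.1f} GB")
print(f"docs after one year: {docs_after_one_year:,}")
```

Under these assumptions the average write rate is small next to the read rate, which suggests the hard part is the two-minute publish-to-retrievable latency rather than raw ingest throughput.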
