Code Room
System design · Hard · sd-g073
Subject: Inverted index · Level: Senior–Staff · ~50 min
Common in: Databases & SQL · Distributed systems interviews
Industries: Technology, Software development

Question

Design the indexing subsystem for a code-search engine over 100M repositories (~2B files, 50 TB of source). It must support exact substring and regex-like matches plus symbol search, make pushed commits searchable within minutes, and absorb a churn of 30% of files changing every week. Query latency must stay under 500 ms at p95 across the full corpus.
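
As a starting point for the scale discussion, the figures in the question can be turned into rough per-file and per-second numbers. The sketch below uses only the stated inputs. The assumptions are mine, not part of the question: 50 TB is read as decimal terabytes, changed files are assumed to be of average size, and churn is assumed to be spread evenly over the week (real push traffic is bursty, so peak rates would be higher).

```python
# Back-of-envelope sizing from the numbers in the question.
# Assumptions (not in the question): decimal TB, average-sized changed
# files, and churn spread evenly across the week.
REPOS = 100_000_000
FILES = 2_000_000_000
CORPUS_BYTES = 50 * 10**12
WEEKLY_CHURN = 0.30
SECONDS_PER_WEEK = 7 * 24 * 3600

avg_file_bytes = CORPUS_BYTES / FILES                  # 25 KB per file
files_per_repo = FILES / REPOS                         # 20 files per repo
changed_files_per_week = FILES * WEEKLY_CHURN          # 600M files
changed_files_per_sec = changed_files_per_week / SECONDS_PER_WEEK
reindex_bytes_per_sec = changed_files_per_sec * avg_file_bytes

print(f"avg file size:       {avg_file_bytes / 1e3:.0f} KB")
print(f"files per repo:      {files_per_repo:.0f}")
print(f"changed files/sec:   {changed_files_per_sec:.0f}")
print(f"re-index throughput: {reindex_bytes_per_sec / 1e6:.1f} MB/s")
```

Under these assumptions the sustained ingest load is on the order of a thousand changed files per second and roughly 25 MB/s of source to re-index, which suggests the harder constraints are index update cost and query fan-out rather than raw ingest bandwidth.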

What a strong answer looks like

Clarify scale and constraints first. Propose a clean component breakdown, then go deep on the hard parts — data model, bottlenecks, consistency, failure modes — and name the trade-offs you are making.
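
For the data-model part of the answer, one common approach to substring search over code is a trigram inverted index: map every 3-character sequence to the set of files containing it, intersect the posting lists for the query's trigrams to get candidates, then verify each candidate against the file text. The sketch below is a single-process toy under stated assumptions, not a reference design. `TrigramIndex` and its methods are hypothetical names, file contents are held in memory, and sharding, compression, and persistence are left out.

```python
from collections import defaultdict
from typing import Dict, List, Set


def trigrams(text: str) -> Set[str]:
    """Return every distinct 3-character window in text."""
    return {text[i:i + 3] for i in range(len(text) - 2)}


class TrigramIndex:
    """Toy in-memory trigram index (hypothetical, for illustration only)."""

    def __init__(self) -> None:
        self.postings: Dict[str, Set[str]] = defaultdict(set)  # trigram -> doc ids
        self.docs: Dict[str, str] = {}                          # doc id -> content

    def remove(self, doc_id: str) -> None:
        """Drop a document and its postings; a no-op if it is not indexed."""
        old = self.docs.pop(doc_id, None)
        if old is None:
            return
        for gram in trigrams(old):
            ids = self.postings.get(gram)
            if ids is not None:
                ids.discard(doc_id)
                if not ids:
                    del self.postings[gram]

    def add(self, doc_id: str, text: str) -> None:
        """Index a document, replacing any earlier version (models a push)."""
        self.remove(doc_id)
        self.docs[doc_id] = text
        for gram in trigrams(text):
            self.postings[gram].add(doc_id)

    def search(self, needle: str) -> List[str]:
        """Return sorted ids of documents containing needle (case-sensitive)."""
        grams = trigrams(needle)
        if not grams:
            # Needles shorter than 3 characters cannot be filtered: scan all.
            candidates = set(self.docs)
        else:
            # Intersect posting lists, smallest first, to shrink work early.
            lists = sorted((self.postings.get(g, set()) for g in grams), key=len)
            candidates = set(lists[0])
            for ids in lists[1:]:
                candidates &= ids
        # Trigram hits are necessary but not sufficient, so verify each one.
        return sorted(d for d in candidates if needle in self.docs[d])


index = TrigramIndex()
index.add("a.py", "def parse_config(path):")
index.add("b.py", "class ConfigParser:")
print(index.search("Config"))
print(index.search("onfig"))
```

The verification step matters because sharing all of a query's trigrams does not guarantee the substring is present. In a design at the question's scale, the same structure would typically be sharded, and the re-add on change shown in `add` is where the weekly churn becomes an index-maintenance cost worth discussing.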
