Code Room
System design (Hard)
Question
Design the indexing subsystem for a code-search engine over 100M repositories (~2B files, 50TB of source). It must support exact substring and regex-style matches plus symbol search, make pushes searchable within minutes of a commit, and cope with roughly 30% of files changing every week. Query latency must stay under 500ms at p95 across the whole corpus.
What a strong answer looks like
Clarify scale and constraints first. Propose a clean component breakdown, then go deep on the hard parts — data model, bottlenecks, consistency, failure modes — and name the trade-offs you are making.
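As one illustration of the "clarify scale" step, here is a minimal back-of-envelope sketch in Python that uses only the figures stated in the question. The assumptions are mine, not the prompt's: terabytes are decimal, changes are spread evenly across the week, and changed files have the corpus-average size. The function name `scale_estimates` is hypothetical.

```python
# Back-of-envelope sizing from the numbers given in the question.
# Assumptions (not in the prompt): decimal TB, churn spread evenly over
# the week, changed files are of average size.

REPOS = 100_000_000
FILES = 2_000_000_000
CORPUS_BYTES = 50 * 10**12        # 50 TB, decimal
WEEKLY_CHURN_PERCENT = 30
SECONDS_PER_WEEK = 7 * 24 * 3600  # 604,800


def scale_estimates():
    """Return rough averages an interviewee might state up front."""
    avg_file_bytes = CORPUS_BYTES / FILES
    changed_files_per_week = FILES * WEEKLY_CHURN_PERCENT // 100
    changed_files_per_sec = changed_files_per_week / SECONDS_PER_WEEK
    return {
        "files_per_repo": FILES / REPOS,
        "avg_file_bytes": avg_file_bytes,
        "changed_files_per_week": changed_files_per_week,
        "changed_files_per_sec": changed_files_per_sec,
        "changed_bytes_per_sec": changed_files_per_sec * avg_file_bytes,
    }


if __name__ == "__main__":
    for name, value in scale_estimates().items():
        print(f"{name}: {value:,.0f}")
```

Under these assumptions the corpus averages 20 files per repository and 25 KB per file, and the ingest path has to re-index roughly 1,000 files per second (about 25 MB/s) on average. Real traffic is bursty, so a design would likely need headroom well above these averages.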