Code Room
System design · Hard
Question
Design a video search and scene-indexing system that lets users search within videos by a natural-language description of what happens (for example, 'the part where someone opens a red door' or 'goal celebration') across a corpus of 500M videos. The system must detect shots and scenes, generate searchable representations of visual, audio, and spoken content, and return ranked timestamps (deep links into the video) at query time with sub-second latency, while keeping the cost of indexing and reindexing manageable.
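To make "ranked timestamps" concrete, here is a minimal sketch of the query-time contract, assuming each detected shot already has an embedding in the same vector space as the query. The `Segment` type, the `search` function, and the toy 2-dimensional vectors are all hypothetical. The brute-force scan only stands in for whatever approximate nearest-neighbor index a real design would use at 500M-video scale.

```python
import math
from dataclasses import dataclass


@dataclass
class Segment:
    """One indexed shot/scene (hypothetical schema for illustration)."""
    video_id: str
    start_s: float          # deep-link offset into the video, in seconds
    end_s: float
    embedding: list[float]  # assumed to share a vector space with the query


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb) if na and nb else 0.0


def search(query_embedding: list[float], segments: list[Segment], k: int = 3):
    """Return the top-k (video_id, start_s, score) hits, best first.

    Brute-force scan: fine for a toy corpus, a placeholder for an ANN index.
    """
    scored = [(cosine(query_embedding, s.embedding), s) for s in segments]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [(s.video_id, s.start_s, round(score, 3)) for score, s in scored[:k]]


# Toy corpus with made-up embeddings.
corpus = [
    Segment("v1", 12.0, 18.5, [1.0, 0.0]),
    Segment("v1", 40.0, 44.0, [0.0, 1.0]),
    Segment("v2", 75.5, 81.0, [1.0, 1.0]),
]
hits = search([1.0, 0.0], corpus, k=2)
```

Each hit carries a video identifier and a start offset, which is enough to build a deep link. How the offset is encoded in a URL depends on the player and is left open here.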
What a strong answer looks like
Clarify scale and constraints first. Propose a clean component breakdown, then go deep on the hard parts — data model, bottlenecks, consistency, failure modes — and name the trade-offs you are making.
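As one way to clarify scale, a back-of-envelope estimate of index size can anchor the discussion. Only the 500M video count comes from the question. The average video length (300 s), average shot length (5 s), embedding width (512 dimensions), and storage per dimension are assumptions chosen for illustration and should be confirmed with the interviewer.

```python
def index_size_estimate(num_videos: int, avg_video_s: int, avg_shot_s: int,
                        dim: int, bytes_per_dim: int) -> tuple[int, int]:
    """Return (total_shots, total_embedding_bytes) for one vector per shot."""
    shots_per_video = avg_video_s // avg_shot_s
    total_shots = num_videos * shots_per_video
    total_bytes = total_shots * dim * bytes_per_dim
    return total_shots, total_bytes


# Assumed inputs (only num_videos is given by the question).
shots, fp32_bytes = index_size_estimate(
    num_videos=500_000_000, avg_video_s=300, avg_shot_s=5,
    dim=512, bytes_per_dim=4,   # float32 vectors
)
_, int8_bytes = index_size_estimate(
    num_videos=500_000_000, avg_video_s=300, avg_shot_s=5,
    dim=512, bytes_per_dim=1,   # 8-bit quantized vectors
)

print(f"shots: {shots:,}")
print(f"float32 vectors: {fp32_bytes / 1e12:.2f} TB")
print(f"int8 vectors:    {int8_bytes / 1e12:.2f} TB")
```

Under these assumptions the corpus holds about 30 billion shots, and raw float32 vectors alone come to roughly 61 TB. That is the kind of number that motivates the sharding, quantization, and tiered-storage trade-offs a strong answer should name.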