Code Room
System design · Hard
Question
Design a content-matching / copyright-ID system (Content-ID style) that fingerprints every uploaded video and audio track and matches it against a reference catalog of tens of millions of copyrighted works. Every new upload (hundreds of hours of content per minute) must be checked; partial and modified matches (pitch-shifted audio, cropped or recompressed video, a 10-second clip inside a 30-minute vlog) must be caught; and a match must trigger a claim. Design the fingerprinting, the index, and the matching pipeline.
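One possible way to picture the index and partial-match requirement is an inverted index with offset voting. This is only a sketch. Fingerprint hashes map to (reference, offset) postings, and a query votes for the (reference, time-shift) pair that the most hashes agree on. A short clip buried in a long upload still produces a sharp peak at one shift. The class `FingerprintIndex`, its methods, the `min_votes` threshold, and the string "hashes" are all hypothetical stand-ins. Real fingerprint extraction, and robustness to pitch shifting or cropping, are not implemented here.

```python
from collections import Counter, defaultdict


class FingerprintIndex:
    """Toy inverted index: fingerprint hash -> [(ref_id, ref_offset)]."""

    def __init__(self):
        self._postings = defaultdict(list)

    def add_reference(self, ref_id, hashes):
        # hashes: iterable of (offset, hash) pairs for one reference work.
        for offset, h in hashes:
            self._postings[h].append((ref_id, offset))

    def match(self, query_hashes, min_votes=5):
        # Vote for (ref_id, ref_offset - query_offset). Hashes from a true
        # match agree on one time shift; random collisions scatter.
        votes = Counter()
        for q_offset, h in query_hashes:
            for ref_id, r_offset in self._postings.get(h, ()):
                votes[(ref_id, r_offset - q_offset)] += 1

        # Keep the best-supported shift per reference above the threshold.
        best = {}
        for (ref_id, delta), n in votes.items():
            if n >= min_votes and n > best.get(ref_id, (0, 0))[0]:
                best[ref_id] = (n, delta)
        return sorted(
            ((ref_id, n, delta) for ref_id, (n, delta) in best.items()),
            key=lambda t: -t[1],
        )


# Assumed toy data: one hash per second, hashes are placeholder strings.
index = FingerprintIndex()
index.add_reference("song", [(i, f"song-{i}") for i in range(100)])

# A 30-minute "vlog" (1800 s) with 10 s of the song (its seconds 40..49)
# embedded starting at second 600.
upload = [(t, f"vlog-{t}") for t in range(1800)]
for k in range(10):
    upload[600 + k] = (600 + k, f"song-{40 + k}")

result = index.match(upload)
print(result)
```

In this toy setup the only candidate is the song, with ten agreeing hashes at a shift of -560 seconds (reference second 40 aligned with upload second 600). A claim would then be raised for the matched span, subject to whatever policy thresholds the design chooses.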
What a strong answer looks like
Clarify scale and constraints first. Propose a clean component breakdown, then go deep on the hard parts — data model, bottlenecks, consistency, failure modes — and name the trade-offs you are making.
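As an illustration of clarifying scale first, a back-of-envelope estimate might look like the sketch below. Every number is an assumption chosen for the arithmetic, not a figure from the question: 500 upload hours per minute (the prompt says only "hundreds"), 10 fingerprint hashes per second of media, 50 million reference works (the prompt says "tens of millions") averaging 4 minutes each, and 12 bytes per index posting.

```python
# All constants below are illustrative assumptions, not measured values.
UPLOAD_HOURS_PER_MINUTE = 500        # assumed; the prompt says "hundreds"
HASHES_PER_MEDIA_SECOND = 10         # assumed fingerprint density
REFERENCE_WORKS = 50_000_000         # assumed; the prompt says "tens of millions"
AVG_REFERENCE_SECONDS = 240          # assumed 4-minute average work
BYTES_PER_POSTING = 12               # assumed (hash key + ref id + offset)

# Seconds of media arriving per wall-clock second.
media_seconds_per_second = UPLOAD_HOURS_PER_MINUTE * 3600 // 60

# Index lookups per second if every hash of every upload is probed.
lookups_per_second = media_seconds_per_second * HASHES_PER_MEDIA_SECOND

# Size of the reference index in postings and bytes.
total_postings = REFERENCE_WORKS * AVG_REFERENCE_SECONDS * HASHES_PER_MEDIA_SECOND
index_bytes = total_postings * BYTES_PER_POSTING

print(f"media seconds ingested per second: {media_seconds_per_second:,}")
print(f"index lookups per second:          {lookups_per_second:,}")
print(f"reference postings:                {total_postings:,}")
print(f"index size (TB, decimal):          {index_bytes / 1e12:.2f}")
```

Under these assumptions the system ingests 30,000 seconds of media per second, issues about 300,000 index lookups per second, and holds roughly 1.44 TB of postings. That points toward a sharded index rather than a single machine. Changing any assumed constant shifts these figures proportionally.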