Code Room
System design · Medium
Question
Design the automatic caption/subtitle generation and alignment pipeline for a video platform that must produce time-synced captions (and translations into 30+ languages) for every uploaded video, including live streams where captions must appear within ~3s of the spoken word. Captions must be accurately word-timed for highlighting, support creator edits without re-running ASR, and handle a backlog of millions of pre-existing videos.
What a strong answer looks like
Clarify scale and constraints first. Propose a clean component breakdown, then go deep on the hard parts — data model, bottlenecks, consistency, failure modes — and name the trade-offs you are making.
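As one illustration of the data-model part, here is a minimal sketch of how creator edits might be stored as an overlay on immutable ASR output, so that an edit never triggers a new ASR run. Everything in it is an assumption made for illustration: the names Word, Edit, retime and apply_edits are hypothetical, and spreading the replaced time span across the new words in proportion to their character length is a simple stand-in for a real forced-alignment step.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Word:
    text: str
    start_ms: int
    end_ms: int


@dataclass(frozen=True)
class Edit:
    """A creator edit replacing ASR words first..last (inclusive) with new_text."""
    first: int
    last: int
    new_text: str


def retime(span_start: int, span_end: int, new_text: str) -> List[Word]:
    """Spread the replaced time span over the new words by character length.

    This is an approximation (assumed here), not forced alignment.
    """
    tokens = new_text.split()
    if not tokens:
        return []  # the edit deleted the words
    total_chars = sum(len(t) for t in tokens)
    duration = span_end - span_start
    words = []
    consumed = 0
    for token in tokens:
        start = span_start + duration * consumed // total_chars
        consumed += len(token)
        end = span_start + duration * consumed // total_chars
        words.append(Word(token, start, end))
    return words


def apply_edits(asr_words: List[Word], edits: List[Edit]) -> List[Word]:
    """Render the displayed track from the untouched ASR words plus edits."""
    out: List[Word] = []
    cursor = 0
    for edit in sorted(edits, key=lambda e: e.first):
        if edit.first < cursor or edit.last < edit.first or edit.last >= len(asr_words):
            raise ValueError("edits must be in range and must not overlap")
        out.extend(asr_words[cursor:edit.first])
        out.extend(retime(asr_words[edit.first].start_ms,
                          asr_words[edit.last].end_ms,
                          edit.new_text))
        cursor = edit.last + 1
    out.extend(asr_words[cursor:])
    return out


asr = [Word("helo", 0, 400), Word("wrld", 400, 1000), Word("today", 1000, 1500)]
fixed = apply_edits(asr, [Edit(0, 1, "hello world")])
print(fixed)
```

Because the ASR words are never mutated, the same edit list can be replayed after a model upgrade or reverted by the creator. The trade-off is that edits are keyed by word index, so a re-run of ASR that changes the word count would need the edits re-anchored (for example by timestamp).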