System design · Hard · sd-g591

Subject: ML content moderation · Level: Senior–Staff · ~55 min · Common in ML systems interviews · Industries: Technology

Question

Design a content-moderation ML pipeline for a platform ingesting 50M pieces of user content/day (text, images, video) that must detect policy violations (hate, violence, CSAM, spam) with both proactive (pre-publish) and reactive (post-report) paths. Pre-publish scoring for text/images must complete within 300ms; video can be asynchronous. Cover the model architecture, the human-review loop, and how you trade off catching harm against over-removal.
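The numbers in the question can be turned into rough request rates before any design work starts. The sketch below is a back-of-envelope estimate only: the 3x peak-to-average ratio is an assumption (the question does not give one), and it treats all 50M items as if they hit the synchronous path, which overstates the load because video is allowed to be asynchronous.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def average_rate(items_per_day: int) -> float:
    """Average arrival rate in items per second."""
    return items_per_day / SECONDS_PER_DAY


def peak_rate(items_per_day: int, peak_to_average: float = 3.0) -> float:
    """Peak arrival rate; the 3x default is an assumed ratio, not a given."""
    return average_rate(items_per_day) * peak_to_average


def in_flight(rate_per_s: float, latency_s: float) -> float:
    """Little's law: mean concurrent requests = arrival rate x time in system."""
    return rate_per_s * latency_s


ITEMS_PER_DAY = 50_000_000   # from the question
LATENCY_BUDGET_S = 0.300     # pre-publish budget for text/images, from the question

if __name__ == "__main__":
    avg = average_rate(ITEMS_PER_DAY)
    peak = peak_rate(ITEMS_PER_DAY)
    print(f"average: {avg:.0f} items/s")
    print(f"assumed peak: {peak:.0f} items/s")
    print(f"in flight at peak: {in_flight(peak, LATENCY_BUDGET_S):.0f} requests")
```

Under these assumptions the average works out to roughly 580 items per second. In an interview, the useful follow-up is to ask for the actual text/image/video split and the real peak ratio, since both change the sizing.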

What a strong answer looks like

Clarify scale and constraints first. Propose a clean component breakdown, then go deep on the hard parts — data model, bottlenecks, consistency, failure modes — and name the trade-offs you are making.
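One way to make the harm-versus-over-removal trade-off concrete is a per-policy, two-threshold router: high-confidence scores are blocked automatically, mid-range scores go to human review, and the rest are published. This is a minimal sketch, not a prescribed solution; the `Action`, `Thresholds`, and `route` names and every threshold value are hypothetical, chosen only to show that severe categories can be given lower thresholds (favoring recall) than spam (favoring precision).

```python
from dataclasses import dataclass
from enum import Enum


class Action(Enum):
    ALLOW = "allow"
    HUMAN_REVIEW = "human_review"
    BLOCK = "block"


@dataclass(frozen=True)
class Thresholds:
    review: float  # score at or above this goes to a human reviewer
    block: float   # score at or above this is blocked automatically


# Hypothetical values for illustration; real ones would be tuned on labeled data.
POLICY_THRESHOLDS = {
    "csam": Thresholds(review=0.05, block=0.50),
    "violence": Thresholds(review=0.40, block=0.90),
    "hate": Thresholds(review=0.50, block=0.92),
    "spam": Thresholds(review=0.80, block=0.95),
}


def route(scores: dict[str, float]) -> Action:
    """Return the most restrictive action triggered by any policy score."""
    decision = Action.ALLOW
    for policy, score in scores.items():
        limits = POLICY_THRESHOLDS[policy]
        if score >= limits.block:
            return Action.BLOCK
        if score >= limits.review:
            decision = Action.HUMAN_REVIEW
    return decision


if __name__ == "__main__":
    print(route({"spam": 0.50, "hate": 0.10}))
    print(route({"csam": 0.10}))
    print(route({"spam": 0.97}))
```

Lowering a `review` threshold catches more harm at the cost of a larger human-review queue; raising a `block` threshold reduces wrongful removals at the cost of slower action on true violations. Naming which of those costs you accept for each policy is the trade-off the question asks for.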
