Code Room
System design · Hard · sd-g678
Subject: Security
Level: Senior–Staff
Time: ~45 min
Common in: Security interviews
Industries: Technology, IT services

Question

Design a data-loss-prevention (DLP) scanner for a large enterprise SaaS. It inspects every outbound document, message, and API payload (200K events/s, files up to 1 GB) for sensitive data (PII, PCI card numbers, source code, credentials) and blocks or quarantines policy violations inline, adding under 300 ms of latency to interactive flows. Threat model: insiders exfiltrating data slowly, and accidental leaks via misconfigured shares. The scanner must support content-aware policies (exact-data-match against a customer's own datasets) and minimize false positives that would break legitimate work. Cover the classification pipeline, the exact-match index, and how you handle large files and encryption.

What a strong answer looks like

Clarify scale and constraints first. Propose a clean component breakdown, then go deep on the hard parts — data model, bottlenecks, consistency, failure modes — and name the trade-offs you are making.
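As an illustration of going deep on one hard part, here is a minimal sketch of an exact-data-match index. It rests on assumptions the question does not state: the customer's sensitive values are stored only as keyed fingerprints (HMAC-SHA256 under a per-tenant key), and outbound text is split into simple tokens before lookup. The names `ExactMatchIndex`, `normalize`, `add`, and `scan` are hypothetical, and a production design would likely add multi-column matching and a compact probabilistic pre-filter.

```python
import hashlib
import hmac
import re
from typing import List


def normalize(value: str) -> str:
    """Lowercase and drop everything except ASCII letters and digits."""
    return re.sub(r"[^0-9a-z]", "", value.lower())


class ExactMatchIndex:
    """Hypothetical per-tenant index holding keyed fingerprints, never raw values."""

    def __init__(self, key: bytes) -> None:
        self._key = key              # assumed: a secret key scoped to one tenant
        self._fingerprints = set()   # set of 32-byte HMAC digests

    def _fingerprint(self, value: str) -> bytes:
        data = normalize(value).encode("utf-8")
        return hmac.new(self._key, data, hashlib.sha256).digest()

    def add(self, value: str) -> None:
        """Register one sensitive value from the customer's dataset."""
        if normalize(value):
            self._fingerprints.add(self._fingerprint(value))

    def scan(self, text: str) -> List[str]:
        """Return the tokens in text whose fingerprint is in the index."""
        hits = []
        # A token starts with a letter or digit and may contain hyphens,
        # so "123-45-6789" is examined as a single candidate.
        for token in re.findall(r"[0-9A-Za-z][0-9A-Za-z\-]*", text):
            if self._fingerprint(token) in self._fingerprints:
                hits.append(token)
        return hits


index = ExactMatchIndex(key=b"example-tenant-key")
index.add("123-45-6789")
print(index.scan("Employee SSN 123-45-6789 attached"))
print(index.scan("Ticket 123-45-6780 closed"))
```

Matching on exact fingerprints rather than on patterns is one way to keep false positives low: a number that merely looks like an identifier does not match unless it is in the customer's dataset. The trade-off in this sketch is that values split across tokens (for example, digits separated by spaces) are missed unless the scanner also fingerprints sliding windows of adjacent tokens.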
