Code Room
System design · Medium
Question
Design a webhook delivery system for a SaaS platform that must dispatch ~50M events/day to ~200K customer-registered HTTPS endpoints. Endpoints are flaky (third-party servers with p99 latency up to 8s and intermittent 5xx). Requirements: at-least-once delivery, retries with exponential backoff over up to 24h, per-endpoint ordering is NOT required, signed payloads, and a customer-visible delivery log. A single slow customer endpoint must not block delivery to everyone else.
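The "signed payloads" requirement can be made concrete with a small sketch. The question does not specify a scheme, so the following is one common approach and only an assumption here: HMAC-SHA256 over a timestamp plus the raw body, using a per-endpoint shared secret. The function names, the `timestamp.body` layout, and the 300-second replay tolerance are all hypothetical choices for illustration.

```python
import hashlib
import hmac


def sign_payload(secret: bytes, body: bytes, timestamp: int) -> str:
    """Return a hex HMAC-SHA256 over "<timestamp>.<body>" (assumed layout)."""
    signed = str(timestamp).encode() + b"." + body
    return hmac.new(secret, signed, hashlib.sha256).hexdigest()


def verify_signature(secret: bytes, body: bytes, timestamp: int,
                     signature: str, now: int, tolerance_s: int = 300) -> bool:
    """Receiver-side check: reject stale timestamps, then compare digests."""
    # Binding the timestamp into the signature limits replay of old deliveries.
    if abs(now - timestamp) > tolerance_s:
        return False
    expected = sign_payload(secret, body, timestamp)
    # compare_digest runs in constant time, avoiding a timing side channel.
    return hmac.compare_digest(expected, signature)
```

Because delivery is at-least-once, a receiver would likely pair this check with deduplication on an event ID, since a retried delivery carries a valid signature every time.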
What a strong answer looks like
Clarify scale and constraints first. Propose a clean component breakdown, then go deep on the hard parts — data model, bottlenecks, consistency, failure modes — and name the trade-offs you are making.
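As an illustration of "clarify scale first", the numbers in the question can be turned into a rough sizing and a retry schedule. This is a back-of-the-envelope sketch, not a reference answer: only the 50M events/day, 8s p99, and 24h retry window come from the question. The 30-second base delay, doubling factor, 4-hour cap, and full jitter are assumed values, and peak-to-average traffic is not given, so the rate below is an average only.

```python
import random

EVENTS_PER_DAY = 50_000_000   # from the question
SECONDS_PER_DAY = 86_400
P99_LATENCY_S = 8.0           # from the question

# Average dispatch rate, roughly 579 events/s (peaks will be higher).
avg_rate = EVENTS_PER_DAY / SECONDS_PER_DAY

# Little's law: in-flight requests = arrival rate x time in system.
# Pessimistic bound: pretend every request takes the p99 latency.
worst_case_in_flight = avg_rate * P99_LATENCY_S


def backoff_schedule(base_s: float = 30.0, factor: float = 2.0,
                     cap_s: float = 4 * 3600.0,
                     budget_s: float = 24 * 3600.0) -> list[float]:
    """Capped exponential delays whose sum stays within the retry budget."""
    delays: list[float] = []
    elapsed = 0.0
    delay = base_s
    while elapsed + delay <= budget_s:
        delays.append(delay)
        elapsed += delay
        delay = min(delay * factor, cap_s)
    return delays


def with_jitter(delay_s: float, rng: random.Random) -> float:
    """Full jitter: pick uniformly in [0, delay_s] to spread retry bursts."""
    return rng.uniform(0.0, delay_s)
```

The in-flight estimate is why blocking worker threads per request tend to be a poor fit here: a few thousand concurrent slow calls favor async I/O with strict per-request timeouts, plus a per-endpoint concurrency limit so that one slow customer cannot consume the whole pool.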