Code Room
On-callHardoc-g077
Subject Dns failureLevel Senior–Staff~40 minCommon in Security · Storage & CDN · Networking & APIs interviewsIndustries Technology

Question

At 15:40 your public site becomes unreachable for a large fraction of users worldwide, while a smaller fraction load it fine. Dashboards: your origin and CDN edges are healthy and lightly loaded — almost no traffic is arriving at all. Synthetic checks from several regions fail at the DNS-resolution step ('SERVFAIL'). Your authoritative DNS is hosted by a single managed DNS provider, and that provider's status page reports it is mitigating a large DDoS against its anycast network. Recent context: none on your side; no deploy. How do you triage and mitigate?

What a strong answer looks like

Stop the bleeding first (mitigate), then form hypotheses from real signals. Separate root cause from symptom, communicate status as you go, and close with what prevents a repeat.

Diagram & narrate the incident
Loading whiteboard…
Run or narrate your approach, then ask the coach.