Code Room
On-callMediumoc-g101
Subject Traffic surgeLevel Mid–Senior~35 minCommon in Networking & APIs interviewsIndustries Technology

Question

Tickets for a major concert go on sale at exactly 10:00. At 10:00:00 traffic jumps 50x instantly. Your ALB starts returning 503s, the autoscaler is adding instances but they take ~4 minutes to boot and pass health checks, and during those 4 minutes the small initial fleet is overwhelmed and crashes. The queue page itself is dynamic and hits the same backend as checkout. Nothing is broken in code; you simply got a wall of traffic at a known instant. How do you triage in the moment and design for the next on-sale?

What a strong answer looks like

Stop the bleeding first (mitigate), then form hypotheses from real signals. Separate root cause from symptom, communicate status as you go, and close with what prevents a repeat.

Diagram & narrate the incident
Loading whiteboard…
Run or narrate your approach, then ask the coach.