Code Room
On-call | Hard | oc-g134
Subject: DDoS | Level: Senior–Staff | ~45 min | Common in Reliability & on-call interviews | Industries: Technology, Software development

Question

Your checkout API p99 latency jumps from 180 ms to 9 s and the origin 5xx rate climbs to 30%. The edge (Cloudflare) dashboard shows requests/sec up 14x but bandwidth up only ~2x, which points to a flood of small requests. The flood targets POST /api/cart/apply-coupon with valid-looking JSON, rotating across ~120k residential IPs, each sending 2-3 req/min so that no single IP trips a rate limit. The coupon endpoint does a synchronous DB lookup plus a write to a 'coupon_attempts' table. A marketing 'flash sale' launched 20 minutes ago. How do you triage and mitigate this L7 attack while distinguishing it from legitimate flash-sale traffic?
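A quick sizing check helps frame the numbers in the question. The sketch below uses only the stated figures (~120k IPs at 2-3 req/min each); the per-IP limit of 60 req/min is a hypothetical value chosen for illustration, not something the scenario specifies.

```python
# Back-of-the-envelope sizing of the flood from the figures in the question.

def aggregate_rps(ip_count: int, req_per_min_per_ip: float) -> float:
    """Total requests per second produced by ip_count sources."""
    return ip_count * req_per_min_per_ip / 60.0

IP_COUNT = 120_000                 # "~120k residential IPs" from the question
HYPOTHETICAL_PER_IP_LIMIT = 60     # req/min; assumed, not given in the scenario

low_rps = aggregate_rps(IP_COUNT, 2)    # 4000.0
high_rps = aggregate_rps(IP_COUNT, 3)   # 6000.0

# Each source stays far below the assumed per-IP limit, so per-IP
# rate limiting alone never fires.
per_ip_trips_limit = 3 >= HYPOTHETICAL_PER_IP_LIMIT   # False

print(f"aggregate flood: {low_rps:.0f}-{high_rps:.0f} req/s")
print(f"per-IP limit tripped: {per_ip_trips_limit}")
```

Because every one of those requests triggers a synchronous DB lookup and a write, a few thousand extra requests per second on this one endpoint is enough to explain the latency and 5xx symptoms even though bandwidth barely moves.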

What a strong answer looks like

Stop the bleeding first (mitigate), then form hypotheses from real signals. Separate root cause from symptom, communicate status as you go, and close with what prevents a repeat.
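One way to illustrate the "mitigate first, then separate attack from real users" step is a per-request triage rule that relies on session signals rather than on IP address. This is a minimal sketch under stated assumptions: the `CouponRequest` fields, the `FLASH20` campaign code, and the three actions are all hypothetical, and a real deployment would express equivalent logic in edge rules or application middleware.

```python
from dataclasses import dataclass

# Hypothetical signals; none of these fields come from the scenario itself.
@dataclass(frozen=True)
class CouponRequest:
    has_session_cookie: bool   # request carries an established session
    viewed_cart_page: bool     # session loaded the cart page before posting
    cart_item_count: int       # items in the cart tied to the session
    coupon_code: str           # code submitted in the JSON body

# Assumed: marketing can supply the codes used by the flash sale.
KNOWN_CAMPAIGN_CODES = frozenset({"FLASH20"})

def triage(req: CouponRequest) -> str:
    """Return 'allow', 'throttle', or 'challenge' for one coupon request."""
    # A real shopper has a session and a non-empty cart before applying a coupon.
    if not req.has_session_cookie or req.cart_item_count == 0:
        return "challenge"
    # Posting straight to the API without ever loading the cart page is suspicious.
    if not req.viewed_cart_page:
        return "challenge"
    # Plausible shopper using a code from the live campaign: let it through.
    if req.coupon_code.strip().upper() in KNOWN_CAMPAIGN_CODES:
        return "allow"
    # Plausible shopper, unknown code: serve it, but at a reduced rate.
    return "throttle"

print(triage(CouponRequest(True, True, 2, "flash20")))    # allow
print(triage(CouponRequest(False, False, 0, "SAVE50")))   # challenge
print(triage(CouponRequest(True, True, 1, "SAVE50")))     # throttle
```

The design choice here is to key on behavior that a flash-sale customer exhibits naturally and a distributed bot has to work to fake, which sidesteps the problem that each attacking IP looks harmless on its own.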
