Code Room
On-callHardoc-g491
Subject Api quotaLevel Senior–Staff~35 minCommon in Reliability & on-call interviewsIndustries Technology, Software development

Question

Your service calls a cloud vendor API with a per-minute quota you normally use ~30% of. At 11:40 you start getting 429 'quota exceeded' from the vendor and the affected feature errors at ~50%. Dashboards: your OUTBOUND call rate to the vendor jumped 4x starting at 11:38, but inbound user traffic is flat. Around 11:37 the vendor had a 90-second blip of 503s (now recovered). Your client retries failed vendor calls up to 5 times with a fixed 200ms backoff. No deploy. How do you triage and mitigate?

What a strong answer looks like

Stop the bleeding first (mitigate), then form hypotheses from real signals. Separate root cause from symptom, communicate status as you go, and close with what prevents a repeat.

Diagram & narrate the incident
Loading whiteboard…
Run or narrate your approach, then ask the coach.