Code Room
On-callMediumoc-g326
Subject ThrottlingLevel Mid–Senior~25 minCommon in Reliability & on-call interviewsIndustries Technology

Question

Your backend calls an internal inventory API that enforces a 2,000 rps per-service quota. After a client-library upgrade shipped this morning, your inventory calls start getting throttled (429s) even though your user traffic is flat. Dashboards show your outbound inventory request rate doubled at the exact deploy time — with no increase in incoming user requests. The new client library changed the default: a failed call now retries up to 5 times with no backoff, and a borderline-slow inventory shard causes a trickle of failures. How do you triage and mitigate?

What a strong answer looks like

Stop the bleeding first (mitigate), then form hypotheses from real signals. Separate root cause from symptom, communicate status as you go, and close with what prevents a repeat.

Diagram & narrate the incident
Loading whiteboard…
Run or narrate your approach, then ask the coach.