Code Room
On-callMediumoc-g578
Subject Model serving dependency fallbackLevel Mid–Senior~30 minCommon in ML systems · Code quality & review interviewsIndustries Technology

Question

Your real-time inference service enriches each request by calling a third-party 'identity-risk' API synchronously to build one feature before scoring. At 12:40 your service's p99 latency triples and error rate hits 4%. Tracing shows the time and errors are all in the outbound call to the third-party API, which is returning slow responses and intermittent 503s; their status page later confirms a partial outage. Your own model and feature store are healthy. Triage and mitigate.

What a strong answer looks like

Stop the bleeding first (mitigate), then form hypotheses from real signals. Separate root cause from symptom, communicate status as you go, and close with what prevents a repeat.

Diagram & narrate the incident
Loading whiteboard…
Run or narrate your approach, then ask the coach.