Question
Your production service and a data-science batch pipeline both call a cloud vendor API (e.g. a Vision/Translation API) — and unknown to you, they authenticate under the SAME cloud PROJECT, which holds a single per-project quota. At 06:00 a one-off DS backfill kicks off. Within minutes your production realtime calls start getting HTTP 429 'quota exceeded for quota metric ... per minute per project,' even though your service's OWN call volume is unchanged. Dashboards: your service's call rate is flat; the project-level quota dashboard is pegged at 100%; the spike is entirely from the DS pipeline's project. No deploy on your side. How do you triage and mitigate?
Stop the bleeding first (mitigate), then form hypotheses from real signals. Separate root cause from symptom, communicate status as you go, and close with what prevents a repeat.