Code Room
On-callMediumoc-g063
Subject Config changeLevel Mid–Senior~30 minCommon in Security interviewsIndustries Technology, Software development

Question

At 00:00 UTC, every service-to-service call to the internal 'identity' service starts failing with TLS handshake errors ('certificate has expired'), and login across the whole platform breaks. There was no deploy. Identity's own pods are up and healthy on their dashboards; the failures are all on the client side of mTLS. A cert-rotation automation that normally renews internal certs 30 days early logged a failure 31 days ago that nobody acted on. Triage and mitigate.

What a strong answer looks like

Stop the bleeding first (mitigate), then form hypotheses from real signals. Separate root cause from symptom, communicate status as you go, and close with what prevents a repeat.

Diagram & narrate the incident
Loading whiteboard…
Run or narrate your approach, then ask the coach.