Code Room
On-call · Medium · oc-g598
Subject: Security (TLS cert expiry) · Level: Mid–Senior · ~25 min
Common in: Security · Reliability & on-call interviews
Industries: Technology, IT services

Question

At 00:03 UTC your synthetic monitors go red and the support queue floods with 'your site is not secure' and NET::ERR_CERT_DATE_INVALID screenshots. Browsers and your mobile app both refuse to connect to api.yourapp.com. The load balancers are healthy, CPU is flat, and there was no deploy. Checking the cert with openssl s_client shows notAfter was yesterday at 23:59 UTC. You're on call. How do you triage, restore service, and prevent a repeat?

What a strong answer looks like

Stop the bleeding first (mitigate), then form hypotheses from real signals. Separate root cause from symptom, communicate status as you go, and close with what prevents a repeat.
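For the "prevent a repeat" part, one option is a scheduled check that alerts well before notAfter rather than after it. The following is a minimal Python sketch of that idea, not a reference implementation. The 30-day threshold, the function names, and the three status labels are assumptions made for illustration. The only non-trivial library calls are ssl.cert_time_to_seconds and SSLSocket.getpeercert from the standard library.

```python
import socket
import ssl
import time

WARN_DAYS = 30  # hypothetical alert threshold; tune to your renewal lead time


def seconds_until_expiry(not_after, now=None):
    """Seconds remaining before a certificate's notAfter.

    not_after uses the format returned by getpeercert(),
    e.g. 'Jan  1 00:00:00 2025 GMT'.
    """
    expires_at = ssl.cert_time_to_seconds(not_after)
    if now is None:
        now = time.time()
    return expires_at - now


def classify(not_after, now=None, warn_days=WARN_DAYS):
    """Map remaining lifetime to 'expired', 'expiring', or 'ok'."""
    remaining = seconds_until_expiry(not_after, now)
    if remaining <= 0:
        return "expired"
    if remaining < warn_days * 86400:
        return "expiring"
    return "ok"


def fetch_not_after(host, port=443, timeout=5.0):
    """Read notAfter from the leaf certificate served by host.

    The default context verifies the chain, so an already expired
    certificate raises ssl.SSLCertVerificationError instead of
    returning a date. A monitor should treat that exception as a page.
    """
    context = ssl.create_default_context()
    with socket.create_connection((host, port), timeout=timeout) as sock:
        with context.wrap_socket(sock, server_hostname=host) as tls:
            return tls.getpeercert()["notAfter"]
```

A scheduled job could call classify(fetch_not_after("api.yourapp.com")) and raise a ticket on "expiring" and a page on "expired" or on a verification error. Checking the served certificate from outside, rather than the file on disk, also catches the case where a renewal succeeded but the load balancer was never reloaded.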
