Code Room
On-call · Medium
Question
Customer-support tickets spike: users say their account page shows wrong or blank values for one field (their plan name). The service is up, error rate is normal, and latency is fine — no 500s at all. The only recent change is a deploy two hours ago that touched how that field is read and written. How do you triage a 'no errors but wrong output' incident?
What a strong answer looks like
Stop the bleeding first (mitigate), then form hypotheses from real signals. Separate root cause from symptom, communicate status as you go, and close with what prevents a repeat.
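One way to turn "form hypotheses from real signals" and "separate root cause from symptom" into something concrete is to sample affected accounts and compare three values for the plan name: what is stored, what the page renders, and what a source of truth says it should be. The sketch below is illustrative only. `AccountSample`, `classify`, `summarize`, and every field name are hypothetical, and it assumes you have a trusted source for the expected plan (for example, a billing record) plus a last-write timestamp for the field.

```python
from collections import Counter
from dataclasses import dataclass
from datetime import datetime
from typing import Iterable, Optional


@dataclass
class AccountSample:
    """One sampled account (all fields are hypothetical names)."""
    account_id: str
    stored_plan: Optional[str]    # raw value in the datastore
    rendered_plan: Optional[str]  # value the account page actually shows
    expected_plan: Optional[str]  # value from a trusted source of truth
    last_written: datetime        # last write to the plan field


def classify(sample: AccountSample, deploy_time: datetime) -> str:
    """Label a sample with the hypothesis it supports."""
    if sample.rendered_plan == sample.expected_plan:
        return "ok"
    if sample.stored_plan == sample.expected_plan:
        # Stored data is correct but the page shows something else:
        # points at the read path.
        return "read_path"
    if sample.last_written >= deploy_time:
        # Stored data is wrong and was written after the deploy:
        # points at the write path.
        return "write_path"
    # Stored data is wrong but was written before the deploy:
    # the deploy may not be the (only) cause.
    return "predates_deploy"


def summarize(samples: Iterable[AccountSample], deploy_time: datetime) -> Counter:
    """Count how many samples support each hypothesis."""
    return Counter(classify(s, deploy_time) for s in samples)
```

The split matters for mitigation. If the samples are mostly `read_path`, rolling back the deploy should restore correct output on its own. If they are mostly `write_path`, a rollback only stops new bad writes, and the values already written still need a repair or backfill. A meaningful share of `predates_deploy` is a signal to question the assumption that the deploy is the root cause.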