Code Room
On-call · Medium
Question
A service that talks to a database starts intermittently failing requests with 'timeout acquiring connection from pool' errors, getting worse over a couple of hours after a deploy this morning. The database itself looks healthy (low CPU, normal query times). Traffic is normal. The pool dashboard shows in-use connections pinned at the max while idle connections sit at zero. How do you triage?
What a strong answer looks like
Stop the bleeding first (mitigate), then form hypotheses from real signals. Separate root cause from symptom, communicate status as you go, and close with what prevents a repeat.
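One hypothesis that fits the signals in the question (in-use pinned at max, idle at zero, a healthy database, onset after a deploy) is a connection leak: a code path that acquires a connection and never returns it. The sketch below is a toy illustration of that failure mode, not the service's actual code. `TinyPool`, `leaky_handler`, `safe_handler`, and `run` are hypothetical names, connections are plain integers, and the "every fifth request fails" pattern is an assumption chosen to make the leak visible quickly.

```python
import queue
from contextlib import contextmanager


class TinyPool:
    """Toy fixed-size pool (hypothetical); connections are plain integers."""

    def __init__(self, size):
        self.size = size
        self._idle = queue.Queue()
        for conn_id in range(size):
            self._idle.put(conn_id)

    def acquire(self, timeout=0.01):
        try:
            return self._idle.get(timeout=timeout)
        except queue.Empty:
            raise TimeoutError("timeout acquiring connection from pool")

    def release(self, conn):
        self._idle.put(conn)

    @property
    def idle(self):
        return self._idle.qsize()

    @property
    def in_use(self):
        return self.size - self._idle.qsize()

    @contextmanager
    def connection(self):
        # Guarantees release even if the caller's block raises.
        conn = self.acquire()
        try:
            yield conn
        finally:
            self.release(conn)


def leaky_handler(pool, fail):
    conn = pool.acquire()
    if fail:
        raise ValueError("bad request")  # error path skips release: leak
    pool.release(conn)


def safe_handler(pool, fail):
    with pool.connection():
        if fail:
            raise ValueError("bad request")  # released by the context manager


def run(handler, pool, requests):
    """Replay requests through a handler; return how many hit a pool timeout."""
    timeouts = 0
    for fail in requests:
        try:
            handler(pool, fail)
        except ValueError:
            pass  # ordinary request failure, unrelated to the pool
        except TimeoutError:
            timeouts += 1
    return timeouts


# Assumed traffic shape: every fifth request hits the error path.
requests = [i % 5 == 0 for i in range(40)]

leaky_pool = TinyPool(size=4)
leaky_timeouts = run(leaky_handler, leaky_pool, requests)

safe_pool = TinyPool(size=4)
safe_timeouts = run(safe_handler, safe_pool, requests)

print("leaky:", leaky_pool.in_use, "in use,", leaky_pool.idle, "idle,", leaky_timeouts, "timeouts")
print("safe: ", safe_pool.in_use, "in use,", safe_pool.idle, "idle,", safe_timeouts, "timeouts")
```

In the leaky run each failed request strands one connection, so the pool drains gradually and then every request times out, which matches the "worse over a couple of hours" shape in the question. If a leak like this turns out to be the cause, rolling back the deploy is the mitigation and scoping every acquire with a guaranteed release is the prevention step.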