Code Room
On-call · Hard
Question
A backend service that fans out to a downstream API starts intermittently failing to make outbound connections under peak load: logs show 'cannot assign requested address' (EADDRNOTAVAIL) on connect, and request error rate spikes during traffic peaks then recovers in troughs. The host is not CPU- or memory-bound. `ss -s` shows tens of thousands of sockets in TIME_WAIT toward the downstream's IP:port. A recent change replaced a pooled/keep-alive HTTP client with one that opens a fresh connection per request. The ephemeral port range is the default. Triage and fix.
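To see why a connection-per-request client runs into EADDRNOTAVAIL, it can help to estimate the port budget. The sketch below assumes a Linux host (suggested by the use of `ss`), the usual default ephemeral range of 32768 to 60999, and a 60-second TIME_WAIT; these numbers are assumptions, not values from the scenario, and should be checked on the host with `sysctl net.ipv4.ip_local_port_range`. The function names are illustrative only.

```python
# Rough port-budget estimate for one (source IP, destination IP, destination port) triple.
# ASSUMPTIONS: Linux defaults for the ephemeral range and a 60 s TIME_WAIT.
EPHEMERAL_LOW = 32768
EPHEMERAL_HIGH = 60999
TIME_WAIT_SECONDS = 60


def usable_ports(low=EPHEMERAL_LOW, high=EPHEMERAL_HIGH):
    """Number of source ports available for outbound connections."""
    return high - low + 1


def max_new_connections_per_second(ports, time_wait_seconds):
    """Sustained rate of fresh connections before every port sits in TIME_WAIT.

    Each closed connection holds its source port for time_wait_seconds,
    so the steady-state ceiling is ports / time_wait_seconds (floored).
    """
    return ports // time_wait_seconds


if __name__ == "__main__":
    ports = usable_ports()
    rate = max_new_connections_per_second(ports, TIME_WAIT_SECONDS)
    print(f"{ports} ports, about {rate} new connections/s to one downstream IP:port")
```

Under these assumptions the ceiling is a few hundred new connections per second to a single downstream IP:port, which is consistent with errors appearing only at traffic peaks and clearing in troughs. The estimate ignores any TIME_WAIT reuse the kernel may perform.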
What a strong answer looks like
Stop the bleeding first (mitigate), then form hypotheses from real signals. Separate root cause from symptom, communicate status as you go, and close with what prevents a repeat.
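One way to demonstrate the root cause and the fix side by side is to compare source-port usage with and without connection reuse. The following is a minimal sketch using only the Python standard library, with a loopback HTTP server standing in for the downstream API; the helper names (`source_ports`, `demo`) are hypothetical, and a real service would restore the pooled/keep-alive client of whatever HTTP library it already uses.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer


class _Handler(BaseHTTPRequestHandler):
    """Stand-in for the downstream API (an assumption for this demo)."""
    protocol_version = "HTTP/1.1"  # lets the server keep connections open

    def do_GET(self):
        body = b"ok"
        self.send_response(200)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep the demo output quiet


def source_ports(host, port, n, reuse):
    """Send n GET requests and return the local source port used for each.

    reuse=True  keeps one connection open (keep-alive behaviour).
    reuse=False opens a fresh connection per request (the regression).
    """
    ports = []
    conn = None
    for _ in range(n):
        if conn is None or not reuse:
            if conn is not None:
                conn.close()  # client closes first, so TIME_WAIT lands on the client
            conn = http.client.HTTPConnection(host, port, timeout=5)
        conn.request("GET", "/")
        ports.append(conn.sock.getsockname()[1])
        conn.getresponse().read()  # drain the body so the connection can be reused
    if conn is not None:
        conn.close()
    return ports


def demo(n=5):
    server = ThreadingHTTPServer(("127.0.0.1", 0), _Handler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    host, port = server.server_address
    try:
        reused = source_ports(host, port, n, reuse=True)
        fresh = source_ports(host, port, n, reuse=False)
    finally:
        server.shutdown()
        server.server_close()
    return reused, fresh


if __name__ == "__main__":
    reused, fresh = demo()
    print("keep-alive source ports:", sorted(set(reused)))
    print("fresh-connection source ports:", sorted(set(fresh)))
```

With reuse enabled, every request goes out over the same source port, so no new TIME_WAIT entries accumulate per request. With a fresh connection each time, every request consumes an ephemeral port that then lingers after close, which is the behaviour the recent client change introduced.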