Question
A Node.js API that calls an internal pricing service starts adding intermittent +20ms to +300ms to a fraction of requests, in waves. The slow time is all spent before the first byte of the outbound HTTP request is sent — the pricing service reports fast service times. CPU/memory fine. The recent change: a refactor replaced a long-lived HTTP client (keep-alive, cached connection) with a fresh `fetch`/request per call to a hostname like `pricing.internal.svc`. Your internal DNS is healthy and not erroring. Conntrack and ephemeral ports are fine. How do you triage and what's the fix?
Stop the bleeding first (mitigate), then form hypotheses from real signals. Separate root cause from symptom, communicate status as you go, and close with what prevents a repeat.