Question
A deploy bumps one direct dependency (`http-client` 4.2 → 4.3) for a bugfix. Build and tests pass. After deploy, ~2% of outbound calls to one specific payment partner fail with `tls: failed to verify certificate: x509: certificate signed by unknown authority` — only to that partner, only intermittently, and only from pods that have been up a while. Other partners are fine. Dashboards: outbound error rate to that one host stepped up at deploy time; CPU/mem normal. The direct dep changelog mentions nothing relevant. The lockfile diff shows that `http-client` 4.3 relaxed a constraint and pulled in a *transitive* TLS library upgrade (`tls-core` 1.8 → 2.0), whose 2.0 default dropped a set of legacy intermediate CAs from its bundled trust store. How do you triage and mitigate?
Stop the bleeding first (mitigate), then form hypotheses from real signals. Separate root cause from symptom, communicate status as you go, and close with what prevents a repeat.
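On the "what prevents a repeat" point, one possible CI gate for the lockfile diff described in the question, as an added example that is not part of the original question: it flags major-version changes in packages the manifest does not declare directly. The `name version` line format, the `json-codec` package and all function names are assumptions made for illustration; the other package names and versions are the ones the question gives.

```python
"""Flag transitive major-version bumps between two lockfile snapshots."""

# Constructed sample data. The "name version" line format is an assumption,
# not a real lockfile format; json-codec is a hypothetical extra package.
OLD_LOCK = """\
http-client 4.2
tls-core 1.8
json-codec 3.1.4
"""
NEW_LOCK = """\
http-client 4.3
tls-core 2.0
json-codec 3.1.5
"""
DIRECT_DEPS = {"http-client", "json-codec"}  # what the manifest declares


def parse_lock(text):
    """Parse 'name version' lines into {name: (major, minor, patch)}."""
    pins = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and blank lines
        if not line:
            continue
        name, version = line.split()
        parts = [int(p) for p in version.split(".")]
        pins[name] = tuple(parts + [0] * (3 - len(parts)))  # pad "2.0" to 2.0.0
    return pins


def transitive_major_bumps(old_text, new_text, direct):
    """Return [(name, old, new)] for non-direct packages whose major changed."""
    old, new = parse_lock(old_text), parse_lock(new_text)
    findings = []
    for name in sorted(new):
        if name in direct or name not in old:
            continue  # direct deps are reviewed via the manifest diff
        if new[name][0] != old[name][0]:
            findings.append((name, old[name], new[name]))
    return findings


if __name__ == "__main__":
    for name, before, after in transitive_major_bumps(OLD_LOCK, NEW_LOCK, DIRECT_DEPS):
        show = lambda v: ".".join(map(str, v))
        print(f"REVIEW: transitive {name} {show(before)} -> {show(after)} (major bump)")
```

A non-empty result would block the merge until someone has read the transitive package's changelog, which is where the trust-store change in this scenario was recorded.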