Code Room
On-call
Hard
oc-g270
Subject: Dependency upgrade
Level: Mid–Staff
Duration: ~35 min
Common in: Code quality & review interviews
Industries: Technology, Software development

Question

A deploy bumps one direct dependency (`http-client` 4.2 → 4.3) for a bugfix. Build and tests pass. After deploy, ~2% of outbound calls to one specific payment partner fail with `tls: failed to verify certificate: x509: certificate signed by unknown authority` — only to that partner, only intermittently, and only from pods that have been up a while. Other partners are fine. Dashboards: outbound error rate to that one host stepped up at deploy time; CPU/mem normal. The direct dep changelog mentions nothing relevant. The lockfile diff shows that `http-client` 4.3 relaxed a constraint and pulled in a *transitive* TLS library upgrade (`tls-core` 1.8 → 2.0), whose 2.0 default dropped a set of legacy intermediate CAs from its bundled trust store. How do you triage and mitigate?

What a strong answer looks like

Stop the bleeding first (mitigate), then form hypotheses from real signals. Separate root cause from symptom, communicate status as you go, and close with what prevents a repeat.
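The trust-store mechanism behind the failure in the question, as an added Go example that is not part of the original page: it builds a constructed root, intermediate and leaf certificate chain in memory and replays the client-side verification under each trust-store configuration, so the exact `x509: certificate signed by unknown authority` error can be reproduced offline and the mitigation (shipping an application-controlled trust bundle instead of relying on the library default) can be shown to fix it. The host name, CA names and the contents of the `tls-core` 1.8 and 2.0 stores are stand-ins modelled on the question, and the example assumes the partner's servers present only the leaf certificate, which the question does not state; that assumption is what makes a client-side trust-store change visible as a verification failure.

```go
package main

import (
	"crypto/ecdsa"
	"crypto/elliptic"
	"crypto/rand"
	"crypto/x509"
	"crypto/x509/pkix"
	"errors"
	"fmt"
	"math/big"
	"time"
)

const partnerHost = "payments.partner.example" // hypothetical; the page names no host

// makeCert issues a certificate for cn; with parent == nil it is self-signed.
// Failures here mean a broken demo environment, so they panic.
func makeCert(cn string, isCA bool, parent *x509.Certificate, parentKey *ecdsa.PrivateKey) (*x509.Certificate, *ecdsa.PrivateKey) {
	key, err := ecdsa.GenerateKey(elliptic.P256(), rand.Reader)
	if err != nil {
		panic(err)
	}
	serial, _ := rand.Int(rand.Reader, big.NewInt(1<<62)) // random serial per issuer
	tmpl := &x509.Certificate{
		SerialNumber: serial, Subject: pkix.Name{CommonName: cn},
		NotBefore: time.Now().Add(-time.Hour), NotAfter: time.Now().Add(24 * time.Hour),
		KeyUsage: x509.KeyUsageDigitalSignature, BasicConstraintsValid: true, IsCA: isCA,
	}
	if isCA {
		tmpl.KeyUsage |= x509.KeyUsageCertSign
	} else {
		tmpl.ExtKeyUsage = []x509.ExtKeyUsage{x509.ExtKeyUsageServerAuth}
		tmpl.DNSNames = []string{cn}
	}
	signer, signerKey := tmpl, key
	if parent != nil {
		signer, signerKey = parent, parentKey
	}
	der, err := x509.CreateCertificate(rand.Reader, tmpl, signer, &key.PublicKey, signerKey)
	if err != nil {
		panic(err)
	}
	cert, _ := x509.ParseCertificate(der) // DER we just produced always parses
	return cert, key
}

// verifyLeaf does what a TLS client does after the handshake: build a path from the
// server's leaf, through the intermediates it presented, to a cert in the trust store.
func verifyLeaf(leaf *x509.Certificate, presented, trustStore []*x509.Certificate) error {
	roots, inters := x509.NewCertPool(), x509.NewCertPool()
	for _, c := range trustStore {
		roots.AddCert(c)
	}
	for _, c := range presented {
		inters.AddCert(c)
	}
	_, err := leaf.Verify(x509.VerifyOptions{
		DNSName: partnerHost, Roots: roots, Intermediates: inters,
		KeyUsages: []x509.ExtKeyUsage{x509.ExtKeyUsageServerAuth},
	})
	return err
}

// IsUnknownAuthority reports whether err is the incident's failure class.
func IsUnknownAuthority(err error) bool {
	var uae x509.UnknownAuthorityError
	return errors.As(err, &uae)
}

// Scenario is the outcome of one simulated handshake.
type Scenario struct {
	Name string
	Err  error
}

// RunScenarios replays the handshake under each trust-store configuration.
func RunScenarios() []Scenario {
	root, rootKey := makeCert("Constructed Legacy Root CA", true, nil, nil)
	inter, interKey := makeCert("Constructed Legacy Intermediate CA", true, root, rootKey)
	leaf, _ := makeCert(partnerHost, false, inter, interKey)
	// Assumption: the partner's servers send only the leaf, so the path only
	// completes when the client already trusts the intermediate.
	leafOnly, fullChain := []*x509.Certificate{}, []*x509.Certificate{inter}
	store18 := []*x509.Certificate{root, inter} // tls-core 1.8 bundled store (stand-in)
	store20 := []*x509.Certificate{root}        // tls-core 2.0 dropped the legacy intermediate
	return []Scenario{
		{"tls-core 1.8 default store, leaf-only server", verifyLeaf(leaf, leafOnly, store18)},
		{"tls-core 2.0 default store, leaf-only server", verifyLeaf(leaf, leafOnly, store20)},
		{"tls-core 2.0 + app-supplied intermediate (mitigation)", verifyLeaf(leaf, leafOnly, append(store20, inter))},
		{"tls-core 2.0, partner sends full chain (partner-side fix)", verifyLeaf(leaf, fullChain, store20)},
	}
}

func main() {
	for _, s := range RunScenarios() {
		fmt.Printf("%-58s -> %v\n", s.Name, s.Err)
	}
}
```

Run it with `go run`; the second row reproduces the incident error and the third row is the client-side mitigation: an explicit, version-controlled trust bundle that the application owns, so a transitive upgrade of the TLS library can no longer silently change which CAs are trusted. The fourth row shows that a partner presenting its full chain is unaffected by the same upgrade, which is the kind of observation that narrows the blast radius to one host during triage. Pinning `tls-core` back to 1.8 in the lockfile is the faster rollback; the explicit bundle is what prevents a repeat.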
