Question
In one newly added region, a fraction of TCP connections to an internal service establish but then hang and time out, while the same service is rock-solid in every other region. Captures on the server show SYN received and SYN-ACK sent, but it never sees the client's ACK or subsequent data for the hung connections — the handshake doesn't complete from the server's view. Captures on the client show it sent the ACK. The region was stood up last week with a new network design that added a second egress path / additional gateways and a stateful firewall. Some connections work; others hang. Triage, mitigate, and explain the root cause.
Stop the bleeding first (mitigate), then form hypotheses from real signals. Separate root cause from symptom, communicate status as you go, and close with what prevents a repeat.