Why a 3rd-party analytics API going down can take your entire checkout page with it.
Your app likely calls external APIs (Stripe, Twilio, an analytics service). If one of them fails fast with an HTTP 500 error, your app can show an error message and move on. That case is easy.
The deadly scenario is a slow dependency. If the vendor API starts taking 60 seconds to respond, your web server holds each connection open, waiting. With a fixed-size worker pool, every worker thread soon ends up blocked on the vendor. Now your server cannot serve any requests, even healthy ones that never touch that vendor. You must configure aggressive timeouts and fallbacks for all external network calls.
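The starvation effect described above can be reproduced without any network at all. The following is a minimal simulation, not a real server: the pool size, the delays, and the names `call_vendor` and `healthy_request_latency` are all made up for illustration, and `time.sleep` stands in for the slow vendor.

```python
import time
from concurrent.futures import ThreadPoolExecutor

WORKERS = 2          # hypothetical pool size, tiny on purpose
VENDOR_DELAY = 0.4   # seconds the simulated vendor takes to answer


def call_vendor(timeout=None):
    """Simulated vendor call: blocks for VENDOR_DELAY, or gives up at timeout."""
    if timeout is not None and timeout < VENDOR_DELAY:
        time.sleep(timeout)   # wait only as long as the timeout allows
        return "fallback"
    time.sleep(VENDOR_DELAY)  # no timeout: wait for the full vendor delay
    return "vendor data"


def healthy_request_latency(vendor_timeout=None):
    """Fill the pool with vendor calls, then time a request that needs no vendor."""
    with ThreadPoolExecutor(max_workers=WORKERS) as pool:
        for _ in range(WORKERS):
            pool.submit(call_vendor, vendor_timeout)
        start = time.monotonic()
        # This task does no vendor work, but it must wait for a free worker.
        pool.submit(lambda: "ok").result()
        return time.monotonic() - start


blocked = healthy_request_latency()                        # vendor calls have no timeout
protected = healthy_request_latency(vendor_timeout=0.05)   # vendor calls give up after 50 ms
print(f"no timeout: {blocked:.2f}s, with timeout: {protected:.2f}s")
```

The healthy request does nothing slow itself; its latency is set entirely by how long the vendor calls are allowed to hold the workers.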
import requests

# BAD: Default or missing timeouts
# The requests library sets no timeout by default, so it can wait FOREVER!
# If the vendor hangs, this thread is stuck for as long as the connection stays open.
response = requests.get("https://api.vendor.com/data")
# GOOD: Strict timeouts
try:
    # Fail fast: give up if connecting, or waiting for the next bytes, takes more than 1 second.
    response = requests.get("https://api.vendor.com/data", timeout=1.0)
    data = response.json()
except requests.exceptions.Timeout:
    # Fallback: Serve stale data from cache, or disable the feature.
    # We sacrificed the feature, but SAVED THE SERVER.
    data = {"status": "degraded_mode"}
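The "serve stale data from cache" fallback mentioned in the comment can be sketched as below. This is one possible shape under stated assumptions, not a definitive implementation: the in-process `_cache` dict, the `fetch_with_fallback` helper, and the 300-second staleness limit are all hypothetical, and the helper takes the network call as a function so it does not depend on any particular HTTP library.

```python
import time
from typing import Any, Callable, Dict, Tuple, Type

# Hypothetical in-process cache: key -> (time stored, value).
_cache: Dict[str, Tuple[float, Any]] = {}

DEGRADED = {"status": "degraded_mode"}


def fetch_with_fallback(
    key: str,
    fetch: Callable[[], Any],
    errors: Tuple[Type[BaseException], ...] = (TimeoutError,),
    max_stale_seconds: float = 300.0,
) -> Any:
    """Return fresh data if fetch() succeeds, else stale data, else DEGRADED."""
    try:
        value = fetch()
    except errors:
        cached = _cache.get(key)
        if cached is not None and time.monotonic() - cached[0] <= max_stale_seconds:
            return cached[1]       # stale, but recent enough to serve
        return dict(DEGRADED)      # nothing usable cached: disable the feature
    # Success: remember the value so a later failure can fall back to it.
    _cache[key] = (time.monotonic(), value)
    return value
```

With the requests call from the example above, this would be used roughly as `fetch_with_fallback("vendor_data", lambda: requests.get("https://api.vendor.com/data", timeout=1.0).json(), errors=(requests.exceptions.Timeout,))`. Note that `requests.exceptions.Timeout` is not a subclass of the built-in `TimeoutError`, which is why the exception types are passed in explicitly.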