Question
PagerDuty fires: Postgres is logging `WARNING: database "prod" must be vacuumed within 8400000 transactions`. `SELECT datname, age(datfrozenxid) FROM pg_database` shows your main DB at an XID age of 1.94 billion and climbing. A few tables are sitting in the high hundreds of millions on `age(relfrozenxid)`. There's a huge append-only `events` table that nobody ever updates or deletes, and a couple of tables owned by a service that runs hours-long transactions. Autovacuum is enabled with defaults. If the rate holds, you're roughly 6 hours from the 2-billion wraparound shutdown threshold. Triage and mitigate.
Stop the bleeding first (mitigate), then form hypotheses from real signals. Separate root cause from symptom, communicate status as you go, and close with what prevents a repeat.
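
The "roughly 6 hours" figure in the scenario can be sanity-checked with simple arithmetic. The following is a minimal Python sketch, assuming the question's round 2-billion threshold and a constant XID consumption rate; the function and constant names are illustrative and not part of any Postgres tooling.

```python
def hours_until_threshold(current_age: int, threshold: int, xids_per_hour: float) -> float:
    """Hours left before the XID age reaches the threshold at a constant rate."""
    if xids_per_hour <= 0:
        raise ValueError("xids_per_hour must be positive")
    return max(threshold - current_age, 0) / xids_per_hour


# Figures taken from the scenario; the threshold is the question's round number.
CURRENT_AGE = 1_940_000_000
THRESHOLD = 2_000_000_000
HOURS_LEFT = 6

# Consumption rate implied by "roughly 6 hours" of remaining headroom.
headroom = THRESHOLD - CURRENT_AGE
implied_rate = headroom / HOURS_LEFT

print(f"headroom: {headroom:,} XIDs")
print(f"implied burn rate: {implied_rate:,.0f} XIDs/hour")
print(f"hours left at that rate: "
      f"{hours_until_threshold(CURRENT_AGE, THRESHOLD, implied_rate):.1f}")
```

In a real incident the rate would come from sampling `age(datfrozenxid)` twice a few minutes apart rather than from an assumed figure, since anything that slows XID consumption changes the deadline.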