Question
RabbitMQ setup: a `payments` work queue dead-letters (via DLX) to a `payments-retry` queue with a message TTL of 60s, and that retry queue dead-letters *back* to `payments` for another attempt (a delayed-retry pattern). At 08:00, broker CPU and message rates spike: the `payments-retry` queue depth oscillates around 80k, the `payments` queue shows a huge `redelivered` rate, and the broker's overall publish+deliver throughput is ~10x normal with no increase in real business traffic. Recent context: a downstream payment processor started returning errors at 07:45 for ~30% of charges. Triage and explain why the broker is melting, and how you mitigate.
Stop the bleeding first (mitigate), then form hypotheses from real signals. Separate root cause from symptom, communicate status as you go, and close with what prevents a repeat.