Question
Your API talks to Postgres through PgBouncer in transaction pooling mode, with a small server-connection pool. Most queries are fast (1-5ms). A new reporting endpoint runs a heavy 2-8 second analytical query against the same database and the same PgBouncer pool. After it ships, you see correlated multi-second latency spikes on completely unrelated fast endpoints, and PgBouncer's 'client waiting' (cl_waiting) count climbs during the spikes. The database itself isn't CPU-bound and the fast queries are still individually fast when they finally run. How do you triage and mitigate?
Stop the bleeding first (mitigate), then form hypotheses from real signals. Separate root cause from symptom, communicate status as you go, and close with what prevents a repeat.