Code Room
On-callHardoc-g232
Subject Query plan regressionLevel Senior–Staff~35 minCommon in Databases & SQL · Distributed systems interviewsIndustries Technology, Software development

Question

Postgres with a `metrics` table range-partitioned by `event_date` into daily partitions (2 years of data). After a deploy that changed an ORM, a dashboard query that filters `WHERE event_date >= now() - interval '7 days'` went from 80 ms to 12 s. `EXPLAIN` shows it now scanning ALL ~730 partitions instead of the last 7. The query text looks logically the same. CPU is fine; it's pure work amplification. Triage, fix now, and prevent.

What a strong answer looks like

Stop the bleeding first (mitigate), then form hypotheses from real signals. Separate root cause from symptom, communicate status as you go, and close with what prevents a repeat.

Diagram & narrate the incident
Loading whiteboard…
Run or narrate your approach, then ask the coach.