Code Room
On-call · Hard
Question
Your events table is range-partitioned by day across ~400 daily partitions. A dashboard query that filters `WHERE event_time >= now() - interval '1 day'` used to scan one or two partitions and return in 150ms; since a 'small refactor' deploy it's gotten slow and the DB is reading far more than before. `EXPLAIN ANALYZE` now shows the query scanning ALL 400 partitions instead of pruning to the recent ones. The data, indexes, and stats are unchanged. The refactor wrapped the time filter in a helper that computes the cutoff. Triage and mitigate.
What a strong answer looks like
Stop the bleeding first (mitigate), then form hypotheses from real signals. Separate root cause from symptom, communicate status as you go, and close with what prevents a repeat.
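One way that triage might play out is sketched below in SQL, assuming PostgreSQL 11 or later with declarative partitioning. The `events` table and `event_time` column come from the question; the helper name `recent_cutoff()` and its PL/pgSQL body are hypothetical stand-ins for whatever the refactor introduced, and `event_time` is assumed to be `timestamptz`. The hypothesis being tested is that the helper was created without a volatility label, so it defaults to VOLATILE and can no longer be used for partition pruning. Other causes, such as a type mismatch or the column itself being wrapped in a function, would need different checks.

```sql
-- Sketch only. recent_cutoff() is a hypothetical name for the refactor's helper.
-- Run the reproduction in a scratch database that has the events table.

-- Reproduction: a PL/pgSQL helper with no volatility label is VOLATILE by
-- default, and VOLATILE expressions are not used for partition pruning.
CREATE FUNCTION recent_cutoff() RETURNS timestamptz
LANGUAGE plpgsql
AS $$
BEGIN
    RETURN now() - interval '1 day';
END
$$;

-- Mitigate first: roll back the deploy, or put the original expression back
-- in the dashboard query so pruning resumes while the investigation continues.
SELECT count(*)
FROM events
WHERE event_time >= now() - interval '1 day';

-- Form the hypothesis from a real signal: check the helper's volatility label.
-- provolatile is 'v' (volatile), 's' (stable), or 'i' (immutable).
SELECT proname, provolatile
FROM pg_proc
WHERE proname = 'recent_cutoff';

-- Candidate root-cause fix: label the helper STABLE, the same category as now().
ALTER FUNCTION recent_cutoff() STABLE;

-- Verify: the plan should again touch only the recent partitions. With a
-- STABLE cutoff, pruning happens at executor startup, which EXPLAIN reports
-- as a "Subplans Removed" line under the Append node.
EXPLAIN (ANALYZE, BUFFERS)
SELECT count(*)
FROM events
WHERE event_time >= recent_cutoff();
```

Relabeling the helper as STABLE is only safe if it really does nothing beyond reading `now()`; if it writes data or depends on anything that can change within a statement, rolling back is the safer mitigation. To prevent a repeat, a plan check on the dashboard query in CI (for example, asserting how many partitions the plan scans) would catch this class of regression before deploy.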
Run or narrate your approach, then ask the coach.