Code Room
On-call · Medium
Question
An orders API (Ruby on Rails, Postgres) has degraded gradually over two weeks, with no single bad deploy to blame. p99 latency on GET /orders crept from 120 ms to 3 s. The slow-query log now shows the order-history query (filter by `customer_id`, order by `created_at`) doing a sequential scan over the orders table, which has grown from 2M to 30M rows after a successful growth quarter. `EXPLAIN ANALYZE` confirms a Seq Scan followed by a sort. The table has a primary key on `id` but no other indexes. Triage and mitigate.
What a strong answer looks like
Stop the bleeding first (mitigate), then form hypotheses from real signals. Separate root cause from symptom, communicate status as you go, and close with what prevents a repeat.
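One way the mitigation could look in Postgres is sketched below. It is an illustration under assumptions, not the only valid answer. The table and column names come from the question. The index name, the sample customer id, the `DESC` direction, and the `LIMIT` are hypothetical stand-ins for whatever the real query uses.

```sql
-- 1. Reproduce the bad plan for one representative customer.
--    (customer id 42, DESC, and LIMIT 50 are hypothetical.)
EXPLAIN (ANALYZE, BUFFERS)
SELECT *
FROM orders
WHERE customer_id = 42
ORDER BY created_at DESC
LIMIT 50;

-- 2. Build a composite index matching the filter and the sort.
--    CONCURRENTLY avoids blocking writes on the 30M-row table while it builds.
--    It cannot run inside a transaction block and is slower than a plain build.
CREATE INDEX CONCURRENTLY idx_orders_customer_id_created_at
    ON orders (customer_id, created_at);

-- 3. An interrupted concurrent build leaves an INVALID index behind.
--    Check before trusting it; if indisvalid is false, drop it and retry.
SELECT c.relname, i.indisvalid
FROM pg_index i
JOIN pg_class c ON c.oid = i.indexrelid
WHERE c.relname = 'idx_orders_customer_id_created_at';

-- 4. Refresh planner statistics, then re-run the EXPLAIN from step 1.
--    The Seq Scan + Sort should give way to an index scan.
ANALYZE orders;
```

A B-tree index on `(customer_id, created_at)` can be scanned in either direction, so it should serve both ascending and descending order within one customer. In a Rails codebase the same change would normally ship as a migration using `add_index` with `algorithm: :concurrently` and `disable_ddl_transaction!`. To prevent a repeat, alert on p99 latency and review slow-query logs as tables grow.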