On-callMediumoc-g616

Subject Database disk saturationLevel Mid–Senior~35 minCommon in Databases & SQL · Storage & CDN interviewsIndustries Technology

Question

Alert at 21:30: "DB host disk > 92% and climbing ~1%/min." The Postgres primary is on a 500GB volume. Writes are slowing and you're minutes from the disk filling, at which point Postgres will refuse writes and could crash. The disk dashboard shows the `pg_wal` directory growing fast — it's now 180GB, far above normal. Recent context: a replica went offline for maintenance 3 hours ago and hasn't reconnected, and someone set up a logical replication slot last week for a new analytics pipeline that's been failing silently. Table sizes themselves look normal. Triage and act fast.

What a strong answer looks like

Stop the bleeding first (mitigate), then form hypotheses from real signals. Separate root cause from symptom, communicate status as you go, and close with what prevents a repeat.

Learn the concepts

Diagram & narrate the incident

Loading whiteboard…

Run or narrate your approach, then ask the coach.