On-call | Medium | oc-g579

Subject: Disk filling
Level: Mid–Senior
~30 min
Common in Storage & CDN interviews
Industries: Technology

Question

PagerDuty fires DiskUsageHigh on three of your eight primary Postgres nodes: the data volume is at 91% and climbing ~1.5%/hour. The capacity dashboard shows free space dropping in a straight line that started ~6 hours ago; throughput and connection count look normal. A feature flag enabling verbose per-request audit logging to a table was rolled out yesterday afternoon, and a nightly VACUUM job last completed two nights ago. If the volume hits 100%, Postgres will refuse writes and the app goes read-only. Walk me through how you keep the database writable in the next hour and then fix it durably.

What a strong answer looks like

Stop the bleeding first (mitigate), then form hypotheses from real signals. Separate root cause from symptom, communicate status as you go, and close with what prevents a repeat.
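Before choosing a mitigation, it helps to turn the question's numbers into a time budget. The sketch below does that arithmetic for a volume at 91% growing about 1.5% per hour. The function name `hours_until` and the 95% "act by" threshold are illustrative assumptions, not part of the question, and the straight-line extrapolation only holds while the growth rate stays constant.

```python
def hours_until(threshold_pct: float, current_pct: float, growth_pct_per_hour: float) -> float:
    """Linear estimate of hours until disk usage reaches threshold_pct.

    Assumes a constant growth rate, which matches the straight-line
    drop in free space described in the question.
    """
    if growth_pct_per_hour <= 0:
        # Usage is flat or shrinking, so the threshold is never reached.
        return float("inf")
    # Clamp at zero in case the threshold has already been crossed.
    return max(0.0, (threshold_pct - current_pct) / growth_pct_per_hour)


# Figures taken from the question: 91% used, climbing ~1.5% per hour.
CURRENT_PCT = 91.0
GROWTH_PCT_PER_HOUR = 1.5

hours_to_full = hours_until(100.0, CURRENT_PCT, GROWTH_PCT_PER_HOUR)
# 95% is a hypothetical "must have mitigated by here" line, chosen for illustration.
hours_to_act = hours_until(95.0, CURRENT_PCT, GROWTH_PCT_PER_HOUR)

print(f"Hours until 100%: {hours_to_full:.1f}")  # prints 6.0
print(f"Hours until 95%:  {hours_to_act:.1f}")
```

At the stated rate the volume fills in roughly six hours, so there is time for a deliberate mitigation within the hour, but not enough to wait for the next nightly maintenance window.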
