Snapshots are cheap to take but they keep old blocks alive — take enough and never prune, and they quietly eat the disk until writes start failing.
A copy-on-write snapshot captures a point-in-time view almost for free: it shares every block with the live volume and only diverges as the live data changes. The catch is that each retained snapshot pins the old version of every block the live volume later rewrites, so those old blocks cannot be freed until every snapshot that references them is dropped.
Take snapshots frequently and never prune them, and disk usage grows with churn × retention rather than with the size of the live data, which can sit flat. When free space drops below a threshold the system hits disk pressure: writes slow, then fail; the database may flip to read-only; and the node may be evicted. The fix is a retention policy: prune or merge old snapshots, and alert on free space and snapshot count, not just live-data size.
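The alerting advice above can be sketched as a small check. This is a minimal illustration, not a real monitoring API: `VolumeStats`, `snapshot_alerts`, and every threshold default are hypothetical names and values chosen for the example, and the numbers would normally come from your storage system's own accounting.

```python
from dataclasses import dataclass


@dataclass
class VolumeStats:
    """Hypothetical point-in-time reading of a snapshotted volume."""
    total_gb: float
    used_gb: float        # live data plus snapshot-pinned blocks
    live_gb: float        # live data only
    snapshot_count: int


def snapshot_alerts(stats, min_free_ratio=0.20, max_snapshots=720,
                    max_pinned_ratio=1.0):
    """Return alert messages; an empty list means nothing tripped.

    All three thresholds are illustrative assumptions, not recommendations.
    """
    alerts = []

    # Free space is the signal that actually precedes failed writes.
    free_ratio = (stats.total_gb - stats.used_gb) / stats.total_gb
    if free_ratio < min_free_ratio:
        alerts.append(f"low free space: {free_ratio:.0%} free")

    # A count that only ever climbs suggests pruning is not running.
    if stats.snapshot_count > max_snapshots:
        alerts.append(f"too many snapshots: {stats.snapshot_count}")

    # Used space far above live-data size points at pinned old blocks.
    pinned_gb = stats.used_gb - stats.live_gb
    if pinned_gb > max_pinned_ratio * stats.live_gb:
        alerts.append(f"snapshots pin {pinned_gb:.0f} GB "
                      f"on {stats.live_gb:.0f} GB of live data")
    return alerts


if __name__ == "__main__":
    reading = VolumeStats(total_gb=256, used_gb=250, live_gb=100,
                          snapshot_count=720)
    for message in snapshot_alerts(reading):
        print(message)
```

Watching live-data size alone would report this volume as healthy, which is why the check compares used space against both capacity and live data.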
A copy-on-write write never overwrites a block that a snapshot still references. Instead it copies the old block aside (the snapshot keeps pointing at the copy) and writes the new data to a fresh block. A retention loop reclaims space in two steps: it drops snapshots older than the retention age, then frees only those pinned blocks that no remaining snapshot still references.

    // copy-on-write: never clobber a block a snapshot still needs
    function cow_write(volume, block_id, new_data):
        if any_snapshot_references(block_id):            // shared with a snapshot
            old = volume.blocks[block_id]
            copy = allocate_new_block()                  // pins old version on disk
            copy.data = old.data
            for snap in snapshots_referencing(block_id):
                snap.remap(block_id -> copy)             // snapshot keeps the old view
        volume.blocks[block_id] = write(new_data)        // live volume diverges

    // retention: prune old snapshots, free only un-shared blocks
    function prune(snapshots, max_age_days):
        for snap in snapshots:
            if snap.age > max_age_days:
                drop(snap)
        for blk in pinned_blocks():
            if not any_snapshot_references(blk):         // nothing left needs it
                free(blk)                                // space finally returns

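For readers who want to run the idea, here is a small in-memory Python model of the pseudocode above. It is a sketch under stated assumptions, not how any particular storage system is implemented: `CowVolume`, `Snapshot`, and their methods are hypothetical names, a snapshot copies the whole block map (real systems share metadata to make creation O(1)), and the live block is overwritten in place once its old version has been copied aside.

```python
import itertools
from dataclasses import dataclass


@dataclass
class Snapshot:
    age_days: int
    block_map: dict  # logical block id -> physical block id


class CowVolume:
    """Toy copy-on-write volume; names and layout are illustrative only."""

    def __init__(self):
        self.store = {}       # physical block id -> data
        self.live = {}        # logical block id -> physical block id
        self.snapshots = []
        self._ids = itertools.count()

    def _alloc(self, data):
        pid = next(self._ids)
        self.store[pid] = data
        return pid

    def snapshot(self, age_days=0):
        # Simplification: copying the map is O(n); real snapshots share it.
        snap = Snapshot(age_days, dict(self.live))
        self.snapshots.append(snap)
        return snap

    def write(self, logical, data):
        pid = self.live.get(logical)
        if pid is None:                      # first write to this block
            self.live[logical] = self._alloc(data)
            return
        sharers = [s for s in self.snapshots
                   if s.block_map.get(logical) == pid]
        if sharers:
            # Copy the old version aside; this copy is the pinned space.
            copy = self._alloc(self.store[pid])
            for snap in sharers:
                snap.block_map[logical] = copy
        self.store[pid] = data               # live volume diverges

    def prune(self, max_age_days):
        """Drop old snapshots, then free blocks nothing references."""
        self.snapshots = [s for s in self.snapshots
                          if s.age_days <= max_age_days]
        referenced = set(self.live.values())
        for snap in self.snapshots:
            referenced.update(snap.block_map.values())
        freed = [pid for pid in self.store if pid not in referenced]
        for pid in freed:
            del self.store[pid]
        return len(freed)


if __name__ == "__main__":
    vol = CowVolume()
    vol.write(0, "a")
    vol.write(1, "b")
    old = vol.snapshot(age_days=40)
    vol.write(0, "a2")                       # pins "a" for the old snapshot
    vol.snapshot(age_days=1)
    vol.write(0, "a3")                       # pins "a2" for the new snapshot
    print("blocks before prune:", len(vol.store))
    print("blocks freed:", vol.prune(max_age_days=30))
```

In this run the volume holds two logical blocks but four physical ones before pruning. Dropping the 40-day-old snapshot frees only the one block it alone pinned, because the block it shares with the live volume and the newer snapshot is still needed.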
| Aspect | Cost | Signal to watch |
|---|---|---|
| Snapshot creation | O(1) — just a new reference, no data copied | Snapshot count climbing without bound |
| Space | Grows with churn × retention, not live-data size | Used space far above live-data size |
| Performance | Copy-on-write write amplification and fragmentation | Write latency creeping up over time |
| Deletion | Pruning frees only blocks no remaining snapshot shares | Reclaimed space far below the snapshot’s logical size |
| Recovery time | Point-in-time restore is fast, but each kept point costs space | Retention depth vs free-space headroom |
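The Deletion row is the one that most often surprises people, so here is a hedged sketch of the accounting behind it. It assumes each snapshot can be described as a plain set of physical block ids; `reclaimable_blocks` is a hypothetical helper, not a function from any real storage tool.

```python
def reclaimable_blocks(target, other_snapshots, live_blocks):
    """Blocks freed by deleting `target`: those nothing else references.

    target          -- set of physical block ids the snapshot references
    other_snapshots -- iterable of block-id sets for the snapshots kept
    live_blocks     -- set of block ids the live volume references
    """
    still_needed = set(live_blocks)
    for snap in other_snapshots:
        still_needed |= set(snap)
    return set(target) - still_needed


if __name__ == "__main__":
    oldest = {1, 2, 3, 4}      # logical size: four blocks
    newer = {2, 3, 4, 5}       # shares three of them
    live = {4, 5, 6}
    freed = reclaimable_blocks(oldest, [newer], live)
    print(f"deleting the oldest snapshot frees {len(freed)} of "
          f"{len(oldest)} blocks")
```

Here the oldest snapshot references four blocks, yet deleting it frees only one, because the other three are still shared with the newer snapshot or the live volume.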
Take a 100 GB volume with 5% daily churn and an hourly snapshot kept for 30 days. Each day rewrites about 5 GB of blocks. With a snapshot taken every hour, nearly every rewrite hits a block that some snapshot still shares and so pins the old version, which means roughly 5 GB of new pinned space accrues per day on top of the steady 100 GB of live data. Over the 30-day retention window that is about 30 × 5 = 150 GB of snapshot-pinned blocks, for about 250 GB used in total while the live data never leaves 100 GB.
On a 256 GB volume, unpruned growth crosses the warning watermark within roughly two weeks (about 170 GB used by day 14) and fills the disk just after day 31. That is exactly the curve in the animation: the live-data band stays flat while the snapshot band climbs tick by tick and tips the bar from healthy green to warning warm and into disk pressure, until a prune drops the oldest snapshots and their uniquely pinned blocks finally free.
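The arithmetic of this worked example can be checked with a short linear model. It is a sketch under the same simplifying assumption as the text, namely that every rewritten block pins one old version; the function names and the `retention_days` switch are hypothetical, and real usage will usually sit somewhat below this upper bound.

```python
from typing import Optional


def projected_usage_gb(day: int, live_gb: float = 100.0,
                       churn_gb_per_day: float = 5.0,
                       retention_days: Optional[int] = None) -> float:
    """Used space after `day` days of snapshots.

    With retention_days=None nothing is ever pruned; otherwise only the
    last `retention_days` days of churn stay pinned.
    """
    pinned_days = day if retention_days is None else min(day, retention_days)
    return live_gb + churn_gb_per_day * pinned_days


def first_day_reaching(threshold_gb: float, max_days: int = 3650,
                       **model) -> Optional[int]:
    """First whole day on which usage reaches the threshold, else None."""
    for day in range(max_days + 1):
        if projected_usage_gb(day, **model) >= threshold_gb:
            return day
    return None


if __name__ == "__main__":
    print("day 30 usage (GB):", projected_usage_gb(30))
    print("disk full, no pruning:", first_day_reaching(256.0))
    print("disk full, 30-day retention:",
          first_day_reaching(256.0, retention_days=30))
```

Under these assumptions usage is 250 GB on day 30, stays below 256 GB through day 31, and exceeds it on day 32 if nothing is pruned. With 30-day retention enforced it plateaus at 250 GB and the model never reaches capacity, which is the whole argument for pruning.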
Live data is steady at 100 GB, but the disk keeps filling toward full. What is the most likely cause?
You delete the oldest snapshot to recover space, but barely any frees. Why?