Temp file leaks

A temp file is a guest, not a tenant — if nobody shows it the door, it stays forever.

The idea

Lots of jobs write a scratch file: an upload buffer, a resized image, a render of a PDF. The plan is always “use it, then delete it.” But when a request throws an exception between creating the file and deleting it, the delete never runs. The file is orphaned on disk.

One orphan is harmless. A leak is the same bug repeating thousands of times a day. Disk usage creeps up, inodes run out, and one morning the partition is full — and the service that mattered can no longer write. The fix is to make cleanup guaranteed, not best-effort: tie the file's lifetime to a scope that always unwinds.

How it works

The leaky version deletes only on the happy path, so any exception skips the delete. The robust version binds the temp file to a scope that the language guarantees will unwind — a with block, try/finally, or a self-deleting temp handle. When the scope exits, normally or by exception, the file is removed.

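# process(req, path) stands for the application's own work on the scratch file.
import os
import tempfile

# Leaky: delete only runs if nothing throws
def handle(req):
    fd, path = tempfile.mkstemp()
    os.close(fd)         # only the path is needed; process() opens the file itself
    process(req, path)   # raises? path is orphaned
    os.remove(path)      # never reached on error

# Robust: cleanup is tied to scope exit
def handle(req):
    fd, path = tempfile.mkstemp()
    os.close(fd)         # release the descriptor before doing any work
    try:
        process(req, path)
    finally:
        os.remove(path)  # runs on success AND on exception

# Even simpler: a self-deleting temp file
def handle(req):
    with tempfile.NamedTemporaryFile() as tmp:
        # Note: on Windows, reopening tmp.name while tmp is still open may fail
        process(req, tmp.name)
    # file is removed when the block exits, normally or by exception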
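
To see the difference in numbers, here is a small self-contained sketch that runs eight requests through each handler and counts what is left on disk. The process function, the handle_leaky and handle_robust names, and the rule that every third request fails are all assumptions made for the demo, not part of any real service.

```python
import os
import tempfile


def process(req, path):
    """Stand-in for real work: writes the scratch file, fails on some requests."""
    with open(path, "w") as f:
        f.write(f"scratch for request {req}")
    if req % 3 == 0:                      # assumption: every third request fails
        raise ValueError(f"request {req} failed mid-job")


def handle_leaky(req, scratch_dir):
    fd, path = tempfile.mkstemp(dir=scratch_dir)
    os.close(fd)
    process(req, path)
    os.remove(path)                       # skipped when process() raises


def handle_robust(req, scratch_dir):
    fd, path = tempfile.mkstemp(dir=scratch_dir)
    os.close(fd)
    try:
        process(req, path)
    finally:
        os.remove(path)                   # runs on success and on exception


def count_orphans(handler, n_requests=8):
    """Run n_requests through handler and return how many files are left behind."""
    with tempfile.TemporaryDirectory() as scratch_dir:
        for req in range(1, n_requests + 1):
            try:
                handler(req, scratch_dir)
            except ValueError:
                pass                      # a real caller would log the error and move on
        return len(os.listdir(scratch_dir))


if __name__ == "__main__":
    print("leaky orphans: ", count_orphans(handle_leaky))
    print("robust orphans:", count_orphans(handle_robust))
```

With eight requests and failures on requests 3 and 6, the leaky handler leaves two files behind and the robust one leaves none.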

A reaper job (delete temp files older than N hours) is a good safety net, but it is a backstop, not the primary fix.
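
A reaper along those lines might look like the sketch below. The reap function name and its parameters are hypothetical, and it assumes all scratch files live directly in one directory. It uses last-modified time as a rough proxy for "still in use", which is a heuristic: a job that keeps a file open without writing to it for longer than the cutoff would still lose the file.

```python
import os
import tempfile
import time


def reap(scratch_dir, max_age_hours, now=None):
    """Delete regular files in scratch_dir not modified for max_age_hours.

    Returns the list of removed paths.
    """
    now = time.time() if now is None else now
    cutoff = now - max_age_hours * 3600
    removed = []
    with os.scandir(scratch_dir) as entries:
        for entry in entries:
            if not entry.is_file(follow_symlinks=False):
                continue                  # skip subdirectories and symlinks
            try:
                if entry.stat(follow_symlinks=False).st_mtime < cutoff:
                    os.remove(entry.path)
                    removed.append(entry.path)
            except FileNotFoundError:
                pass                      # the owning job removed it first
    return removed


if __name__ == "__main__":
    # Demo against a throwaway directory: one stale file, one fresh file.
    with tempfile.TemporaryDirectory() as d:
        stale_path = os.path.join(d, "stale.tmp")
        fresh_path = os.path.join(d, "fresh.tmp")
        for p in (stale_path, fresh_path):
            open(p, "w").close()
        five_hours_ago = time.time() - 5 * 3600
        os.utime(stale_path, (five_hours_ago, five_hours_ago))
        print("removed:", reap(d, max_age_hours=2))
```

Choosing a cutoff well above the longest expected job duration keeps the reaper from racing with live work.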

Signals

Symptom	What it usually means
Disk usage rises monotonically	Temp files created but not removed
“No space left” with small real data	Orphans filling the partition
“No space left” though df shows free bytes	Inodes exhausted by many tiny orphaned files (check df -i)
Leak rate tracks the error rate	Cleanup is on the happy path only
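
The first three signals can be checked from code as well as from the shell. The sketch below reports the fraction of bytes and of inodes in use on the filesystem that holds a given path. The scratch_usage name is hypothetical, os.statvfs is available on Unix only, and some filesystems report no inode totals, which the sketch returns as None.

```python
import os
import shutil
import tempfile


def scratch_usage(path):
    """Return (byte_fraction_used, inode_fraction_used) for the filesystem holding path."""
    usage = shutil.disk_usage(path)
    byte_frac = usage.used / usage.total if usage.total else 0.0

    st = os.statvfs(path)                 # Unix only
    if st.f_files == 0:                   # filesystem does not report inode counts
        inode_frac = None
    else:
        inode_frac = (st.f_files - st.f_ffree) / st.f_files
    return byte_frac, inode_frac


if __name__ == "__main__":
    bytes_used, inodes_used = scratch_usage(tempfile.gettempdir())
    print(f"bytes used:  {bytes_used:.1%}")
    if inodes_used is not None:
        print(f"inodes used: {inodes_used:.1%}")
```

Sampling these two numbers on a schedule and plotting them next to the error rate makes the last signal in the table easy to spot.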

Watch out for

Deleting only after success. Errors are exactly when the delete gets skipped, and that is how the leak starts.
Putting os.remove at the end of the function instead of in finally — an early return or raise jumps right over it.
Reusing a fixed temp name like /tmp/work.tmp — concurrent requests stomp each other and you get corruption on top of the leak.
Trusting the OS to clear /tmp on reboot. Long-running servers may not reboot for months, and the disk fills first.
A reaper that deletes by age without checking that the file is no longer in use. It may yank a file a slow job is still writing.
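
Several of these pitfalls can be avoided with one small helper. The sketch below wraps tempfile.mkstemp in a context manager so that every caller gets a unique name and the file is removed on early return, on exception, and on normal exit. The scratch_path and handle names are hypothetical, and handle returns the path only so the caller can confirm that the file is gone.

```python
import contextlib
import os
import tempfile


@contextlib.contextmanager
def scratch_path(suffix=""):
    """Yield a unique temp file path and remove the file when the block exits."""
    fd, path = tempfile.mkstemp(suffix=suffix)   # unique name per call, no fixed /tmp name
    os.close(fd)
    try:
        yield path
    finally:
        with contextlib.suppress(FileNotFoundError):
            os.remove(path)


def handle(data):
    with scratch_path(suffix=".tmp") as path:
        with open(path, "wb") as f:
            f.write(data)
        if not data:
            return None, path             # early return: cleanup still runs
        return len(data), path


if __name__ == "__main__":
    size, path = handle(b"abc")
    print(size, os.path.exists(path))
```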

Worked example

An image service writes a temp file per upload and deletes it after a successful resize. The resize throws on corrupt images about 2% of the time. At 100,000 uploads/day that's ~2,000 orphans/day, each a few hundred KB. Assuming 300 to 500 KB each, that is roughly 0.6 to 1 GB of orphans per day, so the 50 GB scratch volume fills within two to three months and all uploads start failing, even the valid ones. Wrapping the resize in try/finally so the file is always removed drops the leak to zero overnight.
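
The fill time follows directly from these numbers. Here is a rough calculation, with the per-orphan size as an assumption (the example says only "a few hundred KB") and decimal units throughout. The days_to_fill name is hypothetical.

```python
def days_to_fill(uploads_per_day, failure_rate, orphan_kb, volume_gb):
    """Days until orphaned files alone fill the volume (1 GB = 1e6 KB)."""
    orphans_per_day = uploads_per_day * failure_rate
    leaked_gb_per_day = orphans_per_day * orphan_kb / 1e6
    return volume_gb / leaked_gb_per_day


if __name__ == "__main__":
    for kb in (300, 500):                 # assumed sizes for "a few hundred KB"
        days = days_to_fill(100_000, 0.02, kb, 50)
        print(f"{kb} KB per orphan -> about {round(days)} days")
```

At 300 KB per orphan this gives about 83 days, and at 500 KB about 50 days. A volume that also holds live data would fill sooner.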

Check yourself

You move os.remove(path) to the very last line of the function. Is the leak fixed?