Counting how much storage a tenant uses

Keep one honest number — bytes used — and nudge it on every write and delete, so you always know whether the next write still fits.

The idea

A storage quota caps how much a tenant, account, or bucket may hold. The system keeps a running used counter: every write adds the object's size, every delete subtracts it, and an overwrite adjusts by the delta between old and new size.

A write is allowed only when used + size <= limit. The whole scheme depends on that counter staying honest, so the update has to be atomic — otherwise two concurrent writes can step on each other and the number drifts away from reality. A soft limit warns as you approach; a hard limit rejects the write outright.
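
As a minimal sketch of that rule, the check below classifies a proposed write against both limits. The function name `admit`, the `soft_limit` parameter, and the returned strings are illustrative assumptions, not taken from any particular system. It only decides; it does not update the counter, so on its own it is not safe under concurrency.

```python
GB = 1024 ** 3  # sizes are tracked in bytes

def admit(used: int, size: int, limit: int, soft_limit: int) -> str:
    """Classify a proposed write of `size` bytes (hypothetical helper)."""
    after = used + size
    if after > limit:
        return "reject"   # hard limit: the write does not fit
    if after > soft_limit:
        return "warn"     # allowed, but the tenant is approaching the cap
    return "ok"

SOFT = int(7.5 * GB)
print(admit(used=6 * GB, size=1 * GB, limit=10 * GB, soft_limit=SOFT))  # ok
print(admit(used=7 * GB, size=1 * GB, limit=10 * GB, soft_limit=SOFT))  # warn
print(admit(used=9 * GB, size=2 * GB, limit=10 * GB, soft_limit=SOFT))  # reject
```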

How it works

The counter update is the whole game. On a write you reserve space by adding to the counter; on a delete you release it by subtracting. Each must be a single atomic operation against shared state so concurrent callers can't lose each other's updates.

import threading

# In-process stand-in for the shared counter (a DB row, Redis key, or atomic int)
class QuotaCounter:
    def __init__(self, limit):
        self.limit, self.used = limit, 0
        self._lock = threading.Lock()

    # write: reserve size only if it still fits, atomically
    def try_write(self, size):
        # check-and-add in one indivisible step (DB transaction,
        # Redis Lua, or atomic compare_exchange), never read-then-write
        with self._lock:
            if self.used + size > self.limit:
                return "rejected: would exceed limit"
            self.used += size
        # caller stores the object now; if that fails, call on_delete(size)
        return "ok"

    # delete: always release what we accounted for
    def on_delete(self, size):
        with self._lock:
            self.used -= size

    # overwrite is just the delta; growth should pass the same limit check as a write
    def on_overwrite(self, old_size, new_size):
        with self._lock:
            self.used += new_size - old_size   # may be + or -

# EXACT accounting: the counter is updated synchronously inside the write
#   path — always correct, but the counter is a hot, contended row.
# APPROXIMATE accounting: writes update a fast local/sharded tally and a
#   background job rolls it up — cheap, but can briefly over/under-count.
# RECONCILIATION: a periodic scan re-sums true object sizes and overwrites
#   the counter, healing any drift from crashes or non-atomic paths.
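
When the counter lives in a database, one way to get the indivisible check-and-add is a single conditional UPDATE. The sketch below uses Python's built-in `sqlite3` module; the `quota` table, its column names, and the `try_reserve` helper are assumptions made up for illustration, and the limit is a small integer to keep the output readable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE quota (tenant TEXT PRIMARY KEY, used INTEGER, limit_bytes INTEGER)"
)
conn.execute("INSERT INTO quota VALUES ('acme', 0, 10)")

def try_reserve(conn: sqlite3.Connection, tenant: str, size: int) -> bool:
    """Add `size` to `used` only if the result stays within the limit."""
    cur = conn.execute(
        "UPDATE quota SET used = used + ? "
        "WHERE tenant = ? AND used + ? <= limit_bytes",
        (size, tenant, size),
    )
    conn.commit()
    return cur.rowcount == 1   # 0 rows changed means the write would not fit

print(try_reserve(conn, "acme", 8))   # True
print(try_reserve(conn, "acme", 3))   # False: 8 + 3 > 10
print(try_reserve(conn, "acme", 2))   # True: lands exactly on the limit
```

Because the comparison and the addition happen inside one statement, there is no window in which a second writer can read a stale `used`.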

Cost

What it costs	Why
Counter contention	Exact accounting touches one hot counter on every write — a serialization point that caps write throughput per tenant.
Exact vs approximate	Exact is always correct but contended; approximate (sharded or async tallies) is cheap but can briefly over- or under-count near the limit.
Reconciliation scan	Re-summing true usage means listing every object — O(objects) work, so it runs periodically (nightly), not per request.
Drift	Any gap between the counter and true on-disk usage — from crashes, races, or missed decrements — costs either over-charging tenants or letting them exceed the cap.
Signals to watch	`used / limit` ratio (warn band), reconciliation `drift = counter − scanned`, and the rejected-write rate (tenants hitting the wall).
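
To make the over-count and under-count cost concrete, here is a toy model of approximate accounting. It assumes each node checks writes against the last rolled-up total plus only its own pending bytes; the `Node` class and `roll_up` function are hypothetical, and real systems differ in how they shard and flush.

```python
class Node:
    """One writer node with a local tally that is flushed later (hypothetical)."""
    def __init__(self):
        self.pending = 0

    def try_write(self, size, rolled_up, limit):
        # Checks against a stale global total: other nodes' pending bytes are invisible.
        if rolled_up + self.pending + size > limit:
            return False
        self.pending += size
        return True

def roll_up(rolled_up, nodes):
    """Background job: fold every node's pending bytes into the global total."""
    total = rolled_up + sum(n.pending for n in nodes)
    for n in nodes:
        n.pending = 0
    return total

limit, rolled_up = 10, 8                    # units are GB for readability
a, b = Node(), Node()
ok_a = a.try_write(1.5, rolled_up, limit)   # 8 + 1.5 <= 10, admitted
ok_b = b.try_write(1.5, rolled_up, limit)   # b cannot see a's 1.5, also admitted
rolled_up = roll_up(rolled_up, [a, b])
print(ok_a, ok_b, rolled_up)                # True True 11.0
```

Both writes pass their local check, yet the rolled-up total ends 1 GB over the limit. That overshoot is the price of skipping the shared counter on the write path.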

Watch out for

Non-atomic increment under concurrency. A read-then-write (used = used + size) lets two writers read the same value and one update vanishes — a lost update that drifts the counter low.
Forgetting to decrement. A delete path that skips used -= size, or failed multipart parts that are never cleaned up, leaves phantom usage that slowly fills the quota with nothing.
Logical vs physical size mismatch. Charging the bytes the user uploaded while disk holds compressed or replicated copies (or vice versa) makes the counter mean something different from what's actually stored.
One global counter. A single shared counter for a busy tenant becomes a write bottleneck — shard it across N sub-counters and sum them when you need the total.
Soft limit with no teeth. A warn-only soft limit that nothing enforces lets a tenant sail past it; pair it with a hard limit, and reconcile so drift never accumulates unnoticed.
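
The sharding fix for the single hot counter might look like the sketch below. It is one possible layout, not a prescribed design: `ShardedCounter` and its random shard selection are assumptions. Note that `total()` is a snapshot taken shard by shard, so a hard limit checked against it is only approximately enforced while writes are in flight.

```python
import random
import threading

class ShardedCounter:
    """Spread updates over N independently locked sub-counters (illustrative)."""
    def __init__(self, shards: int = 8):
        self._values = [0] * shards
        self._locks = [threading.Lock() for _ in range(shards)]

    def add(self, delta: int) -> None:
        i = random.randrange(len(self._values))  # any shard works; only the sum matters
        with self._locks[i]:
            self._values[i] += delta

    def total(self) -> int:
        # Reads shards one at a time, so adds that land mid-read may be missed.
        return sum(self._values)

counter = ShardedCounter()

def worker() -> None:
    for _ in range(1000):
        counter.add(1)

threads = [threading.Thread(target=worker) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(counter.total())   # 4000
```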

Worked example

A tenant has a 10 GB limit. A run of writes brings used to 7 GB, just under the 7.5 GB soft line where warnings would start. The tenant deletes a 2 GB object, so used should drop to 5 GB, leaving room to grow again.

But suppose the delete went through a buggy path that skips its decrement: the counter still reads 7 GB while real usage is 5 GB, and that 2 GB gap is drift. A nightly reconciliation re-scans the actual objects, finds 5 GB of real data, and overwrites the counter to 5 GB. The accounting is honest again, and the next write is judged against the true number.
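
The drift-and-repair sequence can be replayed in a few lines. This is a toy model with a dict standing in for object storage; the object names and the `buggy_delete` helper are made up for illustration, and a real scan would also have to cope with writes arriving while it runs.

```python
GB = 1024 ** 3

objects = {"logs.tar": 3 * GB, "backup.img": 2 * GB, "video.mp4": 2 * GB}
counter = sum(objects.values())          # 7 GB after the run of writes

def buggy_delete(name: str) -> None:
    """Removes the object but forgets to subtract its size from the counter."""
    del objects[name]

buggy_delete("backup.img")               # real usage drops to 5 GB
scanned = sum(objects.values())          # reconciliation re-sums what is stored
drift = counter - scanned                # 2 GB of phantom usage
print(counter // GB, scanned // GB, drift // GB)   # 7 5 2

counter = scanned                        # overwrite the counter with the scanned total
print(counter // GB)                     # 5
```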

Check yourself

1. Two writes of 1 GB each arrive at the same instant for a tenant at used = 8 GB, limit = 10 GB. The code does used = used + size (a plain read-then-write). What's the danger?

2. After crashes left the counter reading higher than what's actually stored, what fixes it?