Keep one honest number — bytes used — and nudge it on every write and delete, so you always know whether the next write still fits.
A storage quota caps how much a tenant, account, or bucket may hold. The system keeps a running used counter: every write adds the object's size, every delete subtracts it, and an overwrite adjusts by the delta between old and new size.
A write is allowed only when used + size <= limit. The whole scheme depends on that counter staying honest, so the update has to be atomic — otherwise two concurrent writes can step on each other and the number drifts away from reality. A soft limit warns as you approach; a hard limit rejects the write outright.
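To make the atomicity requirement concrete, here is a minimal sketch that writes out one bad interleaving by hand instead of relying on real thread timing. The starting values (9 GB used, two 1 GB writers) and the variable names are illustrative assumptions, not taken from any particular system.

```python
GB = 1024 ** 3

limit = 10 * GB
used = 9 * GB      # shared counter (assumed starting value)
stored = 9 * GB    # bytes actually held in storage

# Both writers read the counter before either one writes it back.
read_a = used
read_b = used

# Each writer checks the limit against its own (possibly stale) read.
fits_a = read_a + 1 * GB <= limit
fits_b = read_b + 1 * GB <= limit

if fits_a:
    stored += 1 * GB
    used = read_a + 1 * GB   # A records 10 GB
if fits_b:
    stored += 1 * GB
    used = read_b + 1 * GB   # B records 10 GB again, erasing A's update

# The counter now says 10 GB while 11 GB is stored: the counter drifted
# low and the tenant is over the hard limit without any write rejected.
```

Doing the check and the add as one indivisible step closes this window, because the second writer would see the first writer's update before its own check runs.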
The counter update is the whole game. On a write you reserve space by adding to the counter; on a delete you release it by subtracting. Both must be a single atomic operation against shared state so concurrent callers can't lose each other's updates.
# write: reserve size only if it still fits, atomically
def try_write(used, size, limit):
    # compare-and-add in one indivisible step (DB transaction,
    # Redis Lua, or atomic compare_exchange), never read-then-write
    if atomic_add_if(used, size, max=limit):  # used += size iff used+size <= limit
        store_object(size)
        return "ok"
    return "rejected: would exceed limit"

# delete: always release what we accounted for
def on_delete(used, size):
    atomic_sub(used, size)  # used -= size

# overwrite is just the delta
def on_overwrite(used, old_size, new_size):
    atomic_add(used, new_size - old_size)  # may be + or -
# EXACT accounting: the counter is updated synchronously inside the write
# path — always correct, but the counter is a hot, contended row.
# APPROXIMATE accounting: writes update a fast local/sharded tally and a
# background job rolls it up — cheap, but can briefly over/under-count.
# RECONCILIATION: a periodic scan re-sums true object sizes and overwrites
# the counter, healing any drift from crashes or non-atomic paths.
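The exact-accounting path can be sketched as a runnable, in-process version. This is only a stand-in: a `threading.Lock` plays the role that a DB transaction or Redis script would play across machines, and the `QuotaCounter` class, its method names, and the "ok"/"warn"/"rejected" return values are hypothetical choices for illustration.

```python
import threading

GB = 1024 ** 3

class QuotaCounter:
    """In-process stand-in for an atomically updated usage counter."""

    def __init__(self, limit, soft_limit):
        self.limit = limit
        self.soft_limit = soft_limit
        self.used = 0
        self._lock = threading.Lock()

    def try_reserve(self, size):
        # Check and add under one lock, so no other caller can slip in
        # between the comparison and the update.
        with self._lock:
            if self.used + size > self.limit:
                return "rejected"
            self.used += size
            return "warn" if self.used >= self.soft_limit else "ok"

    def release(self, size):
        with self._lock:
            self.used -= size

    def adjust(self, old_size, new_size):
        # Overwrite: apply only the delta. Growth goes through the same
        # limit check as a fresh write (an assumption of this sketch).
        delta = new_size - old_size
        if delta > 0:
            return self.try_reserve(delta)
        self.release(-delta)
        return "ok"

# Twelve concurrent 1 GB writes against a 10 GB hard / 7.5 GB soft limit.
quota = QuotaCounter(limit=10 * GB, soft_limit=int(7.5 * GB))
results = []

def writer():
    results.append(quota.try_reserve(1 * GB))

threads = [threading.Thread(target=writer) for _ in range(12)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

Because every reservation is serialized by the lock, exactly ten writes are admitted and two are rejected, whatever order the threads run in. The same lock is also the contention point the exact-accounting comment describes.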
| What it costs | Why |
|---|---|
| Counter contention | Exact accounting touches one hot counter on every write — a serialization point that caps write throughput per tenant. |
| Exact vs approximate | Exact is always correct but contended; approximate (sharded or async tallies) is cheap but can briefly over- or under-count near the limit. |
| Reconciliation scan | Re-summing true usage means listing every object — O(objects) work, so it runs periodically (nightly), not per request. |
| Drift | Any gap between the counter and true on-disk usage — from crashes, races, or missed decrements — costs either over-charging tenants or letting them exceed the cap. |
| Signals to watch | used / limit ratio (warn band), reconciliation drift = counter − scanned, and the rejected-write rate (tenants hitting the wall). |
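The three signals in the last row are simple arithmetic, so a small sketch may help pin down their definitions. The function name, field names, and sample numbers below are assumptions made for illustration only.

```python
from dataclasses import dataclass

GB = 1024 ** 3

@dataclass
class QuotaSignals:
    utilization: float     # used / limit
    in_warn_band: bool     # at or above the soft limit
    drift_bytes: int       # counter minus scanned usage
    rejected_rate: float   # rejected writes / attempted writes

def quota_signals(used, limit, soft_limit, scanned, writes_total, writes_rejected):
    """Compute the monitoring signals from raw counters."""
    rate = writes_rejected / writes_total if writes_total else 0.0
    return QuotaSignals(
        utilization=used / limit,
        in_warn_band=used >= soft_limit,
        drift_bytes=used - scanned,
        rejected_rate=rate,
    )

# Hypothetical tenant: counter says 8 GB, the last scan found 6 GB.
signals = quota_signals(
    used=8 * GB, limit=10 * GB, soft_limit=int(7.5 * GB),
    scanned=6 * GB, writes_total=200, writes_rejected=5,
)
```

A positive `drift_bytes` means the counter reads higher than real usage (the tenant is over-charged); a negative value means the tenant may already hold more than the counter admits.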
- A plain read-then-write (used = used + size) lets two writers read the same value and one update vanishes: a lost update that drifts the counter low.
- A delete path that skips used -= size, or failed multipart parts that are never cleaned up, leaves phantom usage that slowly fills the quota with nothing.

A tenant has a 10 GB limit. A run of writes brings used to 7 GB, just under the 7.5 GB soft limit where we'd start warning. The tenant deletes a 2 GB object, so used drops to 5 GB and there's room to grow again.

But suppose that delete went through a buggy path that skipped its decrement: the counter would still read 7 GB while real usage is only 5 GB, and that 2 GB gap is drift. A nightly reconciliation re-scans the actual objects, finds 5 GB of real data, and overwrites the counter back to 5 GB. The accounting is honest again, and the next write is judged against the true number.
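The reconciliation step can be sketched in a few lines. The example assumes the counter was stuck at 7 GB (its value before the skipped decrement) and invents an object listing that sums to the 5 GB of real data; the `reconcile` helper is hypothetical.

```python
GB = 1024 ** 3

def reconcile(counter_value, object_sizes):
    """Re-sum true object sizes; return (corrected_counter, drift).

    drift = counter - scanned, so a positive value means the counter
    was reading higher than real usage.
    """
    scanned = sum(object_sizes)
    return scanned, counter_value - scanned

# Assumed listing of the tenant's remaining objects (5 GB in total).
object_sizes = [2 * GB, 2 * GB, 1 * GB]
stale_counter = 7 * GB

corrected_counter, drift = reconcile(stale_counter, object_sizes)
# corrected_counter is 5 GB and drift is 2 GB for these inputs.
```

In a live system the scan and the overwrite would need to account for writes that land while the scan is running; this sketch ignores that and treats the listing as a consistent snapshot.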
1. Two writes of 1 GB each arrive at the same instant for a tenant at used = 8 GB, limit = 10 GB. The code does used = used + size (a plain read-then-write). What's the danger?
2. After crashes left the counter reading higher than what's actually stored, what fixes it?