Separate the heavy thing from the words about it — bytes go to the warehouse, the index card goes in the drawer.
A team file share (Dropbox / Drive style) has two completely different jobs glued together. One is moving large opaque blobs, potentially gigabytes each. The other is answering small structured questions: who owns this, who can see it, what it's called, who it's shared with. Forcing both through one path makes everything slow and fragile, because small metadata queries end up competing with long-running transfers.
The clean design splits them. Bytes live in an object store, addressed by a content hash. A separate metadata service holds the file record, permissions, and the human-friendly name pointing at that hash. Downloads check permission in metadata, then hand the client a short-lived signed URL so the bytes flow straight from the object store — the app server never touches the payload.
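One way to picture the short-lived signed URL is an HMAC over the object key and an expiry time, which the object store can verify without calling back to the app server. The sketch below is an assumption for illustration, not the scheme of any particular store: `SECRET`, `STORE_BASE`, and the `expires` / `sig` query parameters are hypothetical names.

```python
import hashlib
import hmac
import time
from typing import Optional
from urllib.parse import parse_qs, urlencode, urlparse

# Hypothetical values, for illustration only.
SECRET = b"demo-signing-key"
STORE_BASE = "https://objects.example.com"


def _signature(object_key: str, expires: int) -> str:
    """HMAC-SHA256 over the object key and its expiry timestamp."""
    message = f"{object_key}:{expires}".encode()
    return hmac.new(SECRET, message, hashlib.sha256).hexdigest()


def signed_url(object_key: str, ttl: int = 300, now: Optional[float] = None) -> str:
    """Build a URL that stays valid for `ttl` seconds after it is issued."""
    issued = time.time() if now is None else now
    expires = int(issued) + ttl
    query = urlencode({"expires": expires, "sig": _signature(object_key, expires)})
    return f"{STORE_BASE}/{object_key}?{query}"


def verify(url: str, now: Optional[float] = None) -> bool:
    """The check the object store would run before serving any bytes."""
    current = time.time() if now is None else now
    parsed = urlparse(url)
    params = parse_qs(parsed.query)
    try:
        expires = int(params["expires"][0])
        sig = params["sig"][0]
    except (KeyError, ValueError):
        return False
    if current > expires:
        return False  # link has expired
    expected = _signature(parsed.path.lstrip("/"), expires)
    return hmac.compare_digest(sig, expected)  # constant-time comparison
```

Because the signature covers both the key and the expiry, a client cannot extend the lifetime of a link or point it at a different object without invalidating it.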
Upload writes bytes to the object store (keyed by their hash), then records a small metadata row. Sharing adds a permission grant; nothing is copied. Download is a permission check followed by a signed URL, so bytes never pass through the app tier.
import hashlib

# object_store, metadata, and Forbidden are assumed to be provided by the service.

# Upload: bytes to store, then a tiny metadata record
def upload(user, name, blob):
    digest = hashlib.sha256(blob).hexdigest()  # hex content hash is the object key
    object_store.put_if_absent(digest, blob)   # dedup: same bytes stored once
    return metadata.create(owner=user, name=name, hash=digest)

# Share: a permission grant, not a file copy
def share(file_id, with_user, role="viewer"):
    metadata.add_grant(file_id, with_user, role)

# Download: check access in metadata, hand back a signed URL
def download(reader, file_id):
    rec = metadata.get(file_id)
    if not metadata.can_read(reader, rec):
        raise Forbidden()
    return object_store.signed_url(rec.hash, ttl=300)  # bytes flow direct
Because objects are content-addressed, two people uploading the identical file store it once and share a single blob — permissions stay separate per metadata record.
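The dedup property can be seen in a few lines. This is a minimal sketch with an in-memory dictionary standing in for the object store; `ContentStore`, `blob_count`, and the user names are hypothetical and exist only for this illustration.

```python
import hashlib


class ContentStore:
    """In-memory stand-in for an object store keyed by content hash."""

    def __init__(self):
        self._blobs = {}

    def put_if_absent(self, digest: str, blob: bytes) -> bool:
        """Store the blob unless the digest already exists. True if written."""
        if digest in self._blobs:
            return False
        self._blobs[digest] = blob
        return True

    def blob_count(self) -> int:
        return len(self._blobs)


store = ContentStore()
records = []  # stand-in for metadata rows

# Two different users upload byte-identical content under different names.
for owner, name, data in [
    ("user_a", "plan.pdf", b"same bytes"),
    ("user_b", "plan-copy.pdf", b"same bytes"),
]:
    digest = hashlib.sha256(data).hexdigest()
    store.put_if_absent(digest, data)
    records.append({"owner": owner, "name": name, "hash": digest})

print(store.blob_count(), len(records))
```

The store ends up holding one blob while the metadata holds two rows, each with its own owner and name, so access decisions never depend on who else happens to have uploaded the same bytes.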
| Operation | Cost | Note |
|---|---|---|
| Upload bytes | O(file size) | Direct to object store |
| Create / share record | O(1) | Small metadata write |
| Permission check | O(1) indexed | Per download |
| Download bytes | O(file size) | Store → client, app tier idle |
| Duplicate file | O(1) extra storage | Content-hash dedup; only a new metadata row |
Ada uploads plan.pdf (40 MB). Its bytes go to the object store under hash a91f…; metadata records owner=Ada, name=plan.pdf, and that same hash. She shares it with Bo as a viewer: one row added, zero bytes copied. Bo clicks download: metadata confirms he's a viewer and returns a 5-minute signed URL, and the 40 MB flows from the object store straight to Bo. If Cy later uploads the same PDF, the store sees the hash already exists and keeps a single copy, while Cy still gets his own private metadata record.
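The walkthrough can be replayed end to end with in-memory stand-ins. Everything here is a sketch under assumptions: the `ObjectStore` and `Metadata` classes, the rule that an owner or any grantee can read, and the placeholder URL format are hypothetical, and a few bytes stand in for the 40 MB file.

```python
import hashlib
import itertools


class Forbidden(Exception):
    pass


class ObjectStore:
    def __init__(self):
        self.blobs = {}

    def put_if_absent(self, digest, blob):
        self.blobs.setdefault(digest, blob)  # keeps the existing copy if present

    def signed_url(self, digest, ttl):
        # Placeholder: a real store would attach an expiry and a signature.
        return f"https://objects.example.com/{digest}?ttl={ttl}"


class Metadata:
    def __init__(self):
        self.records = {}
        self._ids = itertools.count(1)

    def create(self, owner, name, digest):
        file_id = next(self._ids)
        self.records[file_id] = {"owner": owner, "name": name, "hash": digest, "grants": {}}
        return file_id

    def add_grant(self, file_id, user, role):
        self.records[file_id]["grants"][user] = role

    def can_read(self, user, file_id):
        rec = self.records[file_id]
        return user == rec["owner"] or user in rec["grants"]


store, meta = ObjectStore(), Metadata()


def upload(user, name, blob):
    digest = hashlib.sha256(blob).hexdigest()
    store.put_if_absent(digest, blob)
    return meta.create(user, name, digest)


def download(reader, file_id):
    if not meta.can_read(reader, file_id):
        raise Forbidden(f"{reader} cannot read file {file_id}")
    return store.signed_url(meta.records[file_id]["hash"], ttl=300)


pdf = b"%PDF-1.7 stand-in for the 40 MB plan"
ada_file = upload("ada", "plan.pdf", pdf)
meta.add_grant(ada_file, "bo", "viewer")  # share: one grant, zero bytes copied
bo_url = download("bo", ada_file)         # Bo is a viewer, so he gets a URL
cy_file = upload("cy", "plan.pdf", pdf)   # same bytes, new metadata record

try:
    download("bo", cy_file)               # Bo has no grant on Cy's record
    bo_can_read_cy = True
except Forbidden:
    bo_can_read_cy = False
```

After this runs, the store holds a single blob, Ada and Cy each have their own record pointing at the same hash, and Bo's grant on Ada's record gives him no access to Cy's.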
On download, why return a signed URL instead of streaming the file through the app server?