Designing a pastebin

Take a big blob of text, hand back a tiny shareable link — and trade reads against writes.

The idea

A pastebin accepts a chunk of text (code, logs, a config) and returns a short URL anyone can open. The hard part is not storing the text — it's minting a short, unique key for each paste and serving it back fast, because reads vastly outnumber writes.

The clean design splits the two paths. On write, we save the blob and generate a base62 key. On read, we look the key up and stream the blob straight from object storage, ideally through a cache so the database barely gets touched.

See it work

Press play to follow a paste from text box to short link and back.

How it works

Generate the key from a monotonically increasing ID encoded in base62 (so it stays short and collision-free), or hash the content and keep the first few characters — retrying on the rare collision.

import string

# 62 symbols in order: 0-9, a-z, A-Z
ALPHABET = string.digits + string.ascii_lowercase + string.ascii_uppercase

def base62(n):                 # 125 -> "21", 999999 -> "4c91"
    s = ""
    while n:
        n, r = divmod(n, 62)
        s = ALPHABET[r] + s
    return s or "0"

def create_paste(text):
    paste_id = db.next_id()             # atomic counter
    key = base62(paste_id)              # short, unique
    blob_store.put(key, text)           # the big bytes
    db.put(key, {"size": len(text)})   # tiny metadata row
    return f"https://pb/{key}"

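def read_paste(key):
    hit = cache.get(key)                # assumes the cache returns None on a miss
    if hit is not None:                 # most reads end here; an empty paste still counts as a hit
        return hit
    text = blob_store.get(key)
    cache.set(key, text, ttl=3600)      # keep it warm for one hour
    return text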
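The hash-based alternative mentioned above can be sketched roughly as follows. This is a minimal illustration, not a reference implementation: `hash_key`, the `taken` dictionary (standing in for the database's key index), the salt format, and the default length of 7 are all assumptions made for the example. SHA-256 is simply one reasonable choice of hash.

```python
import hashlib
import string

ALPHABET = string.digits + string.ascii_lowercase + string.ascii_uppercase


def base62(n: int) -> str:
    """Encode a non-negative integer in base62."""
    s = ""
    while n:
        n, r = divmod(n, 62)
        s = ALPHABET[r] + s
    return s or "0"


def hash_key(text: str, taken: dict, length: int = 7) -> str:
    """Derive a fixed-length key from the content, re-hashing with a new salt on collision.

    `taken` is a hypothetical stand-in for the key index: it maps key -> stored text.
    """
    attempt = 0
    while True:
        # The attempt number acts as a salt, so each retry yields a different digest.
        digest = hashlib.sha256(f"{attempt}:{text}".encode("utf-8")).digest()
        # Fold the digest into the key space, then left-pad to a fixed width.
        n = int.from_bytes(digest, "big") % (62 ** length)
        key = base62(n).rjust(length, ALPHABET[0])
        existing = taken.get(key)
        if existing is None:
            taken[key] = text          # free slot: claim it
            return key
        if existing == text:
            return key                 # identical content maps to the same key
        attempt += 1                   # different content owns this key: retry


taken = {}
first = hash_key("hello world", taken)
second = hash_key("hello world", taken)
print(first == second)  # True
```

One consequence of this variant is that pasting the same text twice returns the same link, which the counter-based scheme does not do. The cost is an extra lookup per write to check whether the key is already taken.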

Cost

| Operation | Work | Why |
| --- | --- | --- |
| Write | O(1) | One ID bump, one blob put, one metadata row |
| Read (cache hit) | O(1) | Served from memory, never reaches the database |
| Read (cache miss) | O(1) + I/O | One blob fetch, then populate the cache |
| Key length | ~7 chars | 62^7 ≈ 3.5 trillion pastes |
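The key-length figure is plain arithmetic: a key of n base62 characters can name 62^n pastes. Here is a small sketch for checking it; the helper names `capacity` and `min_key_length` are made up for this example.

```python
def capacity(length: int) -> int:
    """Number of distinct base62 keys with exactly `length` characters."""
    return 62 ** length


def min_key_length(pastes: int) -> int:
    """Smallest key length whose key space holds at least `pastes` keys."""
    length = 1
    while capacity(length) < pastes:
        length += 1
    return length


print(f"{capacity(7):,}")             # 3,521,614,606,208
print(min_key_length(1_000_000_000))  # 6
```

With a counter-based ID, keys start at one character and grow only as the counter does, so 7 characters is a ceiling you approach slowly rather than a length every paste pays for.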

Watch out for

Worked example

You paste a 4 KB stack trace. The service bumps its counter to 3,500,000, encodes it in base62 as "eGvC" (digits 14, 42, 31, 38 in the alphabet above), writes the bytes to blob storage under that key and a tiny row to the database, then returns https://pb/eGvC. A teammate opens the link: the first open misses the cache and reads from blob storage; every open for the next hour is a pure cache hit, so the database stays idle even under a thousand views.
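To see the "one miss, then all hits" behavior concretely, here is a rough in-memory sketch. `CountingBlobStore` and `TTLCache` are hypothetical stand-ins for real object storage and a real cache, the key "demo" is made up, and `read_paste` takes its dependencies as arguments only so the example is self-contained.

```python
import time


class CountingBlobStore:
    """Toy blob store that counts how many reads reach it."""

    def __init__(self):
        self._blobs = {}
        self.reads = 0

    def put(self, key, text):
        self._blobs[key] = text

    def get(self, key):
        self.reads += 1
        return self._blobs[key]


class TTLCache:
    """Toy cache whose entries expire `ttl` seconds after being set."""

    def __init__(self, clock=time.monotonic):
        self._items = {}
        self._clock = clock

    def get(self, key):
        entry = self._items.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if self._clock() >= expires_at:
            del self._items[key]       # expired: treat as a miss
            return None
        return value

    def set(self, key, value, ttl):
        self._items[key] = (value, self._clock() + ttl)


def read_paste(key, cache, blob_store):
    hit = cache.get(key)
    if hit is not None:
        return hit
    text = blob_store.get(key)         # only cache misses get this far
    cache.set(key, text, ttl=3600)
    return text


blob_store = CountingBlobStore()
cache = TTLCache()
blob_store.put("demo", "Traceback (most recent call last): ...")

for _ in range(1000):                  # a thousand views inside the hour
    read_paste("demo", cache, blob_store)

print(blob_store.reads)  # 1
```

Only the first view reaches the blob store. Once the hour-long TTL lapses, the next view pays for one more fetch and re-warms the cache.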

Check yourself

Reads outnumber writes 100:1. Where should the read path spend most of its time?

Coach note: if this didn't click yet, replay the visual and watch which boxes light up on read versus write — the asymmetry is the whole design.