Hand every long URL a short unique nickname, remember the mapping, and bounce visitors from the nickname back to the original.
To shorten a link, mint a unique id for it — the next value from a counter, or a hash of the URL — then base62-encode that number into a compact key like xR7q. Persist key → URL in a fast key-value store so the lookup is a single point read.
To resolve a short link, take the key off the path, look it up, and answer with an HTTP redirect to the original URL. base62 uses the digits 0-9a-zA-Z, so just 6 characters address 626 ≈ 56 billion distinct keys — plenty of room while staying tweet-short.
A monotonic counter (or a snowflake-style id service) hands out the next id; base62 turns it into the shortest key over 0-9a-zA-Z. Writes put the mapping; reads get it and emit a redirect. The store is read-heavy, so a cache in front of it absorbs the hot keys.
# base62 over digits 0-9 a-z A-Z (index 0..61)
ALPHABET = "0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ"
def base62(n):
if n == 0:
return "0"
out = []
while n > 0:
out.append(ALPHABET[n % 62]) # least-significant digit first
n //= 62
return "".join(reversed(out)) # 125 -> "21"
def shorten(long_url, store, counter):
n = counter.next() # unique id, e.g. 125
key = base62(n) # "21"
store.put(key, long_url) # key -> URL
return "https://sho.rt/" + key
def resolve(key, store):
long_url = store.get(key) # single point read
if long_url is None:
return 404
return redirect(long_url, status=301)
| Choice | Trade-off |
|---|---|
| Key length | 62n keys in n chars: 4 chars ≈ 14.8M, 6 chars ≈ 56B. Shorter links, smaller namespace. |
| Counter vs hash | Counter is collision-free but predictable; hashing is opaque but needs collision retries (or a longer key). |
| Read-heavy | Resolves vastly outnumber shortens — front the store with a cache; the mapping is immutable, so it caches well. |
| 301 vs 302 | 301 permanent is cached by browsers and proxies (fast, but hard to revoke); 302 stays revocable and lets you count clicks. |
302 when you need control.Shorten a URL and the counter returns id 125. Encode it: take the remainder digits least-significant first, so 125 % 62 = 1 → digit 1, then 125 // 62 = 2 and 2 % 62 = 2 → digit 2 (and 2 // 62 = 0, so we stop). Reading the digits most-significant first gives 2 then 1; both map to the literal characters '2' and '1' in the alphabet, so the key is "21" — that is, 125 = 2×62 + 1. We store 21 → URL. Later GET /21 looks up the key, finds the URL, and replies 301 redirecting the visitor to the original site.
1. Why base62-encode the id instead of using the decimal id in the path?
2. You need to revoke or repoint short links later. Which redirect should the resolver return?
Coach note: if a pick doesn't land, give it another pass — the reasoning is what sticks.