Squeezing sorted keys that share prefixes

When neighbours in a sorted list start the same way, stop repeating the part they already agree on.

The idea

Inside a storage block, keys are kept sorted, and sorted neighbours tend to begin with the same bytes — think user:1001 right after user:1000. Rather than write every key out in full, store the length of the prefix a key shares with the one before it, plus only the suffix bytes that differ.

The first key in the block is kept whole as an anchor. To rebuild any later key you inherit the previous key's first shared bytes and append its stored suffix. Long, similar keys collapse to a couple of bytes apiece.

See it work

stored keys (sorted) on disk
Six sorted keys, each stored in full. Press play to compress them.

How it works

Encoding walks the sorted keys once. For each key it counts how many leading bytes match the previous key, then writes that count plus the leftover suffix. Decoding reverses it: take the previous key's first shared bytes and glue the suffix on.

# --- encode one block of SORTED keys ---
def encode(keys, restart_every=16):
    out = []
    prev = ""
    for i, key in enumerate(keys):
        # A restart point stores a full key (shared = 0) so a
        # lookup can begin decoding here without walking from row 0.
        if i % restart_every == 0:
            shared = 0
        else:
            shared = common_prefix_len(prev, key)
        out.append((shared, key[shared:]))   # (count, suffix bytes)
        prev = key
    return out

def common_prefix_len(a, b):
    n = 0
    while n < len(a) and n < len(b) and a[n] == b[n]:
        n += 1
    return n

# --- decode: rebuild key i by walking from the nearest restart ---
def decode(entries, i, restart_every=16):
    start = (i // restart_every) * restart_every   # nearest restart point
    key = ""
    for j in range(start, i + 1):
        shared, suffix = entries[j]
        key = key[:shared] + suffix                # inherit prefix, add suffix
    return key

Cost

PropertyCost
Space savedScales with how much adjacent keys overlap — huge for long, similar keys, near zero for short or random ones
Encode costOne prefix comparison per key — O(prefix length) per key, single pass over the block
Decode / random accessMust walk forward from the nearest restart point, replaying each entry — O(restart interval), not O(1)
Restart intervalSmaller interval = cheaper lookups but more full keys (less compression); larger interval = tighter block but longer walks
Signal it's paying offBlock / index size shrinks sharply; watch CPU spent rebuilding keys on hot lookups

Watch out for

Worked example

Picture a block of 1,000 sorted log-segment keys like eu-west-1/2026/06/26/host-abcd/segment-00042. Consecutive keys share a 30-plus byte prefix (region, date, host) and differ only in the trailing segment number.

Stored in full that's roughly 45 bytes × 1,000 ≈ 45 KB. With prefix compression most rows reduce to a one-byte shared count plus a 5-to-6 byte suffix — the block drops to a few KB. A lookup for one key seeks to the nearest restart point, then walks a handful of entries forward, rebuilding each key by inheriting its predecessor's prefix until it lands on the target.

Check yourself

1. Why does a compressed block need periodic restart points (full keys) instead of compressing every key against the one before it?

2. Prefix compression saves the most space when…