Splitting a giant database across multiple servers to handle more data and traffic.
When a database gets too big for one machine, we split it into pieces called Shards (or Partitions). But how do we decide which row goes to which shard?
If we use Range Partitioning (e.g., A-F goes to Shard 1, G-M to Shard 2), scanning is fast, but we risk a "Hot Partition" if everyone suddenly queries 'C'. If we use Hash Partitioning (hash(key) % N), data is evenly spread out, but range queries become impossible because adjacent keys live on different servers.
def route_query(key, strategy="hash"):
if strategy == "range":
# Fast for ranges, but risks hot spots if keys are sequential
if key < 'G': return shard_1
if key < 'N': return shard_2
return shard_3
elif strategy == "hash":
# Randomizes data evenly, but range queries require
# querying ALL shards (Scatter-Gather).
server_index = hash(key) % 3
return shards[server_index]