Hot Partitions & Sharding

When you buy 10 servers, but all the traffic goes to just one.

The idea

When a database gets too big for one machine, you split it across multiple servers (Sharding). You have to pick a "Partition Key" to decide which data goes where. If you build a Twitter clone and shard by user_id, most servers handle normal users perfectly. But what happens when Taylor Swift (a single user_id) posts a tweet? Millions of fans request her data simultaneously. Because she is just one user, ALL of that traffic is routed to exactly ONE shard! That server catches fire (a "Hot Partition") while the other 9 servers sit totally idle.

Step 1: A 3-node cluster. Traffic is evenly balanced between normal users.

How it works (Salting & Scatter-Gather)

To fix a hot partition, you must change how you shard the data. Instead of just user_id, you "salt" the key by appending a random number (e.g., user_id + random(1 to 10)). This forces Taylor Swift's tweets to be randomly spread across ALL 10 servers. When you need to read her tweets, you must query all 10 servers and merge the results (Scatter-Gather).

# The BAD way (Hot Partition)
PartitionKey = "TaylorSwift" 
# Hash("TaylorSwift") always equals Shard 2! 

# The GOOD way (Write-Salting)
import random
salt = random.randint(1, 10)
PartitionKey = f"TaylorSwift_{salt}"
# Spreads writes evenly across Shard 1 through 10! ✅

# To Read:
# You must query TaylorSwift_1, TaylorSwift_2... up to 10.

Cost

Salting fixes the write bottleneck, but makes reading much harder. A query that used to hit 1 server now has to hit 10 servers in parallel, sort the results in memory on the application side, and return them. This consumes far more total cluster CPU.

Watch out for