How distributed replicas gossip to heal silent data divergence.
In eventually consistent systems (like DynamoDB or Cassandra), network partitions or dropped packets can cause one database replica to miss a write. "Anti-entropy" is a background process where replicas randomly pair up, compare their data, and resolve differences to ensure they eventually converge to the exact same state.
Comparing terabytes of data over the network is too slow. Instead, databases use a Merkle Tree (a tree of hashes). Replicas only compare the root hash. If it differs, they traverse down the tree to find exactly which blocks of data differ, transferring only the missing data.
# Simplified Anti-Entropy Round
def anti_entropy_round(node1, node2):
# 1. Compare root hashes (O(1) network cost)
if node1.merkle_tree.root_hash() == node2.merkle_tree.root_hash():
return # Perfect sync!
# 2. Find differences by traversing the tree
differing_keys = find_mismatched_leaves(node1.merkle_tree, node2.merkle_tree)
# 3. Exchange data for only those keys
for key in differing_keys:
val1, timestamp1 = node1.get(key)
val2, timestamp2 = node2.get(key)
# Last-Write-Wins (LWW) resolution
if timestamp1 > timestamp2:
node2.put(key, val1, timestamp1)
else:
node1.put(key, val2, timestamp2)
Time Complexity: O(log N) network rounds to find a mismatched block of data using a Merkle tree, where N is the number of data blocks. Space Complexity: O(N) to store the Merkle tree hashes in memory/disk.