Storage engines (LSM & B-Tree)

The data structures databases use under the hood to store and retrieve bytes on disk.

The idea

A B-Tree (used in PostgreSQL, MySQL) is read-optimised. It updates data in-place on the disk. When a page gets full, it has to do an expensive "Page Split", causing high Write Amplification.

An LSM Tree (used in Cassandra, RocksDB) is write-optimised. It NEVER updates in-place. It just appends writes to a fast in-memory buffer (Memtable). When full, it flushes to an immutable file on disk (SSTable). Later, a background "Compaction" process merges files to clean up old data.

B-Tree (In-Place) Disk Writes: 0 LSM Tree (Append-Only) Disk Writes: 0
Both engines have space. Click to insert a row.

How it works (Trade-offs)

# B-Tree:
# Reads are fast O(log N).
# Writes are slow because finding the spot takes time, and
# updating in the middle of a block causes disk fragmentation.

# LSM Tree:
# Writes are insanely fast O(1) (just append to memory/log).
# Reads are slower (have to check Memtable, then multiple SSTables).
# Needs CPU/Disk for background Compaction.