Geo-Partitioned Database

Keeping data physically close to the user who owns it to defeat the speed of light.

The idea

If your database is in New York, a user in Tokyo will always experience ~200ms of lag (latency) on every click, simply because light takes time to travel through underwater fiber-optic cables. You can't beat physics. To fix this, modern distributed databases (like CockroachDB or Google Spanner) use Geo-Partitioning. Instead of keeping all data on one server, the database seamlessly slices the tables up, physically moving Tokyo users' rows to a datacenter in Tokyo, and New York users' rows to New York. The application still talks to it like one giant database.

Step 1: A traditional database in New York. The Tokyo user suffers 200ms of latency on every request.

How it works (Row-level pinning)

Geo-partitioning works by choosing a Partition Key (like country_code). The database engine runs a background process that continuously monitors the data. When it sees a row where country_code = 'JP', it automatically migrates that specific row over the network to a hard drive physically located in Japan.

-- Example: CockroachDB Geo-Partitioning
-- We tell the DB to pin rows to regions based on the 'region' column.

ALTER TABLE users PARTITION BY LIST (region) (
    PARTITION us_east VALUES IN ('US'),
    PARTITION ap_northeast VALUES IN ('JP', 'KR')
);

-- The application just runs standard SQL:
-- SELECT * FROM users WHERE id = 123;
-- The DB engine automatically routes the query to the correct continent!

Cost

Geo-partitioning is magical for local reads, but terrible for global cross-partition queries. If you run SELECT count(*) FROM users, the database has to reach across the ocean to ask the Tokyo datacenter for its count, wait 200ms, combine it with the New York count, and return the result. Analytics queries become extremely slow. Furthermore, if a user flies from New York to Tokyo, their data is now on the wrong side of the planet until you update their region column.

Watch out for