Hybrid Logical Clocks

How distributed databases figure out which event happened first when their wall clocks are out of sync.

The idea

In a distributed database, if Server A writes "x=5" and Server B writes "x=10", how do we know which write happened last so we can resolve the conflict? We can't trust physical timestamps (wall clocks) alone, because Server B's clock might be a few milliseconds behind Server A's. We could use a purely logical clock (a counter like 1, 2, 3), but then timestamps carry no relation to real time, and people like querying by real time (e.g. "Select where time > 12:00"). Hybrid Logical Clocks (HLCs) combine both: they use the physical timestamp, and append a logical counter that breaks ties between events in the same millisecond and keeps timestamps moving forward when a clock lags. The guarantee is causal: if one event happened before another (on the same server, or linked by a message), it always gets the smaller timestamp, while the timestamp itself stays close to wall-clock time.

Step 1: Wall Clock Skew. Server B's clock is lagging by 20ms. It writes 'x=10' AFTER Server A wrote 'x=5', but its physical timestamp is older!
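
To make the skew problem concrete, here is a minimal sketch of naive last-write-wins conflict resolution using raw wall-clock timestamps. The names (wallClock, lastWriteWins) and the "true time" values are hypothetical and exist only for illustration; real servers cannot observe true time.

```javascript
// Assumption: Server B's wall clock lags true time by 20ms.
const SKEW_B_MS = -20;

// What a server's wall clock reads at a given (unobservable) true time.
function wallClock(trueTimeMs, skewMs) {
    return trueTimeMs + skewMs;
}

// Server A writes first (true time 1000). Server B writes 5ms later.
const writeA = { value: 5,  timestamp: wallClock(1000, 0) };         // 1000
const writeB = { value: 10, timestamp: wallClock(1005, SKEW_B_MS) }; // 985

// Naive last-write-wins: keep the write with the larger timestamp.
function lastWriteWins(a, b) {
    return a.timestamp >= b.timestamp ? a : b;
}

const winner = lastWriteWins(writeA, writeB);
console.log(winner.value); // prints 5, although x=10 was written later
```

The later write loses because its timestamp is smaller, which is exactly the anomaly an HLC is meant to avoid when the two writes are causally related.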

How it works (Physical Time + Counter)

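An HLC is often packed into a 64-bit integer: the upper 48 bits store the physical timestamp (e.g. milliseconds since epoch), and the lower 16 bits store a logical counter. When a server receives a message from another server, it sets the physical part of its HLC to the maximum of its own physical clock, the physical part of its own HLC, and the physical part of the incoming message's HLC. If the server's physical clock is strictly ahead of both HLCs, the counter resets to 0. Otherwise (the clock is tied with an HLC, lagging, or went backwards), the counter becomes one more than the counter of the HLC that supplied the maximum, or one more than the larger of the two counters if both HLCs tie.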
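
The 48-bit/16-bit layout described above can be sketched as follows. This is an illustration under assumptions, not any particular database's encoding: the helper names packHLC and unpackHLC are made up here, and BigInt is used because JavaScript numbers only represent integers exactly up to 53 bits and the ordinary bitwise operators work on 32 bits.

```javascript
const COUNTER_BITS = 16n;
const COUNTER_MASK = (1n << COUNTER_BITS) - 1n; // 0xFFFF

// Pack physical time (upper 48 bits) and counter (lower 16 bits) into one BigInt.
function packHLC(physicalTimeMs, counter) {
    if (counter < 0 || counter > 0xFFFF) {
        throw new RangeError("counter does not fit in 16 bits");
    }
    return (BigInt(physicalTimeMs) << COUNTER_BITS) | BigInt(counter);
}

// Split a packed value back into its two parts.
function unpackHLC(packed) {
    return {
        physicalTime: Number(packed >> COUNTER_BITS),
        counter: Number(packed & COUNTER_MASK),
    };
}

const a = packHLC(1700000000000, 3);
const b = packHLC(1700000000000, 4); // same millisecond, later counter
const c = packHLC(1700000000001, 0); // next millisecond

console.log(a < b && b < c); // prints true
```

Because the physical time occupies the high bits, comparing two packed values as plain integers orders them by physical time first and by counter second.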

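// Updating a Hybrid Logical Clock on receiving a message.
// localTime: this server's physical clock reading (ms).
// localHLC / messageHLC: objects of the form { physicalTime, counter }.
function updateHLC(localTime, localHLC, messageHLC) {
    // The physical part never moves backwards: take the largest time seen so far.
    const nextTime = Math.max(localTime, localHLC.physicalTime, messageHLC.physicalTime);
    // Default: the physical clock is strictly ahead of both HLCs, so the counter resets.
    let nextCounter = 0;

    if (nextTime === localHLC.physicalTime && nextTime === messageHLC.physicalTime) {
        // Both HLCs are on the same millisecond. Go past both counters to force order.
        nextCounter = Math.max(localHLC.counter, messageHLC.counter) + 1;
    } else if (nextTime === localHLC.physicalTime) {
        // Our own HLC is the newest: keep its time and bump its counter.
        nextCounter = localHLC.counter + 1;
    } else if (nextTime === messageHLC.physicalTime) {
        // The message is the newest: adopt its time and go one past its counter.
        nextCounter = messageHLC.counter + 1;
    }

    return { physicalTime: nextTime, counter: nextCounter };
}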
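
The function above covers the receive path. In the commonly described HLC algorithm there is also a simpler rule for local events and for stamping outgoing messages, and timestamps are compared by physical time first, then counter. The sketch below shows both; the names tickHLC and compareHLC are illustrative and not taken from any library.

```javascript
// Local event or send: advance the HLC using only the local physical clock.
function tickHLC(localTime, localHLC) {
    if (localTime > localHLC.physicalTime) {
        // Physical time moved forward: adopt it and reset the counter.
        return { physicalTime: localTime, counter: 0 };
    }
    // Same millisecond (or clock behind): keep the time, bump the counter.
    return { physicalTime: localHLC.physicalTime, counter: localHLC.counter + 1 };
}

// Total order on HLC timestamps: returns -1, 0, or 1.
function compareHLC(a, b) {
    if (a.physicalTime !== b.physicalTime) {
        return a.physicalTime < b.physicalTime ? -1 : 1;
    }
    if (a.counter !== b.counter) {
        return a.counter < b.counter ? -1 : 1;
    }
    return 0;
}

const start = { physicalTime: 100, counter: 0 };
const first = tickHLC(100, start);   // { physicalTime: 100, counter: 1 }
const second = tickHLC(100, first);  // { physicalTime: 100, counter: 2 }
const third = tickHLC(101, second);  // { physicalTime: 101, counter: 0 }

console.log(compareHLC(first, second)); // prints -1
console.log(compareHLC(third, second)); // prints 1
```

Two events in the same millisecond still receive distinct, ordered timestamps, and the counter drops back to 0 as soon as the physical clock advances.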

Cost

HLCs let databases such as CockroachDB and YugabyteDB order transactions consistently across nodes without requiring specialized clock hardware (like the GPS receivers and atomic clocks behind Google Spanner's TrueTime). However, HLCs still require NTP (Network Time Protocol) to keep physical clocks reasonably close. If a server's physical clock drifts beyond a configured maximum offset (e.g., 500ms), the database will usually remove the node from the cluster, or the node will shut itself down, because a clock that runs far ahead would drag every other node's HLC away from real time and undermine guarantees that rely on bounded skew.
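
One way to enforce such a bound is to check incoming timestamps against the local clock before merging them. The sketch below is a simplified assumption of how that guard could look, not the logic of any specific database; MAX_OFFSET_MS and isClockOffsetAcceptable are hypothetical names, and 500ms simply mirrors the example figure above.

```javascript
// Assumption: the cluster tolerates at most 500ms of clock offset.
const MAX_OFFSET_MS = 500;

// Returns false when the message's physical time is too far ahead of our clock.
// Messages from the past are always acceptable: they cannot push our HLC forward.
function isClockOffsetAcceptable(localTimeMs, messageHLC, maxOffsetMs = MAX_OFFSET_MS) {
    return messageHLC.physicalTime - localTimeMs <= maxOffsetMs;
}

console.log(isClockOffsetAcceptable(1000, { physicalTime: 1200, counter: 0 })); // prints true
console.log(isClockOffsetAcceptable(1000, { physicalTime: 1600, counter: 0 })); // prints false
```

A receiver that gets false here could reject the message or flag the sender, instead of letting one bad clock pull its own HLC far into the future.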

Watch out for

Clock Moving Backwards: Physical clocks can jump backwards (e.g., when NTP steps a fast clock back). If a server just wrote an event at 100ms and the physical clock then jumps back to 90ms, the HLC algorithm ignores the 90ms reading. It keeps the physical part of the HLC at 100ms and only increments the logical counter (100.1, 100.2, ...) until physical time passes 100ms again, at which point the HLC adopts the new physical time (e.g. 101ms) and resets the counter to 0.
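
The backwards-jump scenario can be traced with a small simulation. This is a sketch that assumes the local-event rule of the HLC algorithm (keep the larger of the old HLC time and the clock reading, increment the counter when the time does not advance); nextLocalHLC is a hypothetical helper name and the clock readings are invented for the example.

```javascript
// Advance the HLC for a local event, given the current physical clock reading.
function nextLocalHLC(localTime, hlc) {
    if (localTime > hlc.physicalTime) {
        return { physicalTime: localTime, counter: 0 };
    }
    return { physicalTime: hlc.physicalTime, counter: hlc.counter + 1 };
}

// The server has just written an event at 100ms.
let hlc = { physicalTime: 100, counter: 0 };
const history = [];

// The clock jumps back to 90ms, creeps forward to 95ms, then reaches 101ms.
for (const reading of [90, 95, 101]) {
    hlc = nextLocalHLC(reading, hlc);
    history.push(`${hlc.physicalTime}.${hlc.counter}`);
}

console.log(history.join(", ")); // prints 100.1, 100.2, 101.0
```

The timestamps never decrease even though the underlying clock did, which is the property that keeps causally ordered events in order.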