Time-series database

A database tuned for one shape of data: a metric, a timestamp, and a relentless stream of new points that only ever moves forward.

The idea

Imagine a sensor reporting CPU usage every second. You never update old readings — you only append new ones, and you almost always ask the same kind of question: "what did this look like over the last hour?"

A time-series database (TSDB) leans into that pattern. It groups points into time-ordered partitions (one chunk per hour or per day), keeps raw points only for a while, then rolls them up into coarser summaries so old data stays cheap. A range query then touches just the partitions it needs.
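To make the partition idea concrete, here is a minimal Python sketch. It assumes hourly partitions keyed by the Unix timestamp (in seconds) of the start of the hour. The names PARTITION_SECONDS, partition_key and partitions_for_window are made up for illustration and do not come from any particular database.

```python
PARTITION_SECONDS = 3600  # assumption: one partition per hour


def partition_key(ts: int) -> int:
    """Floor a Unix timestamp (seconds) to the start of its partition."""
    return ts - ts % PARTITION_SECONDS


def partitions_for_window(start: int, end: int) -> list[int]:
    """Keys of every partition overlapping the half-open window [start, end)."""
    if end <= start:
        return []
    first = partition_key(start)
    last = partition_key(end - 1)  # end is exclusive
    return list(range(first, last + 1, PARTITION_SECONDS))
```

Under these assumptions a 90-minute query touches at most three hourly partitions, no matter how many years of history sit in the database.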

How it works

Writes land in the newest partition in arrival order, so an append is just "add to the end." A background job compacts older partitions into rollups (min / max / avg / count per bucket), and a retention policy drops raw points past their age. A query picks the partitions overlapping its window and reads raw or rolled-up data depending on how old it is.

def write(point):                 # append-only, newest partition
    p = partition_for(point.metric, point.ts)   # e.g. floor ts to the hour
    p.raw.append(point)           # O(1) amortised

def query(metric, start, end):    # half-open window [start, end)
    out = []
    for p in partitions_overlapping(metric, start, end):
        if p.has_raw():           # young data: full resolution
            out += [x for x in p.raw if start <= x.ts < end]
        else:                     # old data: read the rollup
            out += p.rollup.buckets_in(start, end)
    return out

def compact(p):                   # background, runs on old partitions
    p.rollup = downsample(p.raw)  # min/max/avg/count per bucket
    p.raw = []                    # reclaim space
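
The pseudocode above can be turned into a small runnable sketch. This is one possible in-memory version, not how any real TSDB is built. It assumes hourly partitions and 1-minute rollup buckets, and every name in it (TinyTSDB, Point, Bucket, Partition) is hypothetical. It also takes the simplest stance on late data: a write into an already compacted partition is rejected.

```python
from dataclasses import dataclass, field

PARTITION_SECONDS = 3600   # assumption: hourly partitions
BUCKET_SECONDS = 60        # assumption: 1-minute rollup buckets


@dataclass
class Point:
    metric: str
    ts: int        # Unix seconds
    value: float


@dataclass
class Bucket:
    ts: int        # start of the bucket
    vmin: float
    vmax: float
    avg: float
    count: int


@dataclass
class Partition:
    raw: list = field(default_factory=list)
    rollup: list = field(default_factory=list)
    compacted: bool = False


class TinyTSDB:
    def __init__(self):
        self.partitions = {}   # (metric, partition start) -> Partition

    def write(self, point):
        key = (point.metric, point.ts - point.ts % PARTITION_SECONDS)
        part = self.partitions.setdefault(key, Partition())
        if part.compacted:
            raise ValueError("late point for an already compacted partition")
        part.raw.append(point)   # append in arrival order

    def compact(self, metric, partition_start):
        part = self.partitions[(metric, partition_start)]
        by_bucket = {}
        for p in part.raw:
            by_bucket.setdefault(p.ts - p.ts % BUCKET_SECONDS, []).append(p.value)
        part.rollup = [
            Bucket(ts, min(vs), max(vs), sum(vs) / len(vs), len(vs))
            for ts, vs in sorted(by_bucket.items())
        ]
        part.raw = []            # reclaim space; per-point detail is gone
        part.compacted = True

    def query(self, metric, start, end):
        """Return raw points or rollup buckets in the window [start, end)."""
        out = []
        first = start - start % PARTITION_SECONDS
        for pstart in range(first, end, PARTITION_SECONDS):
            part = self.partitions.get((metric, pstart))
            if part is None:
                continue
            if part.compacted:   # old data: buckets whose start is in the window
                out += [b for b in part.rollup if start <= b.ts < end]
            else:                # young data: full resolution
                out += [p for p in part.raw if start <= p.ts < end]
        return out


db = TinyTSDB()
for ts in (0, 10, 70, 3600):
    db.write(Point("cpu.usage", ts, float(ts)))
db.compact("cpu.usage", 0)                 # roll up the first hour
result = db.query("cpu.usage", 0, 7200)    # two buckets, then one raw point
```

The query result mixes two shapes on purpose: Bucket objects for the compacted hour and Point objects for the live one. A real system would normally hide that difference behind a common result type.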

Cost

Operation	Cost
Append a point	O(1) amortised
Range query	O(partitions + points read)
Storage for old data	O(buckets) after rollup

The trade-off: rollups make old reads fast and cheap, but you lose per-point detail. Once raw points are compacted you can see the shape of last month, not every single sample.

Watch out for

Out-of-order and late points. Network buffering means a point timestamped 10:59 can arrive at 11:02. If the 10:00 partition is already compacted, that late point has nowhere good to land — plan a grace window before rollup.
High cardinality. Each unique tag combination (host, region, endpoint…) is its own series. A label like user_id can turn one metric into millions of series, inflating memory and index size.
Wide range queries on raw data. Asking for raw points across a year reads enormous volume. Make sure the query planner drops to rollups for old windows automatically.
Forgetting retention. Append-only growth is unbounded. Without a retention policy that deletes or downsamples, disk fills and writes stall.
Averaging averages. A rollup's avg can't be re-averaged across buckets without weighting by count, or your aggregate quietly lies.
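
The averaging pitfall is easy to show with two buckets of very different sizes. The sketch below is illustrative only: Rollup and merge are hypothetical names, and the bucket values are invented for the example.

```python
from typing import NamedTuple


class Rollup(NamedTuple):
    vmin: float
    vmax: float
    avg: float
    count: int


def merge(rollups):
    """Merge rollup buckets into one, weighting each avg by its count."""
    rollups = [r for r in rollups if r.count > 0]
    if not rollups:
        raise ValueError("nothing to merge")
    total = sum(r.count for r in rollups)
    weighted_sum = sum(r.avg * r.count for r in rollups)
    return Rollup(
        vmin=min(r.vmin for r in rollups),
        vmax=max(r.vmax for r in rollups),
        avg=weighted_sum / total,
        count=total,
    )


quiet = Rollup(vmin=10.0, vmax=10.0, avg=10.0, count=1)   # one sample
busy = Rollup(vmin=80.0, vmax=100.0, avg=90.0, count=9)   # nine samples

naive = (quiet.avg + busy.avg) / 2   # 50.0: treats both buckets as equal
merged = merge([quiet, busy])        # avg is 82.0: (10*1 + 90*9) / 10
```

Min, max and count merge safely on their own. Only avg needs the count carried alongside it, which is why rollups store all four.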

Worked example

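You collect cpu.usage per host every 10 seconds across 200 hosts. Raw, that's 8,640 points per host per day, or about 1.7M points per day across the fleet. You keep 7 days raw for incident debugging, then roll up to 1-minute buckets (min/max/avg/count) for 90 days, then 1-hour buckets for two years. A dashboard asking for "peak CPU per host on a day three weeks ago" reads minute rollups, 1,440 buckets per host instead of 8,640 raw points, and renders quickly. Yesterday's data is still inside the 7-day raw window, so a query for it reads raw points. And because the rollups store only min/max/avg/count, they can answer peak and average questions but not a true p95.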
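
The arithmetic behind this example can be checked in a few lines of Python. Two assumptions are made here that the text does not spell out: the tiers run one after another (7 days raw, then 90 days of minute buckets, then two years of hour buckets), and two years is taken as 730 days.

```python
SECONDS_PER_DAY = 24 * 60 * 60

hosts = 200
interval_s = 10

raw_per_host_per_day = SECONDS_PER_DAY // interval_s      # 8,640
raw_per_day = raw_per_host_per_day * hosts                # 1,728,000

minute_buckets_per_host_per_day = SECONDS_PER_DAY // 60   # 1,440
hour_buckets_per_host_per_day = SECONDS_PER_DAY // 3600   # 24

# Steady-state item counts held in each retention tier, fleet-wide.
raw_tier = raw_per_day * 7                                    # 12,096,000
minute_tier = minute_buckets_per_host_per_day * hosts * 90    # 25,920,000
hour_tier = hour_buckets_per_host_per_day * hosts * 730       # 3,504,000
tiered_total = raw_tier + minute_tier + hour_tier             # 41,520,000

# Keeping every raw point for the same 827 days instead.
all_raw_total = raw_per_day * (7 + 90 + 730)                  # 1,429,056,000
```

Each bucket holds four numbers rather than one, so the saving in bytes is smaller than the saving in item count, but the tiered layout still stores roughly 34 times fewer items than keeping everything raw.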

Check yourself

A query asks for raw, per-second data from 18 months ago. What likely happens?