Question
Design the downsampling and rollup engine for a metrics time-series database. Raw points arrive at 1s resolution for ~80M active series. Dashboards query 'last 5 minutes' at raw resolution, but 'last 90 days' must return in <500ms, which cannot be met by scanning raw points. You must continuously produce pre-aggregated rollups (1m, 1h, 1d) carrying min/max/sum/count/last per window, so that any aggregation that does not require percentiles can be answered from them; route each query to the coarsest resolution that satisfies it; and expire raw data after 7 days while keeping 1h/1d rollups for 2 years. How do you compute rollups incrementally without rescanning raw data, keep them correct when points arrive late, and pick the resolution at query time?
Clarify scale and constraints first. Propose a clean component breakdown, then go deep on the hard parts — data model, bottlenecks, consistency, failure modes — and name the trade-offs you are making.
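To make the requirements concrete, here is a minimal Python sketch of what the prompt implies, not a reference design. It shows why the five fields are enough for incremental rollups (two windows merge without rescanning, and a late point merges the same way), how many points per series each resolution returns for a 90-day range, and one possible routing rule. The names `Rollup`, `points_in_range`, and `pick_resolution` are hypothetical, as is the extra `last_ts` field, which is an assumption added so that `last` can be merged correctly when points arrive out of order. Retention limits are deliberately left out of the routing rule.

```python
from dataclasses import dataclass

# Window sizes in seconds for the resolutions named in the question.
RESOLUTIONS = {"raw": 1, "1m": 60, "1h": 3600, "1d": 86400}


@dataclass(frozen=True)
class Rollup:
    """Aggregate for one window of one series (hypothetical record layout)."""
    min_v: float
    max_v: float
    sum_v: float
    count: int
    last_ts: int   # assumption: timestamp of the newest point, kept to merge `last`
    last: float

    @staticmethod
    def of(ts: int, value: float) -> "Rollup":
        """Build a rollup from a single raw point."""
        return Rollup(value, value, value, 1, ts, value)

    def merge(self, other: "Rollup") -> "Rollup":
        """Combine two aggregates without touching the underlying raw points."""
        newest = self if self.last_ts >= other.last_ts else other
        return Rollup(
            min(self.min_v, other.min_v),
            max(self.max_v, other.max_v),
            self.sum_v + other.sum_v,
            self.count + other.count,
            newest.last_ts,
            newest.last,
        )


def points_in_range(range_s: int, resolution: str) -> int:
    """Number of stored points per series covering `range_s` seconds."""
    return range_s // RESOLUTIONS[resolution]


def pick_resolution(query_step_s: int) -> str:
    """Coarsest resolution whose window is no larger than the query step.

    Assumption: retention is ignored here; a real router would also check
    that the chosen resolution still holds data for the whole time range.
    """
    usable = [name for name, w in RESOLUTIONS.items() if w <= query_step_s]
    return max(usable, key=lambda name: RESOLUTIONS[name])


if __name__ == "__main__":
    window = Rollup.of(0, 5.0).merge(Rollup.of(1, 2.0)).merge(Rollup.of(2, 9.0))
    late = window.merge(Rollup.of(1, 100.0))   # late point with an older timestamp
    print(window)
    print(late)
    ninety_days = 90 * 86400
    for name in RESOLUTIONS:
        print(name, points_in_range(ninety_days, name))
    print(pick_resolution(300))
```

Because `merge` only needs the stored fields, a late point can be folded into an already-written window, and coarser windows can be built from finer ones. Averages come from `sum_v / count`; percentiles cannot be recovered from these fields, which matches the prompt's restriction to aggregations that do not need them.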