Analytics Architecture

Separating operational read/writes (OLTP) from heavy analytical queries (OLAP).

The idea

Operational databases (like Postgres or MySQL) are optimized for fast, single-row lookups (e.g. "Get user 123's cart"). Analytics requires scanning millions of rows to compute aggregates (e.g. "Total revenue per day this year"). Running analytics on the operational database will crash it. We must stream data out into a dedicated Data Warehouse (OLAP).

Step 1: Production Database handles fast, tiny transactions.

How it works

# Typical Analytics Pipeline (ETL / ELT)

# 1. Extract (CDC or Batch)
events = database.read_change_stream()

# 2. Transform (Clean, aggregate)
clean_events = [e for e in events if e.is_valid()]

# 3. Load (Insert into Data Warehouse)
data_warehouse.bulk_insert(clean_events)

# 4. Analyst queries the Warehouse (NOT production)
# SELECT COUNT(*), date FROM sales GROUP BY date;

Cost

Data Warehouses use Columnar Storage, meaning they store all values of a single column contiguously. This makes aggregating a single column lightning fast, but updating individual rows very slow.

Watch out for