Database Query Optimizer

How the database decides the fastest way to fetch your data.

The idea

SQL is a declarative language. You tell the database what you want, but not how to get it. When you submit a query like SELECT * FROM users JOIN orders, the Database's Query Optimizer calculates dozens of possible Execution Plans. Should it scan the users table first? Should it use an index? Should it do a Hash Join or a Nested Loop? It estimates the CPU and Disk "cost" of each plan using statistics it keeps about the tables, and then executes the cheapest one.

Step 1: A query arrives. We want Alice's orders.

How it works (Cost-Based Optimization)

Modern databases use a Cost-Based Optimizer (CBO). It looks at statistics like "How many rows are in the orders table?" and "How many unique names are in the users table?". It calculates an arbitrary cost number for different strategies. For example, a Sequential Scan has a high cost if the table is huge, but a very low cost if the table is tiny.

-- Asking Postgres to show us the Optimizer's plan (EXPLAIN)
EXPLAIN SELECT * FROM users u 
JOIN orders o ON u.id = o.user_id 
WHERE u.name = 'Alice';

-- Output looks like:
-- Hash Join  (cost=12.50..45.20 rows=10)
--   -> Seq Scan on orders  (cost=0.00..20.00 rows=1000)
--   -> Hash  (cost=8.30..8.30 rows=1)
--        -> Index Scan using users_name_idx (cost=0.00..8.30)

Cost

The Optimizer itself takes CPU time to run! If a query has 15 JOINs, there are billions of possible execution plans. Evaluating them all would take longer than running the query. Optimizers use heuristics and limits to stop searching once they find a "good enough" plan. You can force a specific plan by restructuring the query or using hints, but it's usually better to let the Optimizer do its job.

Watch out for

Stale Statistics: The Optimizer relies on ANALYZE statistics. If you bulk-insert 10 million rows and don't update statistics, the Optimizer still thinks the table is empty! It might choose a terrible plan (like a Nested Loop) that takes 3 hours, when a Hash Join would take 3 seconds. Always run ANALYZE after bulk data changes.