SQL Data Access

Extracting exactly what you need, nothing more, nothing less.

The idea

Modern applications interact with relational databases using SQL (Structured Query Language). A poorly optimized SQL query can bring down an entire web server by consuming too much memory or locking critical tables. Efficient data access means understanding how the database engine executes your queries.

Step 1: SELECT * FROM users. The database pulls 50 columns for 100,000 rows.

How it works (The N+1 Query Problem)

A common mistake when using ORMs (Object Relational Mappers) is fetching a list of items, and then looping through them to fetch related data, resulting in 1 query + N additional queries.

# VULNERABLE: The N+1 Problem (101 Database Queries)
users = User.objects.all() # 1 Query
for user in users:
    print(user.profile.bio) # 100 Queries! (1 for each user)

# SECURE: Eager Loading (1 Database Query)
# Uses a SQL JOIN under the hood
users = User.objects.select_related('profile').all() 
for user in users:
    print(user.profile.bio) # 0 Queries! Data is already in memory.

Watch out for

`SELECT *`: Never use `SELECT *` in production code. It transfers massive amounts of unused data over the network and wastes database memory. Select only the columns you need.
Missing Indexes on JOINs: If you `JOIN` two tables on a column that isn't indexed, the database performs a nested loop join, which is `O(N*M)` complexity and will time out on large datasets.
Long-Running Transactions: Opening a transaction, updating a row, and then making a slow HTTP request before committing will hold an exclusive lock on that row, blocking all other users.

Worked example

An API endpoint `GET /api/orders` takes 10 seconds to load. You look at the logs and see it executing 5,000 separate SQL queries. The ORM is fetching the orders, and then running a separate `SELECT` query for each order's shipping address. By changing the code to `orders = db.query(Order).joinedload(Address)`, it drops to a single query taking 40ms.

Check yourself

Why is it dangerous to run `SELECT count(*) FROM massive_table` in Postgres?

Correct! Because different transactions see different snapshots of the database, Postgres cannot keep a single global row count. It must scan the table.

Not quite — counting rows acquires a shared read lock, not an exclusive write lock. It doesn't block INSERTs, but it's incredibly slow.