Missing Rate Limiting (API/Business Logic)

When resource consumption bankrupts your company.

The idea

Rate limiting on authentication endpoints prevents brute-force attacks; API rate limiting protects your business logic and infrastructure from abuse. If an API endpoint performs a heavy database query, sends an SMS, or calls a paid third-party service (like OpenAI), an attacker can hammer that endpoint thousands of times a second. The result is either Denial of Service (DoS) from overloaded servers or serious financial loss from runaway cloud and third-party API bills.

Step 1: Normal Usage. A user requests data once, and the API returns it.

How it works (Token Bucket / API Gateway)

API rate limits are usually enforced at the load balancer or API gateway level (e.g., AWS API Gateway, Nginx, Cloudflare) rather than inside the application code. This way, abusive traffic is dropped before it reaches your web servers.

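# Nginx Rate Limiting Configuration Example

# Define a 10 MB shared memory zone that tracks clients by IP address
# and allows each one 10 requests per second.
# This directive belongs in the http context.
limit_req_zone $binary_remote_addr zone=mylimit:10m rate=10r/s;

server {
    location /api/heavy-export {
        # Apply the limit. burst=5 lets a client exceed the rate by up to
        # 5 requests; nodelay serves those immediately instead of queueing
        # them. Anything beyond the burst is rejected.
        limit_req zone=mylimit burst=5 nodelay;

        # Reject with 429 Too Many Requests instead of the default 503.
        limit_req_status 429;

        # backend_servers is an upstream group defined elsewhere.
        proxy_pass http://backend_servers;
    }
}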
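
The section title mentions the token bucket, so here is a minimal Python sketch of that algorithm for readers who want to see the mechanics. It is an illustration, not how any particular gateway implements it (Nginx's `limit_req` documents its method as a "leaky bucket"). The names `TokenBucket`, `allow_request`, and `buckets` are hypothetical, and the in-memory dictionary assumes a single process.

```python
import time


class TokenBucket:
    """Holds up to `capacity` tokens, refilled at `rate` tokens per second."""

    def __init__(self, rate, capacity, clock=time.monotonic):
        self.rate = rate
        self.capacity = capacity
        self.clock = clock
        self.tokens = float(capacity)  # start full so an initial burst is allowed
        self.last = clock()

    def allow(self, cost=1.0):
        """Spend `cost` tokens and return True if available, else return False."""
        now = self.clock()
        # Refill in proportion to elapsed time, never beyond capacity.
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False


# Hypothetical per-client registry keyed by IP address (single process only).
buckets = {}


def allow_request(client_ip, rate=10.0, capacity=5.0):
    """Return True if this client may proceed, creating its bucket on first use."""
    bucket = buckets.get(client_ip)
    if bucket is None:
        bucket = buckets[client_ip] = TokenBucket(rate, capacity)
    return bucket.allow()
```

The `capacity` bounds how large a burst a client can send at once, while `rate` bounds the sustained throughput. A production deployment behind several servers would typically need shared state (for example in a cache) rather than a local dictionary.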

Watch out for

Worked example

A startup builds an AI summary feature using a paid LLM API. They launch it with a button that triggers a POST request to `/api/summarize`. They forget to rate limit it. A troll discovers the endpoint, writes a bash script loop, and sends 500 requests a second. Over the weekend, the troll racks up a $45,000 API bill on the startup's account, bankrupting them.
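
The numbers in this example can be sanity-checked with a few lines of arithmetic. The sketch below assumes the attack ran at a sustained 500 requests per second for a full 48-hour weekend, and that all traffic came from a single IP that a 10 requests-per-second limit would have capped (burst ignored). Both are assumptions made for illustration, not details from the scenario.

```python
REQUESTS_PER_SECOND = 500        # attack rate from the example
WEEKEND_SECONDS = 48 * 60 * 60   # assumption: 48 hours of sustained traffic
TOTAL_BILL_USD = 45_000          # bill from the example

total_requests = REQUESTS_PER_SECOND * WEEKEND_SECONDS
cost_per_request = TOTAL_BILL_USD / total_requests

# Assumption: the same client, capped at 10 requests per second.
LIMITED_RPS = 10
capped_requests = LIMITED_RPS * WEEKEND_SECONDS
capped_bill = capped_requests * cost_per_request

print(f"unthrottled requests: {total_requests:,}")
print(f"implied cost per request: ${cost_per_request:.5f}")
print(f"bill if capped at {LIMITED_RPS} r/s: ${capped_bill:,.2f}")
```

Under these assumptions the attack amounts to 86.4 million requests, and the per-IP limit would have cut the bill by a factor of 50, to about $900. That is still real money, which suggests that paid endpoints may also deserve a much stricter per-user quota.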

Check yourself

When an API rate limit is exceeded, what is the standard HTTP status code that the server should return?

Answer: 429 (Too Many Requests) is the standard code. It is often paired with a `Retry-After` header telling the client how many seconds to wait before trying again.

Why not 503 (Service Unavailable)? While a server under heavy load might return a 503, rate limiting is an intentional block of a specific client, not a server failure.
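
To make the 429 and `Retry-After` pairing concrete, here is a small Python sketch of a fixed-window limiter that returns a status code and headers. The class name `FixedWindowLimiter` and its `check` method are hypothetical, the counts live in process memory, and the fixed window is chosen only because it makes the `Retry-After` value easy to compute.

```python
import math
import time


class FixedWindowLimiter:
    """Allow at most `limit` requests per client in each `window`-second window."""

    def __init__(self, limit, window, clock=time.time):
        self.limit = limit
        self.window = window
        self.clock = clock
        self.counts = {}  # client id -> (window start, request count)

    def check(self, client_id):
        """Return (status_code, headers) for a request from `client_id`."""
        now = self.clock()
        window_start = now - (now % self.window)
        start, count = self.counts.get(client_id, (window_start, 0))
        if start != window_start:
            start, count = window_start, 0  # a new window has begun
        if count >= self.limit:
            # Tell the client how many whole seconds remain in this window.
            retry_after = max(1, math.ceil(start + self.window - now))
            return 429, {"Retry-After": str(retry_after)}
        self.counts[client_id] = (start, count + 1)
        return 200, {}
```

A well-behaved client reads `Retry-After` and pauses for that many seconds instead of retrying immediately, which keeps it from burning through the next window as well.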