Question
You're building the rate limiter for a multi-tenant API gateway. Each customer has a plan limit (e.g. 1000 req/min) enforced across a fleet of 30 stateless gateway nodes behind a load balancer. Compare the token-bucket, sliding-window-log, and sliding-window-counter approaches for this case, then design the distributed enforcement: where the counter state lives, how you keep the added per-request latency under 1 ms, and how you handle burst allowance fairly without letting one customer get 30x their limit by hitting all 30 nodes. Specify the consistency model you accept.
Clarify scale and constraints first. Propose a clean component breakdown, then go deep on the hard parts — data model, bottlenecks, consistency, failure modes — and name the trade-offs you are making.