Question
Design refresh-token rotation with theft detection for a 60M-user SPA + mobile product. Access tokens are short-lived (10 min); refresh tokens are long-lived and rotate on every use (each refresh returns a new refresh token and invalidates the old one). The security goal: if a refresh token is stolen and the attacker uses it, you must DETECT the theft (because two parties now hold tokens in the same rotation family) and shut down the whole family. Discuss the token-family data model, how reuse detection works, what happens on a legitimate race (a flaky network causing a double-refresh), and the storage/scale implications of tracking rotation chains.
Clarify scale and constraints first. Propose a clean component breakdown, then go deep on the hard parts — data model, bottlenecks, consistency, failure modes — and name the trade-offs you are making.
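To pin down the mechanism the question describes, here is a minimal in-memory sketch of rotate-on-use with family-wide revocation when a token is reused. Every name in it (`RefreshStore`, `RefreshRecord`, `ReuseDetected`, `family_id`) is hypothetical and chosen for illustration only. It deliberately ignores the legitimate-race case, persistence, expiry, and scale, which are the parts the question asks you to design.

```python
import secrets
from dataclasses import dataclass
from typing import Optional


@dataclass
class RefreshRecord:
    family_id: str      # shared by every token in one rotation chain
    used: bool = False  # set once the token has been exchanged


class ReuseDetected(Exception):
    """An already-rotated token was presented again."""


class RefreshStore:
    """In-memory stand-in for a real token store (assumption, not a design)."""

    def __init__(self):
        self.tokens = {}               # token string -> RefreshRecord
        self.revoked_families = set()  # family ids that have been shut down

    def issue(self, family_id: Optional[str] = None) -> str:
        # A login starts a new family; a rotation stays in the existing one.
        family_id = family_id or secrets.token_hex(8)
        token = secrets.token_urlsafe(32)
        self.tokens[token] = RefreshRecord(family_id)
        return token

    def rotate(self, token: str) -> str:
        record = self.tokens.get(token)
        if record is None:
            raise KeyError("unknown refresh token")
        if record.family_id in self.revoked_families:
            raise PermissionError("token family revoked")
        if record.used:
            # Two parties hold tokens from the same family: revoke all of it.
            self.revoked_families.add(record.family_id)
            raise ReuseDetected(record.family_id)
        record.used = True
        return self.issue(record.family_id)
```

In this sketch, calling `rotate` on a fresh token returns its successor; presenting the old token a second time raises `ReuseDetected` and revokes the family, after which even the newest token in that family is rejected with `PermissionError`. Note that a flaky-network double-refresh would trip the same path here, which is exactly the false positive the question asks you to reason about.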