Question
Design the feature-store backbone for a fraud-scoring system that needs graph-based features — e.g. 'how many distinct accounts share this device/IP/payment instrument in the last 24h' and 'is this account 2 hops from a known-fraud ring'. These linked-entity features must be available to a real-time scorer within a 60ms budget at 5,000 scores/sec, kept reasonably fresh as new links form, and reproducible offline for training. The entity graph has hundreds of millions of nodes and grows continuously; some queries (multi-hop) are expensive and must not blow the latency budget on the hot path.
Clarify scale and constraints first. Propose a clean component breakdown, then go deep on the hard parts — data model, bottlenecks, consistency, failure modes — and name the trade-offs you are making.
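
For concreteness, here is one possible reading of the two example features, written as a naive in-memory reference rather than a proposed design. It is only meant to pin down the semantics that the online and offline paths would both have to reproduce. Everything in it is an assumption made for illustration: the function names `distinct_accounts_sharing` and `within_hops`, the `(account_id, entity_id, timestamp)` link tuple, the bipartite account/entity adjacency map, and the choice to count one edge as one hop.

```python
from collections import deque

DAY_SECONDS = 24 * 60 * 60  # the 24h window from the question


def distinct_accounts_sharing(links, entity_id, now, window=DAY_SECONDS):
    """Count distinct accounts linked to entity_id within [now - window, now].

    links: iterable of (account_id, entity_id, timestamp_seconds) tuples
    (hypothetical shape; the question does not fix a schema).
    """
    return len({
        acct
        for acct, ent, ts in links
        if ent == entity_id and now - window <= ts <= now
    })


def within_hops(adjacency, start, fraud_accounts, max_hops=2):
    """Breadth-first search: is any known-fraud account within max_hops edges?

    adjacency: dict mapping a node to its neighbours (assumed bipartite
    account <-> device/IP/instrument graph, one edge = one hop).
    The start node itself is not counted as a match.
    """
    seen = {start}
    frontier = deque([(start, 0)])
    while frontier:
        node, depth = frontier.popleft()
        if node != start and node in fraud_accounts:
            return True
        if depth == max_hops:
            continue  # do not expand past the hop limit
        for nxt in adjacency.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                frontier.append((nxt, depth + 1))
    return False
```

Evaluated against a snapshot of links as of the scoring timestamp, the same two functions give the point-in-time value a training set would need. The cost of the unbounded fan-out in `within_hops` is what the question's warning about multi-hop queries on the hot path refers to.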