System designHardsd-g288

Subject Graph db storageLevel Senior–Staff~50 minCommon in Storage & CDN · Algorithms & data structures interviewsIndustries Technology, Software development

Question

Design the on-disk storage layer of a native graph database engine optimized for traversal, not the partitioning across machines. A single node holds ~2B vertices and ~40B directed edges with properties; the dominant workload is multi-hop traversals ('expand all out-edges of v, then their out-edges') that must touch millions of edges with as few random I/Os as possible, alongside transactional inserts/deletes of vertices and edges. Property-graph semantics: typed edges, properties on both vertices and edges, and a handful of supernodes with tens of millions of edges. How do you physically lay out vertices, edges, and adjacency so a hop is close to a sequential read, while still supporting fast point deletes?

What a strong answer looks like

Clarify scale and constraints first. Propose a clean component breakdown, then go deep on the hard parts — data model, bottlenecks, consistency, failure modes — and name the trade-offs you are making.

Learn the concepts

Narrate your design

Loading whiteboard…

Run or narrate your approach, then ask the coach.