Code Room
System design | Medium | sd-g429
Subject: Lakehouse | Level: Mid–Senior | ~35 min | Common in: Distributed systems interviews | Industries: Technology, Software development

Question

A lakehouse table (Iceberg or Delta on S3) ingests data in near real time through tiny streaming micro-batch commits every few seconds, producing millions of small files and tens of thousands of metadata/snapshot entries per day. Trino queries that used to take seconds now take minutes, query planning alone is slow, S3 LIST/GET costs have ballooned, and writers occasionally hit commit conflicts. Design a table maintenance and layout strategy that restores fast queries without sacrificing the few-second ingest freshness.

What a strong answer looks like

Clarify scale and constraints first. Propose a clean component breakdown, then go deep on the hard parts — data model, bottlenecks, consistency, failure modes — and name the trade-offs you are making.
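
One way to start clarifying scale is a back-of-envelope estimate of how many commits and files the ingest pattern produces per day. The sketch below is illustrative only. The 5-second commit interval, 100 files per commit, 1 MB average file size, and 256 MB compaction target are all assumed numbers, not given in the question, and `daily_table_churn` is a hypothetical helper name.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def daily_table_churn(commit_interval_s, files_per_commit, avg_file_mb, target_file_mb):
    """Estimate daily snapshot and file counts for a streaming-ingest table.

    All inputs are assumptions to be confirmed with the interviewer.
    """
    # Each micro-batch commit creates one new snapshot.
    snapshots = SECONDS_PER_DAY // commit_interval_s
    # Each commit writes some number of small data files.
    small_files = snapshots * files_per_commit
    total_mb = small_files * avg_file_mb
    # Ceiling division: number of files if the day's data were rewritten
    # into files of the target size.
    compacted_files = -(-total_mb // target_file_mb)
    return {
        "snapshots_per_day": snapshots,
        "small_files_per_day": small_files,
        "total_mb_per_day": total_mb,
        "compacted_files_per_day": compacted_files,
    }


if __name__ == "__main__":
    # Assumed inputs: 5 s commits, 100 files per commit, 1 MB files, 256 MB target.
    estimate = daily_table_churn(5, 100, 1, 256)
    for name, value in estimate.items():
        print(f"{name}: {value:,}")
```

Under these assumed inputs the estimate lands in the range the question describes (tens of thousands of snapshots and over a million small files per day), and it shows how much the file count could shrink if the same data were compacted to a larger target size.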
