Code Room
System design | Medium | sd-g194
Subject: Schema evolution
Level: Mid–Senior (~40 min)
Common in: Databases & SQL interviews
Industries: Technology, Software development

Question

A data lake stores 5 years of event data as Parquet in a lakehouse table queried by Spark and Trino. Product needs ongoing schema changes: add columns, rename a column, change a column's type (int→bigint, string→struct), and reorder columns — without rewriting all 5 years of files and without breaking old queries or jobs that read older partitions. Files written under old schemas must still be readable alongside new ones. Design how the table handles schema evolution so reads stay correct across schema versions.

What a strong answer looks like

Clarify scale and constraints first. Propose a clean component breakdown, then go deep on the hard parts — data model, bottlenecks, consistency, failure modes — and name the trade-offs you are making.
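
As one illustration of the "data model" deep dive, here is a minimal sketch of resolving columns by a stable field ID instead of by name or position, which is the approach table formats such as Apache Iceberg take. Everything in it is an assumption made for illustration: the `Field` class, the `PROMOTIONS` set, the `read_row` helper, and the sample column names are hypothetical and do not correspond to any engine's API. It shows how one old file can be read under a newer schema after a rename, an added column, a reorder, and an int→bigint widening, with no file rewrite.

```python
from dataclasses import dataclass
from typing import Any, Dict, List


@dataclass(frozen=True)
class Field:
    """Hypothetical column descriptor: the ID is stable, the name is not."""
    field_id: int
    name: str
    type: str  # e.g. "int", "bigint", "string", "struct"


# Assumed set of lossless promotions for this sketch only.
PROMOTIONS = {("int", "bigint")}


def read_row(file_schema: List[Field], row: Dict[str, Any],
             table_schema: List[Field]) -> Dict[str, Any]:
    """Project one row from an old file onto the current table schema."""
    by_id = {f.field_id: f for f in file_schema}
    out: Dict[str, Any] = {}
    for target in table_schema:  # output follows the current column order
        src = by_id.get(target.field_id)
        if src is None:
            # Column was added after this file was written: read as null.
            out[target.name] = None
        elif src.type == target.type or (src.type, target.type) in PROMOTIONS:
            # Same ID, possibly a new name or a wider type: reuse the value.
            out[target.name] = row[src.name]
        else:
            # Anything else (e.g. string -> struct) is rejected here.
            raise TypeError(
                f"unsupported change for field {target.field_id}: "
                f"{src.type} -> {target.type}"
            )
    return out


# File written under the original schema.
old_schema = [Field(1, "user_id", "int"), Field(2, "event", "string")]
old_row = {"user_id": 7, "event": "click"}

# Current schema: "event" renamed, "country" added, columns reordered,
# "user_id" widened from int to bigint.
current_schema = [
    Field(2, "event_name", "string"),
    Field(3, "country", "string"),
    Field(1, "user_id", "bigint"),
]

resolved = read_row(old_schema, old_row, current_schema)
print(resolved)
```

In this sketch a string→struct change raises an error instead of being converted in place. One option a candidate might discuss is modelling it as a new struct column with a new ID, backfilled lazily or populated only for new partitions, and naming the trade-off that old queries then see nulls for older files.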
