Question
A large data platform has ~12k tables, ~3k dbt/Spark jobs, and dozens of BI dashboards. When an upstream column is renamed or a source feed breaks, nobody can tell which downstream tables and executive dashboards are affected until something breaks in a board meeting. Design a data-lineage system that captures table- and column-level lineage automatically, lets an engineer do impact analysis ('if I change this column, what breaks'), and powers freshness/incident alerting. Cover capture, storage, and how it stays accurate as pipelines change daily.
Clarify scale and constraints first. Propose a clean component breakdown, then go deep on the hard parts — data model, bottlenecks, consistency, failure modes — and name the trade-offs you are making.