Question
You're migrating a critical revenue-reporting pipeline from a legacy SQL/cron system to a new framework (e.g., Spark/dbt on a lakehouse). The new pipeline must produce numbers that exactly match the old pipeline's before you can cut over, and you must backfill 3 years of history into the new system. Finance will sign off only after you prove that the two systems produce identical results. The legacy system is still the source of truth and still runs daily. Design the migration: how you backfill history, how you validate equivalence, and how you cut over with zero reporting gap and a rollback path.
Clarify scale and constraints first. Propose a clean component breakdown, then go deep on the hard parts — data model, bottlenecks, consistency, failure modes — and name the trade-offs you are making.