Question
An order service writes to its own Postgres database and must also reliably publish an 'order changed' event to Kafka so that the analytics pipeline and a downstream fulfillment service stay in sync. Today it performs a dual write: it commits the DB row, then publishes to Kafka. Under crashes or timeouts the two diverge: orders exist with no event (lost downstream), or events fire for transactions that rolled back (phantom orders downstream). Design a reliable, ordered publish path that survives crashes, provides at-least-once delivery, and offers a path to no duplicates downstream.
Clarify scale and constraints first. Propose a clean component breakdown, then go deep on the hard parts — data model, bottlenecks, consistency, failure modes — and name the trade-offs you are making.
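To make the problem statement concrete, the two divergence modes named in the question can be reproduced with a small in-memory sketch. Everything here is hypothetical and for illustration only: `FakeDb` and `FakeBroker` stand in for Postgres and Kafka, `Crash` stands in for a crash, timeout, or aborted transaction, and the second scenario assumes the publish happens before the transaction outcome is known. It shows the failure, not a solution.

```python
from dataclasses import dataclass, field


class Crash(Exception):
    """Stands in for a process crash, a timeout, or an aborted transaction."""


@dataclass
class FakeDb:
    # Hypothetical stand-in for Postgres: holds committed rows only.
    rows: dict = field(default_factory=dict)  # order_id -> status

    def commit(self, order_id: str, status: str) -> None:
        self.rows[order_id] = status


@dataclass
class FakeBroker:
    # Hypothetical stand-in for Kafka: events kept in publish order.
    events: list = field(default_factory=list)  # (order_id, status)
    fail_next: bool = False

    def publish(self, order_id: str, status: str) -> None:
        if self.fail_next:
            self.fail_next = False
            raise Crash("publish failed")
        self.events.append((order_id, status))


def commit_then_publish(db, broker, order_id, status):
    db.commit(order_id, status)
    broker.publish(order_id, status)  # a failure here leaves a row with no event


def publish_then_abort(db, broker, order_id, status):
    # Assumed ordering: the event is sent before the transaction outcome is known.
    broker.publish(order_id, status)
    raise Crash("transaction rolled back")  # the row is never committed


def divergence(db, broker):
    """Return (lost, phantom): rows with no event, and events with no row."""
    published = {order_id for order_id, _ in broker.events}
    lost = sorted(set(db.rows) - published)
    phantom = sorted(published - set(db.rows))
    return lost, phantom


def run_demo():
    db, broker = FakeDb(), FakeBroker(fail_next=True)
    for step, order_id in ((commit_then_publish, "o-1"), (publish_then_abort, "o-2")):
        try:
            step(db, broker, order_id, "created")
        except Crash:
            pass  # the caller sees an error, but one side already took effect
    return divergence(db, broker)


if __name__ == "__main__":
    print(run_demo())
```

In this sketch `o-1` ends up as a committed row with no event and `o-2` as an event with no row, which are the "lost" and "phantom" cases the question asks the design to eliminate.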