Code Room
On-call · Medium · oc-g166
Subject: Message loss
Level: Mid–Senior
~30 min
Common in: Networking & APIs · Distributed systems interviews
Industries: Technology, Software development

Question

A billing pipeline consumes a Pub/Sub subscription and writes invoice line-items to Postgres. Finance reports two problems this week: (1) a few customers were *double-charged*, and (2) one batch of charges appears *missing*. Dashboards: the subscription's `ack_message_count` looks healthy, `oldest_unacked_message_age` is low. Recent context: the consumer was changed three days ago to ack the Pub/Sub message *first* and then write to Postgres (to 'reduce ack-deadline expirations'), and the Postgres write has no uniqueness constraint on `(invoice_id, line_item_id)`. There was a brief Postgres failover two days ago. Triage and explain both the duplicates and the loss.

What a strong answer looks like

Stop the bleeding first (mitigate), then form hypotheses from real signals. Separate root cause from symptom, communicate status as you go, and close with what prevents a repeat.
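
One plausible reading of this incident, sketched as a small simulation. Pub/Sub subscriptions are at-least-once by default, so a redelivered message written with a plain `INSERT` and no uniqueness constraint on `(invoice_id, line_item_id)` produces a duplicate charge. With ack-first ordering, a write that fails during the Postgres failover is never retried because the message is already acked, which would also explain why `ack_message_count` and `oldest_unacked_message_age` look healthy while a batch is missing. The code below is an illustration under stated assumptions, not the pipeline's real code: in-memory SQLite stands in for Postgres, `FakeMessage` is a hypothetical stub that models only `ack()` and `nack()`, and the `db_up` flag simulates the failover.

```python
import sqlite3


class FakeMessage:
    """Hypothetical stand-in for a Pub/Sub message; only ack()/nack() are modeled."""

    def __init__(self, invoice_id, line_item_id, amount_cents):
        self.invoice_id = invoice_id
        self.line_item_id = line_item_id
        self.amount_cents = amount_cents
        self.acked = False
        self.nacked = False

    def ack(self):
        self.acked = True

    def nack(self):
        self.nacked = True


PLAIN_INSERT = "INSERT INTO line_items VALUES (?, ?, ?)"
# Same upsert syntax in Postgres; requires the unique constraint to exist.
IDEMPOTENT_INSERT = PLAIN_INSERT + " ON CONFLICT (invoice_id, line_item_id) DO NOTHING"


def make_db(with_unique_key):
    """Create the table, optionally with the uniqueness constraint the question says is missing."""
    db = sqlite3.connect(":memory:")
    constraint = ", UNIQUE (invoice_id, line_item_id)" if with_unique_key else ""
    db.execute(
        "CREATE TABLE line_items "
        "(invoice_id TEXT, line_item_id TEXT, amount_cents INTEGER" + constraint + ")"
    )
    return db


def write(db, msg, sql, db_up):
    """Insert one line item; db_up=False simulates the database being unavailable."""
    if not db_up:
        raise sqlite3.OperationalError("simulated failover")
    db.execute(sql, (msg.invoice_id, msg.line_item_id, msg.amount_cents))
    db.commit()


def handle_ack_first(db, msg, db_up=True):
    """The ordering described in the question: ack, then write."""
    msg.ack()  # from here on, the broker has no reason to redeliver
    try:
        write(db, msg, PLAIN_INSERT, db_up)
    except sqlite3.OperationalError:
        pass  # the charge is silently lost


def handle_write_first(db, msg, db_up=True):
    """Candidate fix: idempotent write first, ack only after the commit succeeds."""
    try:
        write(db, msg, IDEMPOTENT_INSERT, db_up)
    except sqlite3.OperationalError:
        msg.nack()  # ask for redelivery instead of dropping the charge
        return
    msg.ack()


def count(db):
    return db.execute("SELECT COUNT(*) FROM line_items").fetchone()[0]


if __name__ == "__main__":
    buggy = make_db(with_unique_key=False)
    lost = FakeMessage("inv-1", "li-1", 500)
    handle_ack_first(buggy, lost, db_up=False)
    print("acked:", lost.acked, "rows:", count(buggy))  # acked but nothing written

    dup = FakeMessage("inv-2", "li-1", 700)
    handle_ack_first(buggy, dup)
    handle_ack_first(buggy, dup)  # simulated redelivery
    print("rows after redelivery:", count(buggy))  # two rows for one line item
```

Under these assumptions, the prevention step has two parts that work together: the unique key makes redelivery harmless, and ack-after-commit makes a failed write retryable. Existing duplicate rows would need to be cleaned up before the constraint can be added, and the lost batch would have to be replayed from an upstream source of truth, since acked messages are not redelivered unless the subscription retains them.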
