Question
A migration adds a new value to a Postgres native ENUM type (`order_status`) so the new code can write status `partially_refunded`. The migration runs fine and is fast. The new code rolls out and starts writing `partially_refunded`. But a separate, OLDER reporting service (deployed months ago, not part of this rollout) begins throwing `invalid input value for enum order_status: "partially_refunded"` on a subset of rows, and its nightly export fails. Dashboards: the primary app is healthy; only the reporting service errors, and only on rows written after the migration. The reporting service uses an ORM that loaded the enum's allowed values at process start and validates reads against that cached set. Triage, explain, and mitigate.
Stop the bleeding first (mitigate), then form hypotheses from real signals. Separate root cause from symptom, communicate status as you go, and close with what prevents a repeat.