Code Room
Category: On-call · Difficulty: Hard · ID: oc-g363
Subject: Secret rotation
Level: Senior–Staff
Duration: ~40 min
Common in: Security · Distributed systems interviews
Industries: Software development, Technology

Question

Responding to a suspected leak, at 14:00 you rotate a shared HMAC webhook-signing secret in the vault and **revoke the old value**. Within 90 seconds, webhook verification across 30+ consumer services starts failing: signatures computed with the *new* secret are rejected by consumers still holding the *old* one cached in memory (they only re-read the secret on restart, which may be hours away). Inbound partner webhooks are now being dropped. You need the leak contained AND traffic restored. Walk through how you triage, restore service, and run a rotation that doesn't cause this, without leaving the leaked secret valid longer than necessary.

What a strong answer looks like

Stop the bleeding first (mitigate), then form hypotheses from real signals. Separate root cause from symptom, communicate status as you go, and close with what prevents a repeat.
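As an added example (not part of the interview page), here is a small Python model of the two rotations the question contrasts: the abrupt rotate-and-revoke that breaks consumers still caching the old secret, and a phased rotation in which every signature is labelled with a key id (`kid`), consumers hold a ring of accepted keys, and the old key is revoked only after all consumers can verify with the new one. `KeyRing`, `phased_rotation`, `abrupt_rotation` and the `kid` header are constructed names for illustration, not any real vault or webhook API.

```python
"""Phased rotation of a shared HMAC webhook-signing secret (constructed example)."""
import hashlib
import hmac


class KeyRing:
    """Keys one party holds, indexed by key id. One key may be the active signing key."""

    def __init__(self):
        self._keys = {}          # kid -> secret bytes
        self.signing_kid = None  # kid used for new signatures (producer side only)

    def add(self, kid, secret, signing=False):
        """Trust a key for verification; optionally make it the signing key."""
        self._keys[kid] = secret
        if signing:
            self.signing_kid = kid

    def revoke(self, kid):
        """Drop a key so nothing signed with it is accepted any more."""
        self._keys.pop(kid, None)
        if self.signing_kid == kid:
            self.signing_kid = None

    def kids(self):
        return sorted(self._keys)

    def sign(self, body):
        """Return (kid, hex digest) for the body using the active signing key."""
        if self.signing_kid is None:
            raise RuntimeError("no active signing key")
        mac = hmac.new(self._keys[self.signing_kid], body, hashlib.sha256).hexdigest()
        return self.signing_kid, mac

    def verify(self, kid, mac, body):
        """Accept the body only if kid is still trusted and the MAC matches (constant-time)."""
        secret = self._keys.get(kid)
        if secret is None:
            return False
        expected = hmac.new(secret, body, hashlib.sha256).hexdigest()
        return hmac.compare_digest(expected, mac)


def deliver(producer, consumer, body):
    """Simulate one webhook delivery: the producer signs, the consumer verifies."""
    kid, mac = producer.sign(body)
    return consumer.verify(kid, mac, body)


def ready_for_cutover(consumers, kid):
    """True once every consumer has (re)loaded the new key into its ring."""
    return all(kid in c.kids() for c in consumers)


def abrupt_rotation(producer, stale_consumer, old_kid, new_kid, new_secret,
                    body=b'{"event":"ping"}'):
    """The incident in the question: rotate and revoke at once while a consumer still
    holds only the old key. Returns that consumer's verification result."""
    producer.add(new_kid, new_secret, signing=True)
    producer.revoke(old_kid)
    return deliver(producer, stale_consumer, body)


def phased_rotation(producer, consumers, old_kid, new_kid, new_secret,
                    body=b'{"event":"ping"}'):
    """Three-phase rotation; returns {phase: [per-consumer verification result]}."""
    results = {}
    # Phase 1 (stage): distribute the new key as verify-only. The producer still signs
    # with the old key, so consumers that have not reloaded yet keep working.
    for c in consumers:
        c.add(new_kid, new_secret)
    producer.add(new_kid, new_secret)
    results["stage"] = [deliver(producer, c, body) for c in consumers]
    # Phase 2 (cutover): only once every consumer holds the new key, sign with it.
    if not ready_for_cutover(consumers, new_kid):
        raise RuntimeError("consumers still missing the new key; do not cut over yet")
    producer.signing_kid = new_kid
    results["cutover"] = [deliver(producer, c, body) for c in consumers]
    # Phase 3 (revoke): remove the leaked key everywhere; anything signed with it
    # is rejected from this point on.
    for party in [producer, *consumers]:
        party.revoke(old_kid)
    results["revoke"] = [deliver(producer, c, body) for c in consumers]
    return results
```

The gate between phases 1 and 2 is what makes the difference: in the incident, the old value was revoked while consumers still had no way to learn the new one, so the window between cutover and revocation was zero. With a `kid` on each signature, a consumer can hold both keys and the producer can move first, so the old secret's lifetime is bounded by how fast consumers reload rather than by how long it takes to restart 30 services.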
