Code Room
System design | Hard | sd-g364
Subject: Model monitoring
Level: Senior–Staff (~45 min)
Common in: Reliability & on-call interviews
Industries: Technology, Software development

Question

Design a system that monitors answer quality in production for a customer-facing LLM agent (refunds, account changes). There is no clean numeric label for 'was this answer correct', and a wrong action can cost money or break trust. Volume is millions of conversations a day, so you cannot human-review them all. Walk through which quality signals you can actually measure online, how you would use automated graders/LLM-judges responsibly, and how you would catch a quality regression caused by a prompt or model change before it spreads.

What a strong answer looks like

Clarify scale and constraints first. Propose a clean component breakdown, then go deep on the hard parts — data model, bottlenecks, consistency, failure modes — and name the trade-offs you are making.
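One of the hard parts the question names, catching a regression from a prompt or model change before it spreads, can be sketched as a canary gate. This is a minimal illustration, not a prescribed answer. It assumes a small slice of traffic runs the new version, that a sample from both arms is scored by an automated grader, and that the grader's "flagged" rate is a usable proxy for quality. The names `ArmStats`, `regression_z`, and `should_halt_rollout`, along with the thresholds, are hypothetical.

```python
import math
from dataclasses import dataclass


@dataclass
class ArmStats:
    """Sampled, graded conversations for one prompt/model version."""
    graded: int   # conversations scored by the automated grader
    flagged: int  # conversations the grader marked as bad


def regression_z(control: ArmStats, canary: ArmStats) -> float:
    """Two-proportion z-score; positive means the canary is flagged more often."""
    p_control = control.flagged / control.graded
    p_canary = canary.flagged / canary.graded
    pooled = (control.flagged + canary.flagged) / (control.graded + canary.graded)
    std_err = math.sqrt(
        pooled * (1 - pooled) * (1 / control.graded + 1 / canary.graded)
    )
    if std_err == 0:
        return 0.0  # no flags in either arm, so there is nothing to compare
    return (p_canary - p_control) / std_err


def should_halt_rollout(
    control: ArmStats,
    canary: ArmStats,
    z_threshold: float = 2.33,  # assumed: roughly a one-sided 1% false-alarm rate
    min_graded: int = 500,      # assumed: minimum sample size before deciding
) -> bool:
    """Return True when the canary's flagged rate is significantly worse."""
    if control.graded < min_graded or canary.graded < min_graded:
        return False  # not enough evidence yet; keep the canary slice small
    return regression_z(control, canary) > z_threshold


control = ArmStats(graded=20_000, flagged=400)   # 2.0% flagged
bad_canary = ArmStats(graded=2_000, flagged=70)  # 3.5% flagged
ok_canary = ArmStats(graded=2_000, flagged=42)   # 2.1% flagged

print(should_halt_rollout(control, bad_canary))
print(should_halt_rollout(control, ok_canary))
```

A gate like this only catches what the grader can see, so a strong answer would also cover how the grader itself is calibrated against a human-reviewed sample and which hard signals (refund reversals, escalations, repeat contacts) back it up.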
