Code Room
System design · Hard · sd-g162
Subject: Telemetry pipeline · Level: Senior–Staff · ~45 min · Common in: Distributed systems interviews · Industries: Technology

Question

Design a general-purpose telemetry collection pipeline (OpenTelemetry-style) that receives metrics, logs, and traces from 100k agents totaling 4 GB/s in aggregate, normalizes and enriches them, and fans them out to multiple backends: a metrics TSDB, a log store, a trace store, and a cold archive in object storage. One backend (the trace store) is regularly slow or briefly down. The pipeline must not lose data when a backend is degraded, and a slow backend must not stall ingestion or delivery for the others. Design the stages, buffering, and per-backend isolation.
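A quick back-of-envelope pass over the stated numbers helps frame the buffering discussion. The sketch below uses only the 4 GB/s and 100k-agent figures from the question; the trace share of traffic and the outage length are assumptions chosen for illustration, not part of the prompt, and the helper names are hypothetical.

```python
# Figures taken from the question.
INGEST_BYTES_PER_S = 4 * 10**9   # 4 GB/s aggregate
AGENTS = 100_000

# ASSUMPTIONS for illustration only (not given in the question).
TRACE_SHARE = 0.25               # fraction of bytes that are traces
OUTAGE_SECONDS = 10 * 60         # how long the trace store is unavailable


def per_agent_bytes_per_s() -> float:
    """Average ingest rate per agent."""
    return INGEST_BYTES_PER_S / AGENTS


def outage_buffer_bytes(share: float, seconds: int) -> float:
    """Bytes that must be held for one backend while it is down."""
    return INGEST_BYTES_PER_S * share * seconds


def daily_bytes() -> int:
    """Raw volume per day, before compression or replication."""
    return INGEST_BYTES_PER_S * 86_400


if __name__ == "__main__":
    print(f"per agent:      {per_agent_bytes_per_s() / 1e3:.0f} KB/s")
    print(f"outage buffer:  {outage_buffer_bytes(TRACE_SHARE, OUTAGE_SECONDS) / 1e9:.0f} GB")
    print(f"daily volume:   {daily_bytes() / 1e12:.1f} TB")
```

Under these assumptions the average agent sends about 40 KB/s, a ten-minute trace-store outage needs roughly 600 GB of buffer, and the raw daily volume is about 345.6 TB. The buffer figure is far beyond what fits comfortably in collector memory, which is the usual argument for a durable, disk-backed buffer in front of each backend.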

What a strong answer looks like

Clarify scale and constraints first. Propose a clean component breakdown, then go deep on the hard parts — data model, bottlenecks, consistency, failure modes — and name the trade-offs you are making.
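One way to make the per-backend isolation requirement concrete is a lane per backend: each lane has its own bounded in-memory queue and its own overflow buffer, so a stalled backend only fills its own lane. The following is a minimal single-process sketch, not a reference design. `BackendLane`, `fan_out`, and the `send` callback are hypothetical names, and the `spill` list stands in for whatever durable buffer (local disk, a partitioned log) a real design would choose.

```python
from collections import deque
from dataclasses import dataclass, field
from typing import Callable, Iterable


@dataclass
class BackendLane:
    """Isolated delivery lane for one backend."""
    name: str
    capacity: int                                  # max records held in memory
    memory: deque = field(default_factory=deque)
    spill: list = field(default_factory=list)      # stand-in for a durable buffer

    def offer(self, record) -> None:
        """Accept a record without ever blocking the ingest path."""
        # Once anything has spilled, keep spilling so lane order is preserved.
        if self.spill or len(self.memory) >= self.capacity:
            self.spill.append(record)
        else:
            self.memory.append(record)

    def drain(self, send: Callable[[object], bool], max_items: int) -> int:
        """Deliver up to max_items; on the first failure, keep the record and stop."""
        sent = 0
        while sent < max_items and self.memory:
            if not send(self.memory[0]):
                break                              # backend degraded: retry later
            self.memory.popleft()
            sent += 1
            if self.spill:                         # refill memory from the overflow
                self.memory.append(self.spill.pop(0))
        return sent

    def pending(self) -> int:
        return len(self.memory) + len(self.spill)


def fan_out(record, lanes: Iterable[BackendLane]) -> None:
    """Copy one record into every lane; no lane can delay another."""
    for lane in lanes:
        lane.offer(record)
```

In use, the ingest path calls `fan_out` for each normalized record, and an independent worker per lane calls `drain` with that backend's send function. If the trace store's `send` keeps returning `False`, its lane's `pending()` grows while the other lanes continue to empty. The trade-off to name in an interview is that the overflow buffer is finite too, so the design still needs a stated policy (for example sampling traces or applying backpressure to agents) for outages that outlast it.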
