Question
Design a high-throughput gRPC ingestion service that accepts bidirectional streaming telemetry from 200k edge agents, each pushing batched metrics at up to 50k messages/sec aggregate. The service must apply backpressure when downstream storage is slow (rather than OOMing), keep p99 RPC latency bounded under load, fairly multiplex many streams over HTTP/2 connections without one heavy client starving others, and shed load gracefully during a downstream slowdown. Walk through the flow-control model, connection/stream handling, and the central trade-off.
Clarify scale and constraints first. Propose a clean component breakdown, then go deep on the hard parts — data model, bottlenecks, consistency, failure modes — and name the trade-offs you are making.