On-call · Hard · oc-g193

Subject: Memory leak · Level: Senior–Staff · ~40 min · Common in: Algorithms & data structures interviews · Industries: Software development

Question

A Java gRPC service (Netty transport) gets OOMKilled by Kubernetes every ~18 hours despite a stable, healthy JVM heap: heap-used oscillates around 2GB under an -Xmx4g and never trends up, full GCs are rare, but container RSS climbs steadily to the 8Gi limit and the pod dies with exit 137. The Netty 'usedDirectMemory' metric trends upward over the life of the pod. A streaming-response feature shipped two weeks ago. Triage and fix.

What a strong answer looks like

Stop the bleeding first (mitigate), then form hypotheses from real signals. Separate root cause from symptom, communicate status as you go, and close with what prevents a repeat.
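One way to ground "hypotheses from real signals" is to read heap and off-heap usage side by side from inside the JVM. The sketch below is illustrative only: the class name `HeapVsDirect` and its helper methods are hypothetical, and it uses only the JDK's standard management beans. It also assumes the direct memory in question is visible to the JDK's "direct" buffer pool; depending on how Netty is configured, its allocations may bypass these counters, in which case the Netty `usedDirectMemory` metric from the question remains the more direct signal.

```java
import java.lang.management.BufferPoolMXBean;
import java.lang.management.ManagementFactory;
import java.nio.ByteBuffer;

// Hypothetical helper: compares JVM heap usage with the JDK "direct" buffer pool.
public class HeapVsDirect {

    /** Bytes held by the JDK "direct" buffer pool, or -1 if the pool is not reported. */
    static long directBytesUsed() {
        for (BufferPoolMXBean pool : ManagementFactory.getPlatformMXBeans(BufferPoolMXBean.class)) {
            if ("direct".equals(pool.getName())) {
                return pool.getMemoryUsed();
            }
        }
        return -1L;
    }

    /** Bytes of heap currently in use, as reported by the memory MXBean. */
    static long heapBytesUsed() {
        return ManagementFactory.getMemoryMXBean().getHeapMemoryUsage().getUsed();
    }

    public static void main(String[] args) {
        long before = directBytesUsed();
        // Allocate off-heap memory: this grows the direct pool, not the heap.
        ByteBuffer buf = ByteBuffer.allocateDirect(16 * 1024 * 1024);
        long after = directBytesUsed();

        System.out.printf("heap used: %d MiB%n", heapBytesUsed() >> 20);
        System.out.printf("direct used: %d -> %d bytes (buffer capacity %d)%n",
                before, after, buf.capacity());
    }
}
```

A flat heap alongside a direct-pool figure that only grows would match the pattern in the question and point the investigation at off-heap buffers rather than at GC tuning.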

Learn the concepts

Diagram & narrate the incident

