Memory climbs toward a hard line. The moment it crosses, the kernel picks a victim and sends one quiet, fatal signal.
Every container runs under a hard memory limit — a Kubernetes pod limit, a cgroup ceiling, or just the physical RAM of the host. As your process allocates memory, its resident set (RSS) climbs toward that line. There is no graceful warning when it gets close.
When usage would cross the limit, the kernel's OOM killer chooses a victim by its oom_score and sends it SIGKILL. The process dies instantly — no cleanup, no stack trace — exiting with code 137 (that is 128 + 9, where 9 is SIGKILL). If something restarts it, it climbs and dies again: a CrashLoopBackOff.
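The 128 + 9 arithmetic can be checked with a small helper. This is a minimal sketch, assuming a POSIX system where signal 9 is SIGKILL; the function name `describe_exit_code` is made up for illustration.

```python
import signal


def describe_exit_code(code: int) -> str:
    """Explain a shell-style exit status; values above 128 mean 'killed by a signal'."""
    if code > 128:
        signum = code - 128
        try:
            name = signal.Signals(signum).name
        except ValueError:
            # Signal number not known on this platform.
            name = f"signal {signum}"
        return f"killed by {name}"
    return f"exited normally with status {code}"


print(describe_exit_code(137))  # on Linux: killed by SIGKILL
```

An exit status of 137 on its own only says "SIGKILL"; the OOMKilled reason in the pod's last state is what ties it to memory.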
A leak is just memory you allocate and never release. Here a request handler keeps appending to a module-level list, so RSS only grows — it never falls back down between requests.

    cache = []  # module-level, never cleared

    def handle(request):
        # Each request appends ~5 MB and keeps the reference,
        # so the garbage collector can never reclaim it.
        cache.append(load_batch(request))  # leak
        return summarize(cache[-1])

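One way to stop this kind of growth is to drop the reference after each request, or to cap how many batches are retained. The sketch below assumes either option suits the handler; `load_batch` and `summarize` are hypothetical stand-ins because the originals are not shown, and `MAX_CACHED_BATCHES` is an assumed bound.

```python
from collections import deque

MAX_CACHED_BATCHES = 3  # assumed bound; tune to the memory budget


def load_batch(request):
    # Hypothetical stand-in: the real loader is not shown in the text.
    return [request] * 1000


def summarize(batch):
    # Hypothetical stand-in for the real summary step.
    return len(batch)


# Option 1: keep no reference, so the batch can be freed after each request.
def handle(request):
    batch = load_batch(request)
    return summarize(batch)


# Option 2: if recent batches are needed, cap how many are retained.
recent = deque(maxlen=MAX_CACHED_BATCHES)


def handle_with_bounded_cache(request):
    recent.append(load_batch(request))  # the oldest entry is dropped when full
    return summarize(recent[-1])


for i in range(100):
    handle_with_bounded_cache(i)

print(len(recent))  # 3
```

With either option, retained memory no longer depends on how many requests have been served.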
When the cgroup's memory limit is exceeded, the kernel scans the candidate tasks and scores each one: a higher oom_score means a larger, less protected process and a more likely victim, and oom_score_adj shifts the score up or down. It then kills the highest-scoring task. You see it only after the fact:

    # dmesg / journalctl on the node
    Out of memory: Killed process 4127 (python) total-vm:812044kB,
      anon-rss:524288kB ... oom_score_adj:0

    # kubectl describe pod api-7f9c
    Last State:  Terminated
      Reason:    OOMKilled
      Exit Code: 137   # 128 + 9 (SIGKILL)
    State:       Waiting
      Reason:    CrashLoopBackOff

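When scanning node logs for these events, the kernel line can be picked apart with a regular expression. This is a rough sketch that assumes the line looks like the sample above; other kernel versions may format it differently, and `parse_oom_line` is a name invented for this example.

```python
import re

# Pattern for the "Killed process" line as shown above. The exact format is an
# assumption and may differ between kernel versions.
OOM_LINE = re.compile(
    r"Killed process (?P<pid>\d+) \((?P<comm>[^)]+)\)"
    r".*?anon-rss:(?P<anon_rss_kb>\d+)kB"
    r".*?oom_score_adj:(?P<adj>-?\d+)"
)


def parse_oom_line(line: str):
    """Return the victim's details as a dict, or None if the line does not match."""
    m = OOM_LINE.search(line)
    if m is None:
        return None
    return {
        "pid": int(m["pid"]),
        "command": m["comm"],
        "anon_rss_mib": int(m["anon_rss_kb"]) / 1024,
        "oom_score_adj": int(m["adj"]),
    }


line = ("Out of memory: Killed process 4127 (python) total-vm:812044kB, "
        "anon-rss:524288kB ... oom_score_adj:0")
print(parse_oom_line(line))
```

For the sample line, the anonymous RSS of 524288 kB works out to 512 MiB, which matches a 512 MB limit.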
| Remedy | Cost | Time to fix | Durability |
|---|---|---|---|
| Raise the limit | More RAM per pod, fewer pods per node | Minutes | None for a true leak — only delays the crash |
| Fix the leak | Engineering time, profiling | Hours to days | Permanent — RSS stops growing |
| Add backpressure | Bounded buffers, slower under load | Hours | Strong — bounds memory regardless of input size |
| Set requests < limits | Risk of eviction under node pressure | Minutes | Schedules safely, but bursting above request can still OOM |
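The backpressure row in the table can be illustrated with a bounded queue that refuses new work once it is full, so buffered memory cannot grow with input volume. A minimal sketch, assuming a single-process service; `MAX_PENDING` and `submit` are illustrative names, not part of any framework.

```python
import queue

MAX_PENDING = 4  # assumed bound on buffered work items

pending = queue.Queue(maxsize=MAX_PENDING)


def submit(item) -> bool:
    """Try to enqueue work; refuse instead of buffering without limit."""
    try:
        pending.put_nowait(item)
        return True
    except queue.Full:
        # The caller should retry later or shed load (for example, reply "busy").
        return False


# No consumer is running here, so only the first MAX_PENDING items are accepted.
accepted = [submit(i) for i in range(10)]
print(accepted.count(True), accepted.count(False))  # 4 6
```

The trade-off matches the table: the service gets slower or rejects requests under load, but its buffered memory stays bounded.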
137 is silent: SIGKILL can't be caught, so the process gets no stack trace, no shutdown hook, and no chance to log anything. The only evidence is in dmesg and the pod's last state.
The victim is chosen by oom_score, so a well-behaved large process can be the victim of a small leaker next to it.
JVM memory is more than -Xmx. Direct buffers, threads, and JNI live outside the heap, so you can OOM with plenty of heap headroom.
CrashLoopBackOff masks the root cause: it looks like a crashy app, but the real signal is Reason: OOMKilled in the previous termination, not the restart loop itself.
A service is limited to 512 MB. A request batch loads 700 MB into memory at once. RSS climbs steadily and crosses 512 MB at roughly t = 8s. The kernel sends SIGKILL; the process exits 137 with Reason: OOMKilled.
The pod restarts, takes the same batch, and climbs into the same wall about every 30s — that repeating death is the CrashLoopBackOff you see in kubectl get pods. Three fixes, in order of durability: stream the batch (process it in chunks so peak RSS stays low), add a bounded buffer so input size can't dictate memory, or — as a stopgap — raise the limit above the 700 MB working set. Streaming is the real fix; raising the limit just moves the line.
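The streaming fix can be sketched as follows: read records lazily and process them in fixed-size chunks, so peak memory tracks the chunk size rather than the batch size. `read_records` is a hypothetical stand-in for the real data source, and `CHUNK_SIZE` is an assumed value.

```python
from typing import Iterable, Iterator, List, Tuple

CHUNK_SIZE = 1000  # assumed number of records held in memory at once


def read_records(n: int) -> Iterator[int]:
    # Hypothetical source: yields records one at a time instead of building a list.
    for i in range(n):
        yield i


def chunks(records: Iterable[int], size: int) -> Iterator[List[int]]:
    """Group a stream into lists of at most `size` items."""
    chunk: List[int] = []
    for record in records:
        chunk.append(record)
        if len(chunk) == size:
            yield chunk
            chunk = []
    if chunk:
        yield chunk  # final partial chunk


def summarize_stream(records: Iterable[int], size: int = CHUNK_SIZE) -> Tuple[int, int]:
    """Fold each chunk into running totals; earlier chunks can then be freed."""
    total = 0
    count = 0
    for chunk in chunks(records, size):
        total += sum(chunk)
        count += len(chunk)
    return total, count


print(summarize_stream(read_records(10_500)))  # (55119750, 10500)
```

Only one chunk is alive at a time, so a larger batch takes longer to process but does not raise the memory peak.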
A pod exits with code 137 and its previous state shows Reason: OOMKilled. What does that tell you?
You raise the pod's limit from 512 MB to 1 GB. It runs longer, then OOM-kills again. What's the likely cause?