Question
A self-managed search node on Kubernetes (memory limit 16Gi, swap enabled on the node) starts seeing query p99 climb from 60ms to 8s during the daily index-merge window, with the pod hovering just under its 16Gi limit the whole time and never OOMKilled. Node `vmstat` shows si/so spiking and high major-page-fault rate; CPU is mostly iowait. The engine memory-maps its index segments and relies on the OS page cache. A recent change raised the JVM/heap allocation inside the container, leaving much less room under the cgroup limit for the mmap'd page cache. Triage and mitigate.
Stop the bleeding first (mitigate), then form hypotheses from real signals. Separate root cause from symptom, communicate status as you go, and close with what prevents a repeat.
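
The memory squeeze described in the question can be made concrete with a little arithmetic. The sketch below is illustrative only: the heap sizes (8 GiB before the change, 13 GiB after), the 1 GiB of other anonymous memory, and the sample `memory.stat` values are all hypothetical, since the question gives none of them. It assumes the heap is fully committed, and that the pod runs under cgroup v2, where `memory.stat` (typically readable at `/sys/fs/cgroup/memory.stat` inside the container) reports `anon`, `file`, and `pgmajfault`.

```python
from typing import Dict

GIB = 1024 ** 3


def page_cache_headroom(limit_bytes: int, heap_bytes: int, other_anon_bytes: int) -> int:
    """Bytes left under the cgroup limit for file-backed (mmap'd) pages."""
    return max(0, limit_bytes - heap_bytes - other_anon_bytes)


def parse_memory_stat(text: str) -> Dict[str, int]:
    """Parse the 'key value' lines of a cgroup v2 memory.stat file."""
    stats: Dict[str, int] = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 2 and parts[1].isdigit():
            stats[parts[0]] = int(parts[1])
    return stats


LIMIT = 16 * GIB        # from the question
OTHER_ANON = 1 * GIB    # hypothetical: thread stacks, metaspace, direct buffers

# Hypothetical heap sizes before and after the recent change.
before = page_cache_headroom(LIMIT, 8 * GIB, OTHER_ANON)
after = page_cache_headroom(LIMIT, 13 * GIB, OTHER_ANON)
print(f"headroom before: {before / GIB:.1f} GiB, after: {after / GIB:.1f} GiB")

# Hypothetical memory.stat excerpt; on a live pod, read the real file instead.
SAMPLE = """anon 14495514624
file 2147483648
pgmajfault 981234
"""
stats = parse_memory_stat(SAMPLE)
file_share = stats["file"] / (stats["anon"] + stats["file"])
print(f"file-backed share of anon+file: {file_share:.1%}")
```

If the index working set is larger than the remaining headroom, a merge that touches many segments would be expected to evict hot pages and fault them back in from disk, which is consistent with the major-fault and iowait signals in the question. Sampling `pgmajfault` twice and taking the difference gives a per-pod fault rate to set beside the node-level `vmstat` numbers.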