When setting resources on Kubernetes pods, I’m finding it very difficult to achieve good memory efficiency.
I’m using the “no CPU limits, set memory requests = limits” philosophy that I see heavily recommended on the internet.
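For reference, by that I mean a resources block roughly like this on every container (the numbers are placeholders, not my real values):

```yaml
resources:
  requests:
    cpu: "500m"       # CPU request for scheduling, but no CPU limit (no throttling)
    memory: "1Gi"     # memory request...
  limits:
    memory: "1Gi"     # ...set equal to the memory limit
```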
The problem is that many pods have random memory spikes. In order for them not to be OOM Killed, you have to set the memory requests for them above their highest spike, which means most of the time they’re only using like 25% or so of their memory allocation.
I’ve been trying to optimize this one cluster, and on average only about 33% of the total memory requested by all the pods in the cluster is actually being used. Whenever I try decreasing some pod’s memory requests, I eventually get OOMs. I was hoping I could get closer to 50%, considering that this particular cluster has a stable workload.
I’m sure that I could optimize it a bit better, but not by much.
Is this a shared experience in Kubernetes, that you ultimately have to sacrifice a lot of memory efficiency?
Others are correct that the problem is the software. You are right to use memory requests and limits: the limit is the maximum a pod will be allowed to use, and hopefully not all pods will be using their full limits at once.
The scheduler makes sure that all of the pods’ memory requests on a given node sum to less than 100% of the node’s available memory. You can of course set your pod’s request to the highest amount of RAM it will ever need, but that memory is then reserved for that pod and won’t be used anywhere else, even during downtime.
K8s will allow overprovisioning of RAM for the limits, though, because it assumes pods will not always need their full limit, as you are seeing.
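For example (numbers made up), a spec like this only reserves the request for scheduling but lets the pod burst up to the limit, so the sum of limits on a node can exceed the node’s memory even though the sum of requests cannot:

```yaml
# Hypothetical values: the request is what the scheduler reserves,
# the limit is what the kernel/OOM killer enforces.
resources:
  requests:
    memory: "512Mi"   # scheduler reserves this; sum of requests <= node allocatable
  limits:
    memory: "2Gi"     # pod may burst up to here; sum of limits can be overcommitted
```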
What you can do is set a priority class on the pod, so that when it spikes and the node doesn’t have enough RAM, some other pod gets killed instead of yours, but that of course makes the other pods more volatile.
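Roughly like this, with made-up names and values:

```yaml
# Hypothetical PriorityClass; the name and value are arbitrary.
apiVersion: scheduling.k8s.io/v1
kind: PriorityClass
metadata:
  name: spiky-but-important
value: 1000000               # higher value = preempted/evicted after lower-priority pods
globalDefault: false
description: "For pods with memory spikes that should outlive lower-priority neighbours."
---
# Hypothetical pod that uses it.
apiVersion: v1
kind: Pod
metadata:
  name: spiky-app
spec:
  priorityClassName: spiky-but-important
  containers:
    - name: app
      image: example/app:latest   # placeholder image
      resources:
        requests:
          memory: "1Gi"
        limits:
          memory: "2Gi"
```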
There are many options at your disposal; you’ll have to decide what works best for your use case.
That seems like a problem with the application, no? If the workloads have memory leaks or are too eager to grab memory for themselves, then no cluster will be able to make them perform better.
https://home.robusta.dev/blog/stop-using-cpu-limits
Okay, it’s actually more complex than that. On self-managed nodes, Kubernetes is not the only thing running, so it can make sense to set limits because of the other non-Kubernetes workloads hosted on those nodes. And memory is a bit different from CPU. You will have to do some testing and YMMV, but keep the difference between requests and limits in mind.
But my suggestion would be to try to see if you can get away with only setting requests, or with setting very high limits. See: https://kubernetes.io/docs/tasks/configure-pod-container/assign-memory-resource/#if-you-do-not-specify-a-memory-limit
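In other words, something along these lines (values are just placeholders), where the request is sized for steady-state usage and spikes simply use whatever free memory the node has:

```yaml
# Sketch: memory request only, no memory limit.
resources:
  requests:
    cpu: "250m"
    memory: "768Mi"   # sized for typical usage, not the worst-case spike
  # No memory limit: the container can use free memory on the node during spikes.
  # Under node memory pressure, pods using more than their request are among the
  # first candidates for eviction.
```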
In order for them not to be OOM Killed, you have to set the memory requests for them above their highest spike, which means most of the time they’re only using like 25% or so of their memory allocation.
Are you sure? Only limits should cap the total memory usage of a pod; requests should happily let pods use more memory than the requested amount.
One thing I am curious about is whether your pods actually need that much memory. I have heard (horror) stories where people had an application in Kubernetes with a memory leak, and instead of fixing the memory leak, they just regularly killed pods and restarted new ones that weren’t leaking yet. :/
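As far as I understood it, the hack looked something like this; the names and schedule are made up, and the ServiceAccount/RBAC that would let kubectl do the restart is omitted:

```yaml
# Sketch of that workaround (not an endorsement): a CronJob that just
# restarts the leaky Deployment every night instead of fixing the leak.
apiVersion: batch/v1
kind: CronJob
metadata:
  name: restart-leaky-app
spec:
  schedule: "0 3 * * *"                    # every night at 03:00
  jobTemplate:
    spec:
      template:
        spec:
          serviceAccountName: restart-bot  # assumed to have permission to restart the Deployment
          restartPolicy: Never
          containers:
            - name: kubectl
              image: bitnami/kubectl:latest
              command: ["kubectl", "rollout", "restart", "deployment/leaky-app"]
```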
To answer your actual question about memory optimization: no, you can’t really avoid it. Even Google still “wastes” memory by having requests and limits higher than what pods usually use. It is very difficult to prune everything and be ultra-efficient. If an outage due to OOM costs more than paying for extra resources would, then people just resort to the latter.



