When setting resources on Kubernetes pods, I’m finding it very difficult to achieve good memory efficiency.

I’m using the “no CPU limits, set memory requests equal to limits” philosophy that I see heavily recommended on the internet.
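For reference, here’s a rough sketch of what that looks like on one of my pods (the name, image, and numbers are just illustrative):

```yaml
# Illustrative only: memory request == limit, CPU request but no CPU limit
apiVersion: v1
kind: Pod
metadata:
  name: example-app              # hypothetical name
spec:
  containers:
    - name: app
      image: example/app:latest  # hypothetical image
      resources:
        requests:
          cpu: "500m"            # CPU request for scheduling, no CPU limit
          memory: "1Gi"          # sized above the worst spike I’ve seen
        limits:
          memory: "1Gi"          # equal to the request; anything past this gets OOM killed
```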

The problem is that many pods will have random memory spikes. In order for them not to be OOM killed, you have to set their memory requests above their highest spike, which means most of the time they’re only using around 25% of their memory allocation.

I’ve been trying to optimize this one cluster, and on average only about 33% of the total memory requested by all the pods in the cluster is actually being used. Whenever I try decreasing a pod’s memory requests, I eventually get OOM kills. I was hoping I could reach closer to 50%, considering that this particular cluster has a stable workload.

I’m sure that I could optimize it a bit better, but not by much.

Is this a shared experience with Kubernetes, that you ultimately have to sacrifice a lot of memory efficiency?

  • moonpiedumplings@programming.dev

    https://home.robusta.dev/blog/stop-using-cpu-limits

    Okay, it’s actually more complex than that, because on self-managed nodes Kubernetes is not the only thing running, so it can make sense to set limits to protect other non-Kubernetes workloads hosted on those nodes. And memory is a bit different from CPU. You will have to do some testing and YMMV, but keep the difference between requests and limits in mind.

    But my suggestion would be to see if you can get away with only setting requests, or with setting very high limits. See: https://kubernetes.io/docs/tasks/configure-pod-container/assign-memory-resource/#if-you-do-not-specify-a-memory-limit
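    As a rough sketch of what I mean (the values here are made up), the resources block would request memory for scheduling but set no memory limit, so per the linked docs the container can use free memory on the node beyond its request instead of being killed at a fixed ceiling:

    ```yaml
    # Hypothetical values: memory request for scheduling, no memory limit
    resources:
      requests:
        cpu: "250m"
        memory: "512Mi"   # what the scheduler reserves on the node
      # no memory limit here: the container may use available node memory
      # beyond 512Mi (unless a namespace LimitRange injects a default limit)
    ```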

    > In order for them not to be OOM killed, you have to set their memory requests above their highest spike, which means most of the time they’re only using around 25% of their memory allocation.

    Are you sure? Only limits should cap the total memory usage of a pod; requests should happily let pods use more memory than the requested amount.
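    To make that concrete (made-up numbers): with something like the snippet below, the scheduler only reserves 256Mi, but the container can burst up to 1Gi before it gets OOM killed; using more than its request mainly makes it an early eviction candidate if the node itself runs low on memory.

    ```yaml
    # Hypothetical example where the request and limit differ
    resources:
      requests:
        memory: "256Mi"   # used for scheduling/bin-packing, not enforcement
      limits:
        memory: "1Gi"     # the only hard cap; exceeding it triggers an OOM kill
    ```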

    One thing I am curious about is whether your pods actually need that much memory. I have heard (horror) stories where people had an application in Kubernetes with a memory leak, and instead of fixing the leak they just regularly killed pods and started fresh ones that weren’t leaking yet. :/

    To answer your actual question about memory optimization, no. Even Google still “wastes” memory by having requests and limits higher than what pods usually use. It is very difficult to prune and be ultra efficient. If an outage due to OOM kills costs more than paying for extra resources, people just pay for the extra resources.