In Kubernetes, what should I use as memory requests and limits?
And what happens when you don't set them?
Let's dive into it.
In Kubernetes, you have two ways to specify how much memory a pod can use:
- "Requests" are the amount of memory you expect the container to need; they are usually set close to the average consumption.
- "Limits" set the maximum amount of memory the container is allowed to use.
The Kubernetes scheduler uses requests to determine where the pod should be allocated in the cluster.
Since the scheduler doesn't know the consumption (the pod hasn't started yet), it needs a hint.
Limits are enforced on the node: the kubelet hands them to the container runtime, and the kernel stops (OOM-kills) the process when it uses more memory than is allowed.
It's worth noting that the process could spike in memory usage before it's terminated.
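As a minimal sketch of where these two values live, here is a Pod manifest built as a Python dict and printed as JSON (kubectl accepts JSON manifests as well as YAML). The Pod name, the image, and the 256Mi/512Mi values are placeholders picked for illustration, not recommendations.

```python
import json

# Hypothetical Pod: the name, image, and memory values are placeholders.
pod = {
    "apiVersion": "v1",
    "kind": "Pod",
    "metadata": {"name": "memory-demo"},
    "spec": {
        "containers": [
            {
                "name": "app",
                "image": "registry.example.com/app:1.0",
                "resources": {
                    # The scheduler uses this value to pick a node.
                    "requests": {"memory": "256Mi"},
                    # The container is terminated if it goes above this value.
                    "limits": {"memory": "512Mi"},
                },
            }
        ]
    },
}

manifest = json.dumps(pod, indent=2)
print(manifest)
```

Saving the output to a file and passing it to kubectl apply -f should create the Pod, assuming the image exists in your registry.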
The kubelet is also in charge of monitoring the total memory utilization of the node.
If memory is running low, the kubelet evicts low-priority pods.
But how does it decide what's low priority?
When Kubernetes creates a Pod, it assigns one of three QoS classes to the Pod: Guaranteed, Burstable, or BestEffort.
Pods that are "Guaranteed" have CPU and memory requests and limits on every container and are least likely to face eviction.
Also, for each container, memory request = memory limit AND CPU request = CPU limit.
This class is best suited for stateful applications like databases.
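Pods with a "Burstable" class have at least one memory or CPU request or limit but don't meet the criteria for "Guaranteed" (typically requests without limits, or requests lower than limits).
This allows the Pods to flexibly increase their resources if available (but without a limit, they could also use any amount of resources).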
A Pod is "BestEffort" only if none of its containers has a memory or CPU limit or request.
Those Pods are the first to be evicted in the event of Node resource pressure.
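The three rules above can be sketched as a small function. This is a simplified illustration, not the code Kubernetes runs: the function name is made up, it only looks at the containers you pass in, and it compares quantities as plain strings (so "1" and "1000m" would not count as equal here).

```python
def qos_class(containers):
    """Return the QoS class for a list of container "resources" dicts.

    Simplified sketch: quantities are compared as plain strings.
    """
    any_set = False      # does any container set a CPU/memory request or limit?
    guaranteed = True    # do all containers have request == limit for both?
    for resources in containers:
        requests = resources.get("requests", {})
        limits = resources.get("limits", {})
        for name in ("cpu", "memory"):
            limit = limits.get(name)
            # When only a limit is given, Kubernetes sets the request to match it.
            request = requests.get(name, limit)
            if request is not None:
                any_set = True
            if limit is None or request != limit:
                guaranteed = False
    if not any_set:
        return "BestEffort"
    return "Guaranteed" if guaranteed else "Burstable"


same = {"cpu": "500m", "memory": "256Mi"}
print(qos_class([{"requests": same, "limits": same}]))   # Guaranteed
print(qos_class([{"requests": {"memory": "256Mi"}}]))    # Burstable
print(qos_class([{}]))                                   # BestEffort
```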
Most of your pods are likely to be "Burstable" (i.e. requests lower than limits, or no limits at all), and a select few should be "Guaranteed".
Burstable pods are good because they use resources dynamically and are cheaper.
With Guaranteed pods, you allocate all resources up to the limit upfront, which could result in more expensive (but safer) deployments.
BestEffort pods are generally something you should avoid.
Since they have no requests, the Kubernetes scheduler doesn't know how much memory or CPU the process needs, so it could end up scheduling an impractical number of pods on the existing nodes.
But if you stick only to Burstable pods, how does the kubelet know which pod to evict first?
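Pods can have a PriorityClass that indicates the importance of a Pod relative to other Pods.
The kubelet takes that priority into account when ranking pods for eviction.
The scheduler also leverages the Pod PriorityClass to preempt (evict) lower-priority pods when the cluster is full.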
For example, if you have low-priority batch jobs (e.g. reports), you could assign a low priority, and they will be evicted first.
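A rough sketch of what that could look like, again as Python dicts printed as JSON manifests. The class name, the value of 100, the Pod name, and the image are all hypothetical.

```python
import json

# Hypothetical PriorityClass for batch jobs; the name and value are placeholders.
priority_class = {
    "apiVersion": "scheduling.k8s.io/v1",
    "kind": "PriorityClass",
    "metadata": {"name": "low-priority-batch"},
    "value": 100,  # a lower number means the Pod is less important
    "globalDefault": False,
    "description": "Batch jobs such as reports; fine to evict first.",
}

# A Pod opts in by naming the class in its spec.
report_pod = {
    "apiVersion": "v1",
    "kind": "Pod",
    "metadata": {"name": "nightly-report"},
    "spec": {
        "priorityClassName": priority_class["metadata"]["name"],
        "containers": [
            {"name": "report", "image": "registry.example.com/report:1.0"}
        ],
    },
}

for manifest in (priority_class, report_pod):
    print(json.dumps(manifest, indent=2))
```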
How should you choose the memory request of a pod?
A simple way is to calculate the smallest memory unit as:
REQ = NODE_MEM / MAX_PODS_PER_NODE
For a 4GB node and a limit of 10 Pods, that's a 400MB request.
Assign the smallest unit or a multiplier to your containers.
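The arithmetic, as a quick sketch. The function name is made up, and it assumes all of the node's memory is available to Pods, which is a simplification.

```python
def smallest_memory_unit_mb(node_memory_mb, max_pods_per_node):
    """REQ = NODE_MEM / MAX_PODS_PER_NODE, rounded down to whole megabytes."""
    if max_pods_per_node <= 0:
        raise ValueError("max_pods_per_node must be positive")
    return node_memory_mb // max_pods_per_node


unit = smallest_memory_unit_mb(4000, 10)  # 4GB node, 10 Pods
print(unit)      # 400
print(unit * 2)  # 800, a multiplier for a hungrier container
```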
A better approach is to monitor the app and derive the memory utilization.
You can do this with your existing monitoring infrastructure or use the Vertical Pod Autoscaler to monitor and report the average request value.
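If you already export memory usage somewhere, deriving a starting value can be as simple as averaging the samples. A minimal sketch, assuming the samples are in megabytes; the function name and the numbers are invented for illustration, and this is not how the Vertical Pod Autoscaler computes its recommendation.

```python
from statistics import mean


def suggest_memory_request_mb(samples_mb):
    """Suggest a request as the average of observed usage, in whole megabytes."""
    if not samples_mb:
        raise ValueError("need at least one sample")
    return round(mean(samples_mb))


# Made-up usage samples (MB) collected from a monitoring system.
samples = [310, 290, 330, 305, 315]
print(suggest_memory_request_mb(samples))  # 310
print(max(samples))                        # 330, the observed peak
```

The peak is printed next to the average because it is a useful number to keep in mind when picking a limit.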
How should I set the limits?
Going over the limit gets the container terminated, so you should definitely set a value lower than the memory available on the node.
Also, if you want to dig in more, here are a few relevant links:
And finally, if you've enjoyed this thread, you might also like:
- The Kubernetes workshops that we run at Learnk8s https://learnk8s.io/training
- This collection of past threads https://twitter.com/danielepolencic/status/1298543151901155330
- The Kubernetes newsletter I publish every week https://learnk8s.io/learn-kubernetes-weekly