Do you specify requests & limits for your Kubernetes pods?
Here is why it's very important.
When you deploy a pod, Kubernetes assigns it a QoS class based on its resource requests and limits.
Let's understand Kubernetes Pod Quality of Service (QoS).
Kubernetes schedules a pod based on its request values, which ensures the node has enough unreserved resources to run the pod.
However, a node can still be overcommitted: limits can be higher than requests, so if the pods try to use resources up to their limits, their combined usage can exceed the node's capacity.
Overcommitment = sum of resource limits > node capacity
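To make the formula concrete, here is a minimal Python sketch for a single resource (memory, in MiB) on a single node. The function name, the pod dictionaries, and all numbers are hypothetical and chosen only for illustration. It assumes scheduling is checked against the sum of requests and overcommitment against the sum of limits.

```python
def check_node(allocatable_mib, pods):
    """Return (requests_fit, overcommitted) for one resource on one node.

    allocatable_mib and the pod values are illustrative numbers, not
    values read from a real cluster.
    """
    total_requests = sum(p["request"] for p in pods)
    total_limits = sum(p["limit"] for p in pods)
    # Scheduling looks at requests: they must fit on the node.
    requests_fit = total_requests <= allocatable_mib
    # Overcommitment looks at limits: together they may exceed the node.
    overcommitted = total_limits > allocatable_mib
    return requests_fit, overcommitted


# Hypothetical node with 4096 MiB of allocatable memory and three pods.
pods = [
    {"name": "a", "request": 1024, "limit": 2048},
    {"name": "b", "request": 1024, "limit": 2048},
    {"name": "c", "request": 512, "limit": 1024},
]

print(check_node(4096, pods))  # (True, True)
```

Here the requests add up to 2560 MiB, so all three pods can be scheduled, but the limits add up to 5120 MiB, so the node is overcommitted.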
When pods try to use more resources than the node has available, Kubernetes uses the QoS class to determine which pod to kill first.
Types of Pod QoS Class
The following are the three Pod QoS classes:
👉 Best effort
Your pod gets the best-effort class if none of its containers specify any CPU or memory requests or limits. Best-effort pods are the lowest-priority pods. They get killed first if the node runs out of resources.
👉 Burstable
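A pod gets the burstable class if at least one of its containers sets a CPU or memory request or limit, but the pod does not meet the criteria for the guaranteed class. The typical case is a request set lower than the limit. If the node runs out of resources, burstable pods get killed once no best-effort pods are left.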
👉 Guaranteed
A pod gets the guaranteed class if every container in it sets both CPU and memory requests and limits, and each request equals its limit. It is considered the highest-priority class and gets killed only if there are no best-effort or burstable pods left.
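The three rules above can be sketched as a small Python function. This is an illustration, not the Kubernetes implementation: the function name and the input shape (a list of container dictionaries with optional "requests" and "limits") are assumptions. It also compares quantities as literal strings, so "1" and "1000m" would count as different here even though Kubernetes treats them as equal.

```python
RESOURCES = ("cpu", "memory")


def qos_class(containers):
    """Classify a pod from its containers' CPU/memory requests and limits.

    containers: list of dicts with optional "requests" and "limits" dicts,
    e.g. {"requests": {"cpu": "250m"}, "limits": {"cpu": "500m"}}.
    """
    any_set = False
    guaranteed = True
    for container in containers:
        requests = container.get("requests", {})
        limits = container.get("limits", {})
        for resource in RESOURCES:
            if resource in requests or resource in limits:
                any_set = True
            limit = limits.get(resource)
            # A missing request defaults to the limit when a limit is set.
            request = requests.get(resource, limit)
            # Guaranteed needs a limit on every resource, equal to the request.
            if limit is None or request != limit:
                guaranteed = False
    if not any_set:
        return "BestEffort"
    return "Guaranteed" if guaranteed else "Burstable"


same = {"cpu": "500m", "memory": "256Mi"}
print(qos_class([{}]))                                    # BestEffort
print(qos_class([{"requests": {"cpu": "250m"},
                  "limits": {"cpu": "500m"}}]))           # Burstable
print(qos_class([{"requests": same, "limits": same}]))    # Guaranteed
```

On a real cluster, the assigned class can be read from the pod's status.qosClass field.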
🧵 Pod QoS FAQs
The following questions came up in the LinkedIn discussion on this topic.
Question 01: While the Guaranteed Pod QoS offers the highest reliability, would the design not lose out on Kubernetes' scalability benefits? To protect against pod eviction & failures, there would be a need to keep the resource request values high which means Kubernetes reserves more than needed during non-peak times.
Answer: Well, best effort is definitely out of the picture in production. But you can consider the other two classes as part of your capacity planning, based on your workload types. You need to consider the pod priority class as well. Here is a blog that might help answer your queries. https://sysdig.com/blog/kubernetes-resource-limits/
Question 02: What happens when there are two pods with guaranteed QoS, there are no best-effort or burstable pods, and the cluster needs to reclaim resources? Which of the two will be killed? I am just curious to know the algorithm used in this case.
Answer: Pods are ranked by priority for termination. If multiple pods have the same priority, the pods whose usage exceeds their requests by the largest amount are terminated first.
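The ranking in this answer can be sketched in Python as a sort. This is a simplified illustration for one resource, not the kubelet's actual code. The pod names, priorities, and numbers are hypothetical. The sketch also assumes, based on how the kubelet's node-pressure eviction order is documented, that pods using more than their request are ranked ahead of pods that stay within it, before priority is compared.

```python
def eviction_order(pods):
    """Return pod names in the order they would be terminated.

    Each pod is a dict with "name", "priority", "request", and "usage"
    (illustrative numbers for a single resource, such as memory in MiB).
    """
    def rank(pod):
        over = pod["usage"] - pod["request"]
        return (
            0 if over > 0 else 1,  # pods over their request go first
            pod["priority"],       # then lower priority goes first
            -over,                 # then the largest excess goes first
        )

    return [pod["name"] for pod in sorted(pods, key=rank)]


# Hypothetical pods on a node under memory pressure.
pods = [
    {"name": "a", "priority": 0, "request": 512, "usage": 400},
    {"name": "b", "priority": 0, "request": 512, "usage": 700},
    {"name": "c", "priority": 0, "request": 512, "usage": 900},
    {"name": "d", "priority": 1000, "request": 256, "usage": 600},
]

print(eviction_order(pods))  # ['c', 'b', 'd', 'a']
```

Pods b and c share the same priority, so c goes first because it is further over its request. Pod d is also over its request but has a higher priority, and pod a stays within its request, so it comes last.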