Introduction:
Containerisation has revolutionised the way we develop and deploy software applications, enabling us to run them consistently across different environments and infrastructures. However, containers also require careful management of their resources, including CPU and memory, to avoid performance issues, instability, and cost overruns. In this article, we’ll discuss the reasons why setting CPU limits for containers is crucial for their reliability, scalability, and efficiency.
![](https://www.developerscoffee.com/wp-content/uploads/2023/03/image-1-1024x632.png)
In Kubernetes, you have two ways to specify how much CPU a pod can use:
- “Requests” specify the amount of CPU a container is expected to use on average; the scheduler reserves this amount for it.
- “Limits” set the maximum amount of CPU a container is allowed to use.
The Kubernetes scheduler uses requests to determine where the pod should be allocated in the cluster. Since the scheduler doesn’t know the consumption (the pod hasn’t started yet), it needs a hint.
CPU requests are also used to apportion CPU time among your containers. For example:
- A node has a single CPU.
- Container A has requests = 0.1 vCPU.
- Container B has requests = 0.2 vCPU.
Since CPU requests don’t cap consumption, if both containers compete to use 100% of the available CPU, the time is split in proportion to their requests: container A gets one third of the CPU (about 0.33 vCPU) and container B two thirds (about 0.66 vCPU), twice as much.
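As a sketch, the two containers above could be declared like this (pod and container names, images, and the busy-loop commands are illustrative):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: cpu-shares-demo      # illustrative name
spec:
  containers:
    - name: container-a
      image: busybox         # placeholder image
      command: ["sh", "-c", "while true; do :; done"]  # busy loop, to contend for CPU
      resources:
        requests:
          cpu: 100m          # 0.1 vCPU
    - name: container-b
      image: busybox
      command: ["sh", "-c", "while true; do :; done"]
      resources:
        requests:
          cpu: 200m          # 0.2 vCPU
```

Under contention, the kernel's CPU weights (derived from these requests) split the available CPU time 1:2 between the two containers.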
Requests are suitable for:
- Setting a baseline (give me at least X amount of CPU).
- Setting relationships between pods (this pod A uses twice as much CPU as the other).
But they do not enforce hard caps. For that, you need CPU limits.
When you set a CPU limit, you define a period and quota. For example:
- period: 100000 microseconds (0.1s).
- quota: 10000 microseconds (0.01s).
The container can only use the CPU for 0.01 seconds every 0.1 seconds, i.e. 10% of a core, which is abbreviated as “100m”. If a container with a hard limit wants more CPU, it has to wait for the next period (the process is throttled).
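The period/quota example above corresponds to a limit of 100m. A minimal sketch (pod name and image are illustrative):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: cpu-limit-demo       # illustrative name
spec:
  containers:
    - name: app
      image: nginx           # placeholder image
      resources:
        requests:
          cpu: 100m
        limits:
          cpu: 100m          # CFS: quota of 10000us per 100000us period (10% of one core)
```

The kubelet translates the `cpu` limit into the cgroup CFS bandwidth settings: quota = limit × period, so 100m with the default 100000µs period yields a 10000µs quota.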
So, what should you use as CPU requests and limits in your Pods?
You can monitor the app and derive its average CPU utilization. You can do this with your existing monitoring infrastructure, or use the Vertical Pod Autoscaler in recommendation mode to observe usage and suggest a request value.
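One way to gather that baseline, assuming the VPA controller is installed in your cluster, is a recommendation-only VPA (the object and deployment names are illustrative):

```yaml
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: app-vpa              # illustrative name
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: app                # hypothetical deployment to observe
  updatePolicy:
    updateMode: "Off"        # only report recommendations; don't evict or resize pods
```

With `updateMode: "Off"`, the VPA publishes recommended requests in its status without touching the running pods.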
How should I set the limits?
- Your app might already have “hard” limits of its own (Node.js’s event loop is single-threaded, so it uses at most 1 core even if you assign 2).
- A common rule of thumb: limit = 99th-percentile usage + 30–50% headroom.
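Putting the two rules of thumb together, for a hypothetical app that averages 200m with a 99th percentile around 350m, the resources stanza might look like:

```yaml
resources:
  requests:
    cpu: 200m    # observed average utilization
  limits:
    cpu: 500m    # ~99th percentile (350m) + ~40% headroom
```

The numbers here are assumptions for illustration; derive your own from monitoring data.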
Should you always set the CPU request?
Always! This is a standard good practice in Kubernetes and helps the scheduler allocate pods more efficiently.
Should you always set the CPU limit?
There are several good reasons why you should:
Reason #1: CPU Limits Prevent Resource Starvation
One of the primary reasons to set CPU limits for containers is to prevent resource starvation, which occurs when a container consumes more than its share of CPU, reducing the CPU available to other containers on the same node or cluster. Resource starvation can lead to degraded performance, increased latency, and heavy throttling once the kernel enforces the limits (unlike memory overuse, CPU overuse does not get a pod evicted; the pod is throttled instead). By setting CPU limits, you can ensure that each container has a fair share of CPU resources and that the overall workload runs smoothly.
Reason #2: CPU Limits Help with Predictability and Stability
Another reason to set CPU limits for containers is to improve their predictability and stability, which are critical for production-grade applications. Containers that consume more CPU than their requests can become unpredictable and unstable, as they may cause kernel-level contention, CPU spikes, or jittery response times. In contrast, containers that adhere to their CPU limits can provide consistent and reliable performance, making them easier to monitor, troubleshoot, and scale.
![](https://www.developerscoffee.com/wp-content/uploads/2023/03/image-1024x590.png)
Reason #3: CPU and Memory Usage Are Correlated
CPU and memory usage are closely related in computing, as each affects the other’s performance and capacity. Containers that consume more CPU often require more memory to support their workloads, as they may generate more threads, data, or caching. Therefore, it’s essential to maintain a balanced ratio of CPU to memory in containers, such as 8 vCPU × 16 GB or 16 vCPU × 64 GB, to avoid overprovisioning or underprovisioning either resource. Setting CPU limits without considering memory usage can lead to unbalanced and inefficient resource utilization.
Reason #4: Pod Autoscalers Can Do a Better Job
Instead of relying on bursting CPU utilization to get more work done, it’s better to use pod autoscalers that extend pod capacity in a predictable and balanced manner. Horizontal autoscalers, such as the Horizontal Pod Autoscaler (HPA) or Knative’s Pod Autoscaler (KPA), increase the number of pod replicas, adding aggregate CPU and memory capacity to the workload without changing the CPU and memory usage of individual containers. The Vertical Pod Autoscaler (VPA) can instead raise the overall CPU and memory limits, ensuring the pod experiences a balanced adjustment of its resource requests and limits. By using pod autoscalers, you avoid the side effects of runaway threads or of squeezing container probes out of their operational range.
Reason #5: Hyper-scalers Require Container Limits
Finally, many hyper-scalers and managed container services, such as IBM Cloud Code Engine or AWS ECS, require that you set CPU and memory limits for your containers. This requirement helps the providers with resource allocation and works in your favor too: you pay by resource allocation, and limitless usage means limitless cost. Therefore, it’s better to design your containers with CPU limits from the beginning rather than retrofitting them later.
Conclusion:
In conclusion, setting CPU limits for containers is crucial for their reliability, scalability, and efficiency. By preventing resource starvation, improving predictability and stability, maintaining a balanced ratio of CPU to memory, using pod autoscalers, and complying with hyper-scaler requirements, you can ensure that your containers perform optimally and cost-effectively. It is therefore sound practice to design containers for balanced resource utilization, with carefully set resource requests and limits.
References
- Kubernetes CPU Management policies
- Layer-by-Layer Cgroup in Kubernetes, by Stefanie Lai.
- Kubernetes resources under the hood, by Shon Lev-Ran and Shir Monether.
- CFS Scheduler design
- Kill the Annoying CPU Throttling and Make Containers Run Faster