Hey all,
I’m running a Kubernetes cluster on DigitalOcean and looking for tips on optimizing node scaling to balance cost and performance. How do you manage scaling down during off-peak hours without affecting critical services? Any tools or strategies you’d recommend?
Hey there!
The DigitalOcean Managed Kubernetes service comes with a built-in autoscaler that can automatically adjust the number of nodes in your cluster based on the workload. During off-peak hours, the autoscaler can scale down your Kubernetes nodes, reducing your costs.
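If you manage your node pools with `doctl`, enabling autoscaling with a floor and a ceiling looks roughly like this (the cluster and pool names below are placeholders, and the min/max values are just examples to adjust for your workload):

```shell
# Enable the autoscaler on an existing node pool.
# "my-cluster" and "my-pool" are example names -- substitute your own.
doctl kubernetes cluster node-pool update my-cluster my-pool \
  --auto-scale \
  --min-nodes 1 \
  --max-nodes 5
```

With a `--min-nodes` of 1, the pool can shrink to a single node during quiet periods while still keeping the cluster available.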
In addition to that, you can use a `PodDisruptionBudget` (PDB), which specifies how many replicas an application can tolerate losing during a voluntary disruption, relative to how many it is intended to have. For example, if you set the `replicas` value for a Deployment to `5` and set the PDB's `maxUnavailable` to `1`, potentially disruptive actions like cluster upgrades and resizes occur with no fewer than four pods running. You can check out the official documentation on how to do that here:
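As a minimal sketch (the names and labels are placeholders), a PDB that allows at most one pod of an application to be down during voluntary disruptions could look like:

```yaml
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: my-app-pdb          # example name
spec:
  maxUnavailable: 1         # at most one pod may be disrupted at a time
  selector:
    matchLabels:
      app: my-app           # must match your Deployment's pod labels
```

The autoscaler and cluster upgrades will then drain nodes only as fast as this budget allows.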
In addition to the node autoscaling, you should also set up a Horizontal Pod Autoscaler (HPA) which scales the number of pods in your deployment based on observed CPU utilization (or other select metrics). This allows your services to handle more traffic when needed and scale down during quieter times.
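As a sketch (the Deployment name and the replica/utilization numbers are illustrative), an HPA targeting average CPU utilization could look like:

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: my-app-hpa          # example name
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: my-app            # the Deployment to scale
  minReplicas: 2            # floor during quiet periods
  maxReplicas: 10           # ceiling during traffic spikes
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70   # scale out above ~70% average CPU
```

Note that the HPA needs the metrics-server add-on (or another metrics source) to read CPU utilization.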
You can follow the steps on how to do that here:
Basically, combining HPA with node autoscaling gives you an effective approach to resource allocation: you don't over-provision during low-demand periods, and you scale up accordingly whenever your load goes up.
Additionally, you should also look into setting resource requests and limits:

- `requests` - Specifies how much of a resource (such as CPU and memory) a pod needs before it can be scheduled onto a node. If the node doesn't have the available resources, the pod will not be scheduled there. This prevents pods from being scheduled on nodes that are already under heavy load.
- `limits` - Specifies the maximum amount of a resource (such as CPU and memory) a pod is allowed to consume on a node. This prevents a pod from slowing down the work of other pods.

You can follow the steps on how to set that up here:
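As a fragment of a container spec (the name, image, and values are all illustrative), requests and limits are set like this:

```yaml
containers:
  - name: my-app
    image: my-app:latest      # example image
    resources:
      requests:
        cpu: "250m"           # scheduler reserves a quarter of a core
        memory: "256Mi"       # and 256 MiB of memory for this pod
      limits:
        cpu: "500m"           # throttled above half a core
        memory: "512Mi"       # OOM-killed if it exceeds 512 MiB
```

Accurate requests also matter for autoscaling: the cluster autoscaler decides whether a new node is needed based on requested (not actual) resources.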
And one more thing: you should also keep an eye on the basic metrics that come out of the box, which include CPU usage, load averages, bandwidth, and disk I/O. And if you want to take this a step further, consider setting up advanced monitoring as well:
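Assuming the metrics-server add-on is running in your cluster, a quick way to spot-check usage from the command line is:

```shell
kubectl top nodes      # per-node CPU and memory usage
kubectl top pods -A    # per-pod usage across all namespaces
```

This is handy for sanity-checking whether your requests and limits actually match what the workloads consume.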
If you have any other questions, feel free to ask!
- Bobby