By ShurikAg
We followed the starter kit to kick start our cluster and I have a couple of questions.
#1 We bootstrapped the cluster following this guide: https://github.com/digitalocean/Kubernetes-Starter-Kit-Developers/tree/main/15-automate-with-terraform-flux#step-2---bootstrapping-doks-and-flux-cd However, we initially created the cluster with only 2 nodes.
What is the correct way to resize the cluster? Do I simply update the main.tf and apply the changes? Will it destroy anything in the cluster?
#2
After automating everything and having everything set up, I took a quick look at the Compute Resource dashboard. Similar to this. And it appears that I am at 147% of CPU Limits Commitment before I even started.
Digging a bit deeper, it looks like flux-system consumes up to 4 CPUs, plus ambassador up to 2. In our case, that is more than half of the cluster (we have a simple app). Of course, we can resize the cluster, but is that expected?
UPDATE
If I try to update main.tf to 3 nodes and run:
terraform plan -out priz_prod_cluster.out
I am getting the following error.
│ Error: Error retrieving Kubernetes cluster: GET https://api.digitalocean.com/v2/kubernetes/clusters/f9883560-f07a-4e54-9520-97f3210cb47b: 401 Unable to authenticate you
│
│ with module.doks_flux_cd.digitalocean_kubernetes_cluster.primary,
│ on .terraform/modules/doks_flux_cd/create-doks-with-terraform-flux/main.tf line 39, in resource "digitalocean_kubernetes_cluster" "primary":
│ 39: resource "digitalocean_kubernetes_cluster" "primary" {
│
╵
All my keys and environment variables are updated and correct.
Hey!
Addressing your questions involves several aspects of managing a DOKS cluster, especially when using Terraform for infrastructure as code and when dealing with resource utilization by system components like Flux CD and Ambassador.
Update main.tf: Yes, the correct way to resize your cluster is by updating the node_count in your main.tf file where you’ve defined your DOKS cluster configuration. You would adjust the count to the desired number of nodes.
Applying Changes Safely: When you apply the changes with terraform apply, Terraform will only perform actions required to reach the desired state. In the case of increasing the node count, it should not destroy any existing resources within your cluster. Terraform is designed to be idempotent, meaning it should safely apply only the necessary changes to achieve the specified configuration.
Before applying, always run terraform plan to review the proposed changes. This command shows what Terraform will do without actually performing the actions, providing an additional layer of safety.
No Destruction: Increasing the node count should not destroy anything in the cluster. Kubernetes and the cloud provider’s controller will handle the addition of new nodes, and workloads will continue running as before. If you’re scaling down, Kubernetes will attempt to reschedule workloads from the terminated nodes to the remaining ones, assuming sufficient resources are available.
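As a rough sketch of the "review the plan first" advice above, the helper below filters a rendered plan for the phrases Terraform uses when it intends to destroy or replace a resource. The function name destructive_changes is hypothetical, and the plan file name in the usage note is simply the one from the question.

```shell
# Print only the lines of a rendered plan that announce a destroy or a
# replacement. Reads plan text on stdin and prints a short notice when
# no such line is present.
destructive_changes() {
  if ! grep -E 'must be replaced|will be destroyed'; then
    echo "no destructive changes found"
  fi
}
```

One possible usage, once a plan has been saved, is terraform show -no-color priz_prod_cluster.out | destructive_changes. A pure node-count increase would be expected to show up as an in-place update, so any line printed here is worth a closer look before running terraform apply.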
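Regarding the high CPU limits commitment from system components:
Resource Utilization: It’s not uncommon for system components, especially those involved in managing the cluster and handling ingress traffic, to account for a significant portion of resources. Keep in mind that CPU Limits Commitment compares the sum of configured CPU limits with the allocatable CPU, so a value above 100% means the limits are overcommitted, not that the CPU is actually in use; the scheduler places pods based on requests, not limits. However, if these components are actually consuming an unexpectedly high amount of resources, it might be worth investigating their configurations.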
Adjusting Resource Requests and Limits: You can adjust the CPU and memory requests and limits for Flux and Ambassador if you find that they’re consuming more resources than necessary. This can be done by modifying their deployment configurations. Be cautious when adjusting these values to ensure that you don’t starve these critical components of needed resources.
Scaling the Cluster: If your applications and system components require more resources than initially anticipated, scaling the cluster by adding more nodes or using nodes with higher resource capacities is a valid approach. This will provide more CPU and memory resources for all workloads, including system components.
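To make the dashboard figure concrete, here is a minimal sketch of how a limits commitment percentage can be worked out by hand. The function name limits_commitment_pct and the numbers in the usage note are hypothetical; the real inputs would come from the limits on your deployments and the allocatable CPU of your nodes.

```shell
# Percentage of allocatable CPU that is promised to containers as limits.
# $1: sum of container CPU limits, in millicores
# $2: allocatable CPU across all nodes, in millicores
# Uses integer arithmetic, so the result is rounded down.
limits_commitment_pct() {
  echo $(( $1 * 100 / $2 ))
}
```

For example, limits_commitment_pct 6000 4000 prints 150: limits totalling 6 CPUs on nodes with 4 allocatable CPUs are 150% committed, even if actual usage is low. Running kubectl describe node on each node lists the per-node requests and limits under "Allocated resources", which is one place to read the real values from.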
The error you’re encountering when running terraform plan suggests an issue with authentication to the DigitalOcean API:
Check Environment Variables: Ensure that your environment variables for DigitalOcean authentication (DIGITALOCEAN_TOKEN or any other relevant variables) are correctly set in the environment from which you’re running Terraform. Use echo $DIGITALOCEAN_TOKEN (on Linux/macOS) or echo %DIGITALOCEAN_TOKEN% (on Windows) to verify that the token is correctly set.
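Refresh Terraform State: Sometimes, Terraform’s state can become out of sync with the actual infrastructure, especially if changes were made outside of Terraform. Running terraform apply -refresh-only (the replacement for the now-deprecated terraform refresh) can help reconcile Terraform’s state with the real world. Note that the refresh itself calls the DigitalOcean API, so it will hit the same 401 until the token problem is resolved.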
Review Terraform Version and Provider: Ensure that you’re using a version of the Terraform DigitalOcean provider that supports the features and resources you’re using. You may need to update the provider version in your configuration.
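The token check above can be sketched without printing the secret to the terminal. Both function names below are hypothetical. The second one assumes curl is installed and that the token is allowed to read the account endpoint. If your configuration passes the token through a Terraform variable instead of the environment, check the value supplied there as well.

```shell
# Report whether a token value is set without echoing the secret itself.
# $1: the token value, for example "$DIGITALOCEAN_TOKEN"
token_status() {
  if [ -z "$1" ]; then
    echo "unset"
  else
    echo "set (${#1} characters)"
  fi
}

# Ask the DigitalOcean API whether it accepts the token and print only
# the HTTP status code. A 401 here would match the error in the question.
# $1: the token value
api_status() {
  curl -s -o /dev/null -w '%{http_code}' \
    -H "Authorization: Bearer $1" \
    https://api.digitalocean.com/v2/account
}
```

A possible sequence is token_status "$DIGITALOCEAN_TOKEN" followed by api_status "$DIGITALOCEAN_TOKEN", run in the same shell session that runs terraform plan. If the second call prints 401, the token itself is being rejected (for instance because it expired or was revoked), and no Terraform-side change will fix that.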
Best,
Bobby