I’ve woken up to find my services have all fallen over. Investigating, I’ve found that all my k8s nodes are in the NotReady
state. Deploying isn’t working.
No notifications about this happening. No emails. Nothing from DO to say “by the way, your nodes have fallen over”.
Can someone from DO help me with this?
Hello @colinjohnriddell,
A NotReady status on a node can have multiple causes.
It is a best practice to treat Kubernetes nodes as ephemeral. Because of this, it is common to recycle a node that has an issue and replace it with a healthy one. This can fix many common node-specific problems. Generally, we see nodes enter the NotReady state due to a lack of resources.
If you want to look into the specific incident, you can review the events around the nodes using the following commands:
    kubectl get nodes
    kubectl describe node <name_of_node>
    kubectl get events -n kube-system
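If several nodes are affected, it may help to pull out just the NotReady ones before describing them. The following is a minimal sketch, not an official DigitalOcean tool: `not_ready_nodes` is a hypothetical helper name, and it assumes the default `kubectl get nodes` table layout, with the node name in the first column and the status in the second.

```shell
# Hypothetical helper: reads `kubectl get nodes` output on stdin and
# prints the names of nodes whose STATUS starts with "NotReady".
# Assumes the default table layout (NAME in column 1, STATUS in column 2).
not_ready_nodes() {
  # NR > 1 skips the header row; the regex also matches combined
  # statuses such as "NotReady,SchedulingDisabled".
  awk 'NR > 1 && $2 ~ /^NotReady/ { print $1 }'
}

# Example usage (requires kubectl configured for your cluster):
#   kubectl get nodes | not_ready_nodes
#   kubectl get nodes | not_ready_nodes | xargs -n1 kubectl describe node
```

Because the helper only reads standard input, you can also run it against saved `kubectl get nodes` output from the time of the incident.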
As for notifications, this feature is not available at present. However, it is already on our roadmap, though I don’t have a specific ETA for it. Our product team is always looking for feature requests and product feedback, so I’d ask you to add or vote on the idea here and subscribe for updates: https://ideas.digitalocean.com/ideas/
We use that page to help gauge demand for new features, so adding it, or adding your vote, will help us to prioritize when we can implement this feature.
I hope this helps!
Best Regards,
Purnima Kumari
Developer Support Engineer II, DigitalOcean
No answer to this. Something happened with DO clusters. DO blamed my nodes being OOM. They’re fine now and they were fine before.
Hi, how are you defining your pods for your service(s)? It’s not clear from your original post. If you have a public repository, can you drop a link in the comments? Well, I must go and I look forward to any additional feedback.
–
Think different and code well,
-Conrad