Question

Managed Kubernetes not working - cilium can't connect to etcd

Hi guys,

I already opened a support ticket, but I have had no reply for 3 days, so I wanted to try here too.

I use the managed Kubernetes service with Rancher and had it running smoothly. Then on Monday morning, it suddenly stopped reporting to Rancher and the deployed websites stopped working. I checked all pods and saw that the cilium pods are restarting constantly and most other pods are stuck in ContainerCreating.

It seems like the cilium pods can’t reach the etcd node anymore. This is the log of one cilium node: https://gist.github.com/DTrierweiler/f2eecb5568fdf899695cb6f644318ffb I even downloaded the certs from the secret and tried to connect to etcd from my local machine with curl, which worked without problems.

Could this be related to DNS problems? The 2 coredns pods are not running either, because they are also stuck in ContainerCreating.

Thanks a lot for your help. Besides this, is it normal for the support to take so much time? I have had an unusable cluster for 4 days now, which costs me $200 per month, and my websites are not running. Luckily this is still only staging and not production.

Cheers, Daniel



jarland
DigitalOcean Employee
March 7, 2019
Accepted Answer

Hey friend,

Per Nicholas from our support team:

We have seen a similar report of pods stuck in a ContainerCreating state, and there might be a Cilium dependency issue. If you run kubectl -n kube-system edit ds cilium, what is dnsPolicy set to? If you change that to “ClusterFirst” or “Default”, does that resolve the issue?
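
For anyone who prefers not to open an editor, the same check and change can be sketched non-interactively. This is a minimal sketch, not an official procedure: the two function names are made up for illustration, and it assumes the DaemonSet is named cilium in the kube-system namespace (as in the command above) and that kubectl is already pointed at the affected cluster.

```shell
#!/bin/sh
# Hypothetical helper names; only the kubectl subcommands are real.

# Print the dnsPolicy currently set in the cilium DaemonSet's pod template.
cilium_dns_policy() {
  kubectl -n kube-system get ds cilium \
    -o jsonpath='{.spec.template.spec.dnsPolicy}'
}

# Set dnsPolicy on the cilium DaemonSet with a merge patch.
# The first argument is the policy to apply; it defaults to ClusterFirst.
set_cilium_dns_policy() {
  policy="${1:-ClusterFirst}"
  kubectl -n kube-system patch ds cilium --type merge \
    -p "{\"spec\":{\"template\":{\"spec\":{\"dnsPolicy\":\"${policy}\"}}}}"
}

# Example usage (commented out so that sourcing this file changes nothing):
#   cilium_dns_policy
#   set_cilium_dns_policy ClusterFirst
#   kubectl -n kube-system rollout status ds/cilium
```

Whether the cilium pods are replaced automatically after the patch depends on the DaemonSet's update strategy, so it may be necessary to delete the old pods by hand. On a managed cluster the provider may also overwrite manual changes to this DaemonSet later.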

I also wanted to quickly address this question:

is it normal for the support to take so much time?

It varies a bit. Our intention is to provide you with all of the things you need to troubleshoot and repair problems from your side, without having to wait for a response from our team. On the rare occasion that you do not have the ability to resolve an issue on your side and our intervention is required, such a wait time is obviously unacceptable, and it is something we are working very hard on improving. By continually exposing customers to the right information up front, and getting better about providing a clear user experience as we go, we hope to see more customers empowered to solve problems so that we can be more available for the rare occasions when you absolutely need us.

Jarland

I am in the same boat, waiting for support to address a related issue, though my dnsPolicy is already set to ClusterFirst. I find myself debugging cilium issues so frequently that I am questioning whether DigitalOcean’s offering is truly a “managed” offering.

At least if I hosted my own Kubernetes distribution, I would have some control over the setup, as opposed to having to wait a few days for an answer.

Hey Jarland,

Thank you so much. The dnsPolicy was set to ClusterFirstWithHostNet and a change to ClusterFirst did the trick. It’s running again.

Do you know why it was set to ClusterFirstWithHostNet and why it stopped working from one moment to the next? The documentation says this value should only be used when you use hostNetwork: true, which is not the case.

I think you do a good job of providing a lot of information to fix and repair problems, but it all comes down to those rare occasions you mentioned (like in this case). I’m still not sure why the ticket was unanswered for 3 days. Correct me if I’m wrong, but I thought the main benefit of having a managed Kubernetes service is not having to worry about this exact problem.

Anyway - thanks a lot for the reply. Maybe it’ll help someone else as well :)

Cheers
