Question

Kubernetes automatic upgrades and cert-manager

Posted December 28, 2020 498 views
Kubernetes

I am using a k8s cluster which includes cert-manager for SSL.

It’s a basic setup following: https://www.digitalocean.com/community/tutorials/how-to-set-up-an-nginx-ingress-with-cert-manager-on-digitalocean-kubernetes

However, as a result I am getting errors on automatic k8s upgrades:

Validating webhook with a TimeoutSeconds value greater than 29 seconds will block upgrades.
Validating webhook is configured in such a way that it may be problematic during upgrades.

My understanding is those are due to timeoutSeconds and failurePolicy, as seen here:
https://github.com/jetstack/cert-manager/search?q=failurePolicy

  timeoutSeconds: {{ .Values.webhook.timeoutSeconds }}
  failurePolicy: Fail

What’s the correct way to have both cert-manager and k8s auto-upgrades?

1 answer

Hello @jareksk ,

It appears your cluster's webhooks may be preventing the cluster from functioning. Can you make sure that the webhooks in your cluster have their failurePolicy set to 'Ignore'? Depending on the type of webhook, list them with one of these two commands:

kubectl get validatingwebhookconfigurations

kubectl get mutatingwebhookconfigurations

Then edit the failurePolicy to 'Ignore' using:

kubectl edit <webhook_type_from_above> <name_of_webhook>
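The same change can also be made without an interactive editor. The following is a minimal sketch, not an official procedure: it assumes the webhook configuration is named `cert-manager-webhook` (a placeholder, use a name returned by the `get` commands above) and that the entry to change is the first one in its `webhooks` list. The helper name `failure_policy_patch` and the `APPLY` switch are made up for this illustration.

```shell
#!/bin/sh
# Assumed name; replace with one listed by "kubectl get validatingwebhookconfigurations".
WEBHOOK_NAME="${WEBHOOK_NAME:-cert-manager-webhook}"

# Print a JSON Patch that sets failurePolicy to Ignore
# on the webhook entry at the given list index.
failure_policy_patch() {
  printf '[{"op":"replace","path":"/webhooks/%s/failurePolicy","value":"Ignore"}]' "$1"
}

# Only touch the cluster when explicitly asked to (APPLY=1).
# Index 0 assumes the configuration holds a single webhook entry.
if [ "${APPLY:-0}" = "1" ]; then
  kubectl patch validatingwebhookconfiguration "$WEBHOOK_NAME" \
    --type=json -p "$(failure_policy_patch 0)"
fi
```

For a mutating webhook, the resource type in the `kubectl patch` call would be `mutatingwebhookconfiguration` instead. If a configuration contains several webhook entries, each index needs its own patch.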

Setting the failurePolicy of these webhooks to 'Ignore' allows the cluster to keep communicating while the webhook is unresponsive. Otherwise, these webhooks can cause issues during upgrades or while bootstrapping infrastructure resources.

For example, if the node hosting the webhook goes down, the webhook will not respond to requests. This is problematic when nodes try to register with the master API and the master API cannot reach the webhook. The node registration fails (because the failurePolicy is set to 'Fail') while waiting for the webhook to become available, but the webhook often can't be rescheduled until a new node has been created. The result is a circular dependency. This is why we recommend setting the failure policy to 'Ignore' until we find a better way of handling these scenarios.

For the webhook timeout error, please update the timeoutSeconds value according to the steps given here.
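As a rough sketch of what checking and lowering the timeout could look like with kubectl alone: the configuration name `cert-manager-webhook`, the index 0, the target value of 10 seconds, and the helper name `timeout_patch` are all assumptions chosen for illustration. The only constraint taken from the linter message is that the value must not exceed 29.

```shell
#!/bin/sh
# Assumed name; replace with one listed by "kubectl get validatingwebhookconfigurations".
WEBHOOK_NAME="${WEBHOOK_NAME:-cert-manager-webhook}"

# Print a JSON Patch that sets timeoutSeconds (second argument)
# on the webhook entry at the given list index (first argument).
timeout_patch() {
  printf '[{"op":"replace","path":"/webhooks/%s/timeoutSeconds","value":%s}]' "$1" "$2"
}

# Only touch the cluster when explicitly asked to (APPLY=1).
if [ "${APPLY:-0}" = "1" ]; then
  # Show the current timeout of every entry in this configuration.
  kubectl get validatingwebhookconfiguration "$WEBHOOK_NAME" \
    -o jsonpath='{.webhooks[*].timeoutSeconds}'
  echo
  # Lower the first entry to 10 seconds (any value of 29 or less satisfies the linter).
  kubectl patch validatingwebhookconfiguration "$WEBHOOK_NAME" \
    --type=json -p "$(timeout_patch 0 10)"
fi
```

If cert-manager was installed from its Helm chart, the template quoted in the question suggests the timeout comes from the `webhook.timeoutSeconds` chart value, so a manual patch may be overwritten the next time the chart is upgraded unless that value is changed as well.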

Best Regards,
Purnima Kumari
Developer Support Engineer II - DigitalOcean

  • Hello @Purnima,

    I have the same problem as the original author and tried following your advice. However, after successfully editing both configurations I still get the same two errors from the linter:

    Validating webhook with a TimeoutSeconds value greater than 29 seconds will block upgrades.
    Mutating webhook with a TimeoutSeconds value greater than 29 seconds will block upgrades.
    

    Do I have to tell kubernetes that I have edited the configurations somehow?

    Cheers
    Finn

  • @Purnima, this worked for me. thank you so much!