Kubernetes automatic upgrades and cert-manager

I am using a k8s cluster that includes cert-manager for SSL.

It’s a basic setup following:

However, as a result I am getting errors on automatic k8s upgrades:

Validating webhook with a TimeoutSeconds value greater than 29 seconds will block upgrades.
Validating webhook is configured in such a way that it may be problematic during upgrades.

My understanding is those are due to timeoutSeconds and failurePolicy, as seen here:

  timeoutSeconds: {{ .Values.webhook.timeoutSeconds }}
  failurePolicy: Fail

What’s the correct way to have both cert-manager and k8s auto-upgrades?


Hello @jareksk,

It appears your cluster’s webhooks may be preventing the cluster from functioning properly. Can you make sure that the webhooks in your cluster have their failurePolicy set to ‘Ignore’? Depending on the type of webhook, you can list them with one of these two commands:

kubectl get validatingwebhookconfigurations

kubectl get mutatingwebhookconfigurations

Then edit the failurePolicy to ‘Ignore’ using:

kubectl edit <webhook_type_from_above> <name_of_webhook>
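
If you would rather not edit the object interactively, the same change can probably be made with a JSON patch. The snippet below is only a sketch: the helper name failure_policy_patch is made up for illustration, and it assumes the webhook you want to change is the first entry (index 0) in the configuration's webhooks list. It only builds and prints the patch; it does not touch the cluster.

```shell
#!/bin/sh
# Build a JSON Patch that sets failurePolicy on one entry of a webhook
# configuration's "webhooks" list.
#   $1 = zero-based index into .webhooks (assumption: 0 for a single webhook)
#   $2 = policy, either Ignore or Fail
failure_policy_patch() {
  case "$2" in
    Ignore|Fail) ;;
    *) echo "policy must be Ignore or Fail" >&2; return 1 ;;
  esac
  printf '[{"op":"replace","path":"/webhooks/%s/failurePolicy","value":"%s"}]\n' "$1" "$2"
}

# Print the patch for the first webhook entry.
failure_policy_patch 0 Ignore
```

To apply it, pass the printed patch to kubectl patch. The name cert-manager-webhook is an assumption based on a default cert-manager install; use whatever name the get command above shows for your cluster:

kubectl patch validatingwebhookconfiguration cert-manager-webhook --type=json -p "$(failure_policy_patch 0 Ignore)"

If cert-manager was installed with Helm, a later helm upgrade will most likely overwrite a manual change like this, so it may need to be reapplied.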

Setting the failurePolicy of these webhooks to ‘Ignore’ allows the cluster to keep communicating while the webhook is unresponsive. With ‘Fail’, an unresponsive webhook can cause issues during upgrades or while infrastructure resources are being bootstrapped.

For example, if a node hosting the webhook goes down, the webhook will not respond to requests. This is problematic when nodes try to register with the master API and the master API cannot reach the webhook. The node registration fails (because the failurePolicy is set to ‘Fail’) while waiting for the webhook to become available, but the webhook often can’t be rescheduled until a new node has been created. This creates a circular dependency. That is why we recommend setting the failure policy to ‘Ignore’ until we find a better way of handling these scenarios.

For the webhook timeout error, please update timeoutSeconds according to the steps given here.
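
As a rough illustration of what that change could look like, here is a sketch that builds a JSON patch for timeoutSeconds. The helper name timeout_patch is hypothetical, the webhook is again assumed to be at index 0, and the upper bound of 29 is taken from the error message quoted in the question. It only prints the patch.

```shell
#!/bin/sh
# Build a JSON Patch that sets timeoutSeconds on one entry of a webhook
# configuration's "webhooks" list.
#   $1 = zero-based index into .webhooks (assumption: 0 for a single webhook)
#   $2 = timeout in whole seconds, 1 to 29 (29 is the limit named in the
#        upgrade error message)
timeout_patch() {
  case "$2" in
    ''|0*|*[!0-9]*)
      echo "timeout must be a positive whole number without leading zeros" >&2
      return 1 ;;
  esac
  if [ "$2" -gt 29 ]; then
    echo "timeout must not be greater than 29" >&2
    return 1
  fi
  printf '[{"op":"replace","path":"/webhooks/%s/timeoutSeconds","value":%s}]\n' "$1" "$2"
}

# Print the patch that sets a 10 second timeout on the first webhook entry.
timeout_patch 0 10
```

It can be applied the same way as any other patch, for example (the configuration name is an assumption, check it with the get commands above):

kubectl patch validatingwebhookconfiguration cert-manager-webhook --type=json -p "$(timeout_patch 0 10)"

Since the template quoted in the question reads the timeout from .Values.webhook.timeoutSeconds, a Helm-managed install can presumably set that value instead, e.g. with --set webhook.timeoutSeconds=10 on helm upgrade, so the setting survives later chart upgrades.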

Best Regards,
Purnima Kumari
Developer Support Engineer II - DigitalOcean