Question

Kubernetes deployment with external load balancer: zero downtime rollouts

Environment

My Kubernetes cluster has only one node for now, managed by DigitalOcean.

The web application that I deployed runs in 3 pods, all on that one node. I used an external DigitalOcean load balancer to expose the application outside the cluster.

Here are the k8s resource definitions:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: shovik-com
  labels:
    app: shovik-com
spec:
  replicas: 3
  selector:
    matchLabels:
      app: shovik-com
  template:
    metadata:
      labels:
        app: shovik-com
    spec:
      containers:
      - name: shovik-com
        image: aspushkinus/shovik:latest
        imagePullPolicy: Always
        ports:
          - containerPort: 80
        envFrom:
          - secretRef:
              name: shovik-com
---
apiVersion: v1
kind: Service
metadata:
  name: shovik-com-balancer
  annotations:
    service.beta.kubernetes.io/do-loadbalancer-tls-ports: "443"
    service.beta.kubernetes.io/do-loadbalancer-certificate-id: "do-cert-id"
    service.beta.kubernetes.io/do-loadbalancer-redirect-http-to-https: "true"
spec:
  type: LoadBalancer
  selector:
    app: shovik-com
  ports:
    - name: http
      protocol: TCP
      port: 80
      targetPort: 80
    - name: https
      protocol: TCP
      port: 443
      targetPort: 80

This works great and the website is live: https://shovik.com/

The problem

Whenever I deploy a new version of the app using the standard k8s rolling update strategy, my app goes down for a minute and DigitalOcean’s load balancer responds with “503 Service Unavailable”. This happens even though at least 2 pods are in the “Running” status at any given time.

Question

How can I implement a zero-downtime deployment using DigitalOcean’s k8s and load balancer? Should I put another NodePort service in front of the LoadBalancer?


Accepted Answer

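This is fixed now. I asked too soon, but I hope this will help someone else: I had to add a livenessProbe and a readinessProbe to my deployment, so that the kubelet checks whether my pods are ready before they start accepting traffic. Without a readinessProbe, a pod counts as ready as soon as its container starts, so the rollout can replace old pods and send traffic to new ones before the app inside can respond.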

https://kubernetes.io/docs/tasks/configure-pod-container/configure-liveness-readiness-probes/

Updated deployment resource:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: shovik-com
  labels:
    app: shovik-com
spec:
  replicas: 3
  selector:
    matchLabels:
      app: shovik-com
  template:
    metadata:
      labels:
        app: shovik-com
    spec:
      containers:
      - name: shovik-com
        image: aspushkinus/shovik:latest
        imagePullPolicy: Always
        readinessProbe:
          httpGet:
            path: /
            port: 80
          initialDelaySeconds: 5
          periodSeconds: 5
        livenessProbe:
          httpGet:
            path: /
            port: 80
          initialDelaySeconds: 15
          periodSeconds: 15
        ports:
          - containerPort: 80
        envFrom:
          - secretRef:
              name: shovik-com
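
If 503s still show up occasionally during a rollout, a possible further refinement is to state the rollout limits explicitly and give terminating pods a short drain period. This is only a sketch and not part of the original fix: the `strategy` values, the `preStop` sleep, and its 10-second duration are assumptions, and the `sleep` command assumes the image ships that binary.

```yaml
# Sketch only: fields marked "assumption" are not in the original answer.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: shovik-com
  labels:
    app: shovik-com
spec:
  replicas: 3
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxUnavailable: 0   # assumption: keep every old pod until its replacement is ready
      maxSurge: 1         # assumption: start one extra pod at a time
  selector:
    matchLabels:
      app: shovik-com
  template:
    metadata:
      labels:
        app: shovik-com
    spec:
      # Must be longer than the preStop sleep below; 30 is the Kubernetes default.
      terminationGracePeriodSeconds: 30
      containers:
      - name: shovik-com
        image: aspushkinus/shovik:latest
        imagePullPolicy: Always
        readinessProbe:
          httpGet:
            path: /
            port: 80
          initialDelaySeconds: 5
          periodSeconds: 5
        lifecycle:
          preStop:
            exec:
              # assumption: wait before SIGTERM so in-flight requests can finish
              # while the pod is being removed from the Service endpoints.
              command: ["sleep", "10"]
        ports:
          - containerPort: 80
        envFrom:
          - secretRef:
              name: shovik-com
```

With `maxUnavailable: 0`, the rollout waits for each new pod to pass its readinessProbe before an old pod is terminated, which is why the probe matters here.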

John Kwiatkoski
DigitalOcean Employee
July 31, 2019

Hi there!

Thank you for providing your solution. I am happy to see this resolved your issue.

To reduce the potential for downtime, I would highly recommend adding at least a second node, ideally three. Single-node clusters are not a solid foundation for production workloads. I would argue that downtime during rollouts is the less important concern, since you control when rollouts happen. Dealing with node and infrastructure outages, which are bound to happen (patching, upgrades, maintenance, failure, etc.), may be a more valuable step in our quest for uptime. Just food for thought.

Regards,

John Kwiatkoski, Senior Developer Support Engineer
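
For the point above about patching, upgrades, and maintenance, one option once the cluster has more than one node is a PodDisruptionBudget. The sketch below is an assumption rather than something suggested in the thread: the resource name `shovik-com-pdb` and the `minAvailable` value are hypothetical. It asks Kubernetes to keep at least 2 pods available during voluntary disruptions such as node drains. It does not protect against a node failing outright.

```yaml
# Sketch only: name and minAvailable are illustrative assumptions.
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: shovik-com-pdb   # hypothetical name
spec:
  minAvailable: 2        # assumption: 2 of the 3 replicas must stay up during a drain
  selector:
    matchLabels:
      app: shovik-com    # matches the pod labels from the Deployment
```

On a single-node cluster this budget cannot be honored while that node is drained, so it only becomes useful after adding nodes.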
