Question

Kubernetes deployment with external load balancer: zero downtime rollouts

Posted July 31, 2019
Load Balancing Deployment High Availability Kubernetes

Environment

My Kubernetes cluster only has 1 node for now - managed by DigitalOcean.

The web application I deployed runs in 3 pods, all on that one node. I used an external DigitalOcean load balancer to expose the application outside the cluster.

Here are the k8s resource definitions:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: shovik-com
  labels:
    app: shovik-com
spec:
  replicas: 3
  selector:
    matchLabels:
      app: shovik-com
  template:
    metadata:
      labels:
        app: shovik-com
    spec:
      containers:
      - name: shovik-com
        image: aspushkinus/shovik:latest
        imagePullPolicy: Always
        ports:
          - containerPort: 80
        envFrom:
          - secretRef:
              name: shovik-com
---
apiVersion: v1
kind: Service
metadata:
  name: shovik-com-balancer
  annotations:
    service.beta.kubernetes.io/do-loadbalancer-tls-ports: "443"
    service.beta.kubernetes.io/do-loadbalancer-certificate-id: "do-cert-id"
    service.beta.kubernetes.io/do-loadbalancer-redirect-http-to-https: "true"
spec:
  type: LoadBalancer
  selector:
    app: shovik-com
  ports:
    - name: http
      protocol: TCP
      port: 80
      targetPort: 80
    - name: https
      protocol: TCP
      port: 443
      targetPort: 80

This works great and the website is live: https://shovik.com/

The problem

Whenever I deploy a new version of the app using the standard k8s rolling update strategy, the app goes down for about a minute and DigitalOcean’s load balancer responds with “503 Service Unavailable”. This happens even though at least 2 pods are in the “running” status at any given time.

Question

How can I implement a zero-downtime deployment using DigitalOcean’s k8s and load balancer? Should I put another NodePort service in front of the LoadBalancer?


3 answers

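This is fixed now. I asked too soon, but I hope this helps someone else: I had to add a livenessProbe and a readinessProbe to my deployment so that the kubelet checks that my pods are ready before they start accepting traffic. Without a readinessProbe, Kubernetes treats a pod as ready as soon as its containers start, so traffic can reach new pods before the app is able to serve it.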

https://kubernetes.io/docs/tasks/configure-pod-container/configure-liveness-readiness-probes/

Updated deployment resource:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: shovik-com
  labels:
    app: shovik-com
spec:
  replicas: 3
  selector:
    matchLabels:
      app: shovik-com
  template:
    metadata:
      labels:
        app: shovik-com
    spec:
      containers:
      - name: shovik-com
        image: aspushkinus/shovik:latest
        imagePullPolicy: Always
        readinessProbe:
          httpGet:
            path: /
            port: 80
          initialDelaySeconds: 5
          periodSeconds: 5
        livenessProbe:
          httpGet:
            path: /
            port: 80
          initialDelaySeconds: 15
          periodSeconds: 15
        ports:
          - containerPort: 80
        envFrom:
          - secretRef:
              name: shovik-com
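For anyone who still sees occasional 503s after adding the probes, here is a minimal sketch of how the rollout itself might be tightened, assuming the same Deployment as above. The values for maxSurge, maxUnavailable, minReadySeconds, terminationGracePeriodSeconds, and the preStop sleep are illustrative assumptions, not settings from my actual setup, and the preStop hook assumes the image contains a `sleep` binary.

```yaml
# Sketch only: rollout-related fields merged into the Deployment above.
# All numeric values are assumptions and should be tuned for the app.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: shovik-com
spec:
  replicas: 3
  # A new pod must stay ready this long before it counts as available.
  minReadySeconds: 5
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 1        # start one extra pod before removing an old one
      maxUnavailable: 0  # never drop below the desired replica count
  selector:
    matchLabels:
      app: shovik-com
  template:
    metadata:
      labels:
        app: shovik-com
    spec:
      # Upper bound on shutdown time; must exceed the preStop sleep below.
      terminationGracePeriodSeconds: 30
      containers:
      - name: shovik-com
        image: aspushkinus/shovik:latest
        readinessProbe:
          httpGet:
            path: /
            port: 80
          periodSeconds: 5
        lifecycle:
          preStop:
            exec:
              # Delay shutdown so the pod is removed from the Service
              # endpoints before the container receives SIGTERM.
              command: ["sleep", "10"]
```

With maxUnavailable set to 0, an old pod is only terminated after its replacement passes the readiness probe, and the preStop delay gives the load balancer time to stop sending requests to a pod that is shutting down.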

Hi there!

Thank you for providing your solution. I am happy to see this resolved your issue.

To reduce the potential for downtime, I would highly recommend adding at least a second node, ideally three. Single-node clusters are not a solid foundation for production workloads. I would argue that downtime during rollouts matters less, since you control when rollouts happen. Dealing with node and infrastructure outages, which are bound to happen (patching, upgrades, maintenance, failures, etc.), may be a more valuable step in the quest for uptime. Just food for thought.
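Once more nodes are available, something like the following sketch could help the replicas actually benefit from them. It assumes the `app: shovik-com` label from the Deployment in the question; the PodDisruptionBudget name and the `minAvailable` value are hypothetical, and `policy/v1` assumes a reasonably recent cluster (older clusters used `policy/v1beta1`).

```yaml
# Sketch only: add under spec.template.spec of the Deployment so the
# scheduler prefers to place replicas on different nodes.
affinity:
  podAntiAffinity:
    preferredDuringSchedulingIgnoredDuringExecution:
    - weight: 100
      podAffinityTerm:
        labelSelector:
          matchLabels:
            app: shovik-com
        topologyKey: kubernetes.io/hostname
---
# Sketch only: keeps at least 2 pods running during voluntary
# disruptions such as node drains for upgrades or maintenance.
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: shovik-com-pdb   # hypothetical name
spec:
  minAvailable: 2        # assumed value for a 3-replica Deployment
  selector:
    matchLabels:
      app: shovik-com
```

The anti-affinity rule is only a preference, so pods can still be scheduled while the cluster has a single node.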

Regards,

John Kwiatkoski
Senior Developer Support Engineer

  • Hi John,

    All good points, thank you.

    I’m planning to add more nodes to my cluster after I finish migrating from my pre-Kubernetes deployment setup. I need to delete a few old droplets first to manage the cost, as I’m only doing this for my personal projects for now.

    Best,
    Alex Kovshovik.
