Kubernetes deployment with external load balancer: zero downtime rollouts

July 31, 2019
Kubernetes Deployment Load Balancing High Availability

Environment

My Kubernetes cluster, managed by DigitalOcean, has only 1 node for now.

The web application that I deployed runs in 3 pods, all on that ONE node. I used DigitalOcean's external load balancer to expose the application outside the cluster.

Here are the k8s resource definitions:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: shovik-com
  labels:
    app: shovik-com
spec:
  replicas: 3
  selector:
    matchLabels:
      app: shovik-com
  template:
    metadata:
      labels:
        app: shovik-com
    spec:
      containers:
      - name: shovik-com
        image: aspushkinus/shovik:latest
        imagePullPolicy: Always
        ports:
          - containerPort: 80
        envFrom:
          - secretRef:
              name: shovik-com
---
apiVersion: v1
kind: Service
metadata:
  name: shovik-com-balancer
  annotations:
    service.beta.kubernetes.io/do-loadbalancer-tls-ports: "443"
    service.beta.kubernetes.io/do-loadbalancer-certificate-id: "do-cert-id"
    service.beta.kubernetes.io/do-loadbalancer-redirect-http-to-https: "true"
spec:
  type: LoadBalancer
  selector:
    app: shovik-com
  ports:
    - name: http
      protocol: TCP
      port: 80
      targetPort: 80
    - name: https
      protocol: TCP
      port: 443
      targetPort: 80

This works great and the website is live: https://shovik.com/

The problem

Whenever I deploy a new version of the app using the standard k8s rolling update strategy, the app goes down for about a minute and DigitalOcean's load balancer responds with "503 Service Unavailable". This happens even though at least 2 pods are in the "running" status at any given time.

Question

How can I implement a zero-downtime deployment using DigitalOcean's k8s and load balancer? Should I put another NodePort service in front of the LoadBalancer?

3 Answers
alexkovshovik July 31, 2019
Accepted Answer

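This is fixed now. I asked too soon, but I hope this helps someone else: I had to add a livenessProbe and a readinessProbe to my deployment so that the kubelet checks whether each pod is ready before it starts receiving traffic. Without a readinessProbe, a pod is treated as ready as soon as its container starts, so the Service can route requests to it before the app is actually listening.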

https://kubernetes.io/docs/tasks/configure-pod-container/configure-liveness-readiness-probes/

Updated deployment resource:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: shovik-com
  labels:
    app: shovik-com
spec:
  replicas: 3
  selector:
    matchLabels:
      app: shovik-com
  template:
    metadata:
      labels:
        app: shovik-com
    spec:
      containers:
      - name: shovik-com
        image: aspushkinus/shovik:latest
        imagePullPolicy: Always
        readinessProbe:
          httpGet:
            path: /
            port: 80
          initialDelaySeconds: 5
          periodSeconds: 5
        livenessProbe:
          httpGet:
            path: /
            port: 80
          initialDelaySeconds: 15
          periodSeconds: 15
        ports:
          - containerPort: 80
        envFrom:
          - secretRef:
              name: shovik-com
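
For anyone who wants the rollout behaviour spelled out instead of relying on the defaults, here is a minimal sketch of the extra fields that could be merged into the Deployment above. The specific numbers are my own illustrative assumptions, not values I tested on this cluster; the readinessProbe is still what decides when a new pod counts as ready.

```yaml
# Fragment of the Deployment spec above; only the added fields are shown.
# All numbers are illustrative assumptions, not values from the original setup.
spec:
  replicas: 3
  minReadySeconds: 10       # a new pod must stay ready this long before it counts as available
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 1           # start at most one extra pod during the rollout
      maxUnavailable: 0     # do not take an old pod away until its replacement is ready
```

With maxUnavailable set to 0, the rollout can only proceed as fast as new pods pass their readiness checks, which trades rollout speed for keeping all 3 replicas serving.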

Hi there!

Thank you for providing your solution. I am happy to see this resolved your issue.

To reduce the potential for downtime, I would highly recommend adding at least a second node, ideally a third. Single-node clusters are not a solid foundation for production workloads. I would argue that downtime during rollouts is the less important concern, since rollouts are controlled by you. Dealing with node and infrastructure outages, which are bound to happen (patching, upgrades, maintenance, failures, etc.), may be a more valuable step in our quest for uptime. Just food for thought.
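
Once more nodes are available, something along these lines might help keep the replicas spread out and protected during node maintenance. This is only a sketch under assumptions: the PodDisruptionBudget name is hypothetical, the minAvailable value is a guess based on the 3 replicas above, and policy/v1 requires a reasonably recent Kubernetes version (older clusters used policy/v1beta1).

```yaml
# Keep at least 2 of the 3 pods running during voluntary disruptions
# such as node drains for upgrades. The name below is hypothetical.
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: shovik-com-pdb
spec:
  minAvailable: 2
  selector:
    matchLabels:
      app: shovik-com
---
# Fragment to merge into the existing Deployment (not a complete manifest):
# ask the scheduler to spread the pods across nodes where possible.
spec:
  template:
    spec:
      topologySpreadConstraints:
        - maxSkew: 1
          topologyKey: kubernetes.io/hostname
          whenUnsatisfiable: ScheduleAnyway   # prefer spreading, but still schedule on a single node
          labelSelector:
            matchLabels:
              app: shovik-com
```

ScheduleAnyway is chosen here so the same manifest still works while the cluster has only one node; DoNotSchedule would be the stricter option once several nodes exist.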

Regards,

John Kwiatkoski
Senior Developer Support Engineer

  • Hi John,

    All good points, thank you.

    I'm planning to add more nodes to my cluster after I finish migrating from my pre-Kubernetes deployment setup. I need to delete a few old droplets first to manage the cost, as I'm doing this for my personal projects for now.

    Best,
    Alex Kovshovik.
