Expected behavior when nodes are powered down

November 19, 2019
Load Balancing Kubernetes

I set up a cluster with three nodes. They all got added to the LB and everything appeared fine. Then, to simulate a failure, I powered down the first node droplet, expecting the LB to just stop sending traffic to it. However, to my surprise, all nodes are now showing as down, even though the individual droplets appear to be up. I even brought the first node droplet back up and the LB is still showing them all as down. What did I do wrong, and what should I have expected to happen? Or is this a bug?

5 comments
  • Hi there,

    Can you provide a bit more detail on the configuration of the service object that the LB is serving? How many pods are you using as valid endpoints for the service?

    Regards,

    John Kwiatkoski
    Senior Developer Support Engineer

  • Hi John,

    There were three nodes, each with one or two pods. I powered down the first VM, and the LB did not redistribute traffic to the two remaining nodes that were still up. When I powered it back on, the LB showed all nodes as down. The only way I could get it to recover was to create a new node pool and delete the original.

    Not sure if it's a misconfiguration on my side; I am happy to post my YAML files if that would be helpful.

    Scott

  • I am using the following service. I have a feeling that for some reason the LB is not able to accurately detect when the node/pod is ready to serve requests. I have run into this problem a couple of other times where the LB seems to get “stuck” with one or more nodes showing as down, even though querying the pods with kubectl describe seems to show they are operating normally.

    apiVersion: v1
    kind: Service
    metadata:
      name: helpymtlb
    spec:
      type: LoadBalancer
      selector:
        app: helpymt
      ports:
        - protocol: TCP
          port: 80
          targetPort: 3000
          name: http
    
  • @jkwiatkoski Bump. Could you let me know if I am doing something wrong?

  • The configuration looks pretty straightforward. I would like to highlight that the LB will behave differently based on the service’s externalTrafficPolicy (there is a rough sketch of what I mean below).

    The LB only serves the service it was created for; its health checks do not consider any other pods. Did you have a pod for the ‘helpymtlb’ service on each node?
    What size nodes are you using? I often find issues pop up with the smallest node sizes, as they are meant for exploration and education rather than production workloads. OOM/resource pressure can cause weird behavior on them, such as pods responding too slowly to their readiness probe and showing as down to the LB.
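
    To make the externalTrafficPolicy point concrete, here is a rough sketch using the same names and ports as your manifest; the added field is only for illustration, not something I can see on your cluster. With “Local”, only nodes that are running a ready ‘helpymt’ pod pass the LB health check, so a node without one will show as down even though the droplet itself is fine; with the default “Cluster”, every node should pass as long as kube-proxy can route to at least one ready pod somewhere in the cluster.

    apiVersion: v1
    kind: Service
    metadata:
      name: helpymtlb
    spec:
      type: LoadBalancer
      # "Local": only nodes with a ready local pod for this service pass the
      # LB health check. The default "Cluster" lets any node pass as long as
      # kube-proxy on it can reach a ready pod anywhere in the cluster.
      externalTrafficPolicy: Local
      selector:
        app: helpymt
      ports:
        - protocol: TCP
          port: 80
          targetPort: 3000
          name: http

    You can also double-check which pods are actually registered behind the service with kubectl get endpoints helpymtlb; if that list is empty or stale after a node comes back, the LB showing nodes as down is what I would expect.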

1 Answer

Hi, I’m not sure whether DO has out-of-the-box support for node auto-provisioning here. From a quick read, it doesn’t appear to be enabled by default, so I believe this is the expected behavior if you didn’t set it up during provisioning. I recommend taking a look at their online documentation to get it set up if you haven’t done so already:

https://www.digitalocean.com/docs/kubernetes/how-to/autoscale
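
For reference, if I remember the doctl flags correctly, enabling autoscaling on an existing node pool looks roughly like the following. The cluster and node-pool names here are placeholders, so substitute your own and double-check the flags against the doctl help output:

    # Placeholder cluster and node-pool names; adjust the bounds to your needs.
    doctl kubernetes cluster node-pool update my-cluster my-pool \
      --auto-scale \
      --min-nodes 1 \
      --max-nodes 3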

I hope this information helps you, and have a great day.
