systemd dependencies: running a script before shutdown

May 4, 2018
High Availability DigitalOcean Load Balancing Ubuntu 18.04

The requirement seems simple enough (remove a droplet tag before shutting down or rebooting), but I can't get it working.

I have the following script, which works for quickly adding a droplet to, or removing it from, the load balancer tag:
/root/SCRIPTS/loadbalancer.sh

#!/bin/bash
# Add or remove this droplet from the load balancer by toggling the WEB-LBD tag.
API=xyxyxyxyxyxyxyxyxyxyxyx
# Droplet ID: first column of the first listing line that contains this hostname.
SID="$(/snap/bin/doctl -t "$API" compute droplet list --format ID,Name | grep -F -- "$HOSTNAME" | awk '{ print $1; exit }')"

function add {
        /snap/bin/doctl -t "$API" compute droplet tag "$SID" --tag-name=WEB-LBD
}

function remove {
        /snap/bin/doctl -t "$API" compute droplet untag "$SID" --tag-name=WEB-LBD
}

case "$1" in
add)
        echo "Adding Server to LoadBalancer..."
        add
        ;;
remove)
        echo "Removing Server from LoadBalancer..."
        remove
        ;;
debug)
        # Print the droplet ID that was looked up.
        echo "Output:"
        echo "$SID"
        ;;
*)
        echo "(add|remove|debug)"
        ;;
esac

This works great (I've also written another script that calls this one before and after running updates).
However, what I was hoping to do is remove the server from the load balancer automatically before the server is rebooted or shut down.

I've created /etc/systemd/system/loadbalancer-remove.service:

[Unit]
Description=Remove Server from LoadBalancer on shutdown
After=network.target networking.service network-online.target nss-lookup.target systemd-resolved
Requires=network.target networking.service network-online.target nss-lookup.target systemd-resolved

[Service]
Type=oneshot
RemainAfterExit=true
ExecStart=/bin/true
ExecStop=/root/SCRIPTS/loadbalancer.sh remove

[Install]
WantedBy=multi-user.target

The service is active, but the stop command seems to run after the network (or something else it needs) has already gone. According to one systemd developer, "network.target" is all that's needed to ensure this stops before the network is shut down, but the droplet hangs on reboot. I believe this is because doctl needs the network up to communicate and the network is going down before the unit is stopped, but it's hard to confirm because I can't see any output while it's shutting down.
I've tried various combinations of Wants, Requires and After, with the same result. I also found a different way of doing it that uses halt.target, reboot.target and shutdown.target, but that gave me the same issue: a hang on reboot.

Just wondering if anyone has solved anything similar? It's not specifically a doctl issue, but I don't know whether doctl taking a few seconds is allowing the network to be shut down before it has finished.

Any thoughts at all appreciated, I've been googling and trying stuff for 4 hours.
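
For anyone debugging the same thing: since nothing is visible on the console during shutdown, one option is to read what the unit logged from the journal of the previous boot. The sketch below assumes the journal is persistent (for example, /var/log/journal exists); the function name `last_shutdown_log` is made up for illustration. Anything the script echoes from ExecStop should normally end up in the journal too, so the "Removing Server from LoadBalancer..." line is a useful marker.

```shell
#!/bin/bash
# Show what a unit logged during the previous boot, which includes its shutdown.
# Assumption: the journal is persistent; otherwise earlier boots are not kept.
# "last_shutdown_log" is a hypothetical helper name, not part of systemd.
last_shutdown_log() {
    local unit="${1:-loadbalancer-remove.service}"
    journalctl --boot=-1 --unit="$unit" --no-pager
}

# Example, run after the droplet has come back up:
#   last_shutdown_log loadbalancer-remove.service
```

If the output ends partway through the script, that points at the script being stopped early rather than never being run.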

2 Answers

Doing some digging indicates that

After=networking.service

should do the trick. If you remove all other items in After, do you still encounter problems?
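
As a rough sketch of that suggestion, the unit could be trimmed to the single After= line and nothing else. systemd stops units in the reverse of their start order, so a unit ordered after networking.service is stopped before it. The helper name `write_unit` is hypothetical, and whether networking.service is the unit that actually manages the network on a given droplet is an assumption to check. Note also that systemd expects full unit names in After= and Requires=, so a bare `systemd-resolved` would likely need to be `systemd-resolved.service` if it were kept.

```shell
#!/bin/bash
# Write a trimmed loadbalancer-remove.service to the given path.
# Only the single After= line suggested above is kept; Requires= is dropped.
# "write_unit" is a hypothetical helper name used for this sketch.
write_unit() {
    local dest="$1"
    cat > "$dest" <<'EOF'
[Unit]
Description=Remove Server from LoadBalancer on shutdown
After=networking.service

[Service]
Type=oneshot
RemainAfterExit=true
ExecStart=/bin/true
ExecStop=/root/SCRIPTS/loadbalancer.sh remove

[Install]
WantedBy=multi-user.target
EOF
}

# Example (as root). The unit has to be started, not just enabled,
# because ExecStop only runs for a unit that is active at shutdown:
#   write_unit /etc/systemd/system/loadbalancer-remove.service
#   systemctl daemon-reload
#   systemctl enable --now loadbalancer-remove.service
```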

While this configuration should be achievable, it would not handle unplanned downtime such as crashes of the node. Are you monitoring your nodes externally as well to trigger the removal if one of your servers stops responding properly?

  • Thanks for the response. Sorry for the delay; I missed the reply.

    I agree, from everything I've read it should be that, but it still makes the droplet hang on reboot.
    I tried it again today just in case I'd missed something, and it still hangs :(

    Totally agree it won't handle unplanned downtime. Droplets are monitored with Nagios, and while it's not yet configured to handle this, that's a work in progress (along with automatically deploying additional droplets if the servers come under load).
    This really was just a quick 'if we know it's going down, update the load balancer' (in case we haven't already), i.e. the update scripts already remove the tag before starting to apply updates and ask whether you want to add it back once that's finished or leave it off for the reboot.

    A 5-minute job to set it up to fire on shutdown/reboot, lol (no such thing as a 5-minute job).

    • Update on today's testing.
      I added some debugging to the script to write the ifconfig output to a file:

      #!/bin/bash
      API=xyxyxyxyxyxyxyxyxyxyxyx
      # Droplet ID: first column of the first listing line that contains this hostname.
      SID="$(/snap/bin/doctl -t "$API" compute droplet list --format ID,Name | grep -F -- "$HOSTNAME" | awk '{ print $1; exit }')"

      function add {
              /snap/bin/doctl -t "$API" compute droplet tag "$SID" --tag-name=WEB-LBD
      }

      function remove {
              /snap/bin/doctl -t "$API" compute droplet untag "$SID" --tag-name=WEB-LBD
      }

      case "$1" in
      add)
              echo "Adding Server to LoadBalancer..."
              add
              ;;
      remove)
              echo "Removing Server from LoadBalancer..."
              echo "1. Debugging..." >> /tmp2/loadbalancer.txt
              echo "2. Starting remove" >> /tmp2/loadbalancer.txt
              # Step 3: interface state before the remove/sleep.
              ifconfig >> /tmp2/loadbalancer.txt
              # The real remove call is disabled here and replaced by a sleep.
      #       remove
              sleep 30
              echo "4. Finished removing." >> /tmp2/loadbalancer.txt
              # Step 5: interface state after the remove/sleep.
              ifconfig >> /tmp2/loadbalancer.txt
              echo "6. Finished." >> /tmp2/loadbalancer.txt
              ;;
      debug)
              echo "Output:"
              echo "$SID"
              ;;
      *)
              echo "(add|remove|debug)"
              ;;
      esac
      
      

      First I tried it as normal: the output was 1, 2, 3 (3 being the ifconfig output) but nothing more.
      I then commented out the remove call and added a sleep 10; this worked and output 1, 2, 3, 4, 5, 6 (3 and 5 being ifconfig).
      Increasing the sleep to 20 works too, but 30 doesn't.
      What's interesting with sleep 30 is that the script seems to be killed rather than allowed to finish, yet it doesn't cause a hang.
      I'm now thinking the problem may not be the network disappearing, but doctl taking too long and something killing the script before it finishes.

      The work continues.
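
The sleep experiment above points at a time limit rather than a missing network. One way to test that idea, sketched here under assumptions, is to wrap the slow call in coreutils `timeout` and log how long it took and how it exited. The helper name `timed_run` and the default log path are made up for this example; the thread does not establish what is actually killing the script, so this only narrows it down.

```shell
#!/bin/bash
# Hypothetical log location for this sketch (the thread used /tmp2/loadbalancer.txt).
LOG="${LOG:-/tmp/loadbalancer-timing.log}"

# Run a command with a hard time limit and record its exit status and duration.
# Usage: timed_run <limit-seconds> <command> [args...]
# "timed_run" is a hypothetical helper name. timeout exits with status 124
# when the limit is reached and the command had to be terminated.
timed_run() {
    local limit="$1"
    shift
    local start status=0
    start="$(date +%s)"
    timeout "$limit" "$@" || status=$?
    echo "$(date '+%F %T') status=$status elapsed=$(( $(date +%s) - start ))s cmd=$1" >> "$LOG"
    return "$status"
}

# Example inside the remove branch of loadbalancer.sh:
#   timed_run 20 /snap/bin/doctl -t "$API" compute droplet untag "$SID" --tag-name=WEB-LBD
```

A logged status of 124 would mean doctl itself ran past the limit. If the log line never appears at all, the script was probably killed from outside before the call returned, in which case the unit's stop timeout (TimeoutStopSec= in the [Service] section) is one setting worth looking at.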
