Question

Droplet stays alive but "falls off" the network. How can I reboot automatically when this condition is detected?

Quickest version of question: What’s the simplest way to detect a network error (say, 8 failed pings in a row) and then trigger a reboot?

A bit more detail

I don’t seem to be able to resolve a networking issue (more on that below) and my only choice seems to be a complete rebuild.

Ubuntu 16.04 (Xenial Xerus) is due out on 21st April. Does anyone know how quickly DO gets the new images ready to deploy as a droplet? It’s a fairly complex build, I want to move to PHP 7 at the same time, and I don’t have time to check everything this week.

In depth - diagnostic level!

In the meantime, I have to keep this droplet up, and this problem has been plaguing my Ubuntu 15.10 (kernel 4.2.0-34-generic, i686) VPS for a couple of months now.

There will be a small (2-3Mb/s) spike in outgoing network traffic, and then the VPS will cease to respond to any connections on any port, with the exception of DigitalOcean’s proprietary hypervisor-level console “which is like plugging a keyboard in”. When I log in that way, all services seem to be up. But the ONLY way to get the system back online is a `shutdown -r now`.

Support say they’ve investigated and can find no reason for this.

All I know is that downloading any large file triggers this condition, and that at the same time, the time the weekly backups take changed from 32 mins to about 15 mins, despite the droplet remaining the same size.

I’ve checked that there’s no fail2ban weirdness going on. I’ve tried running without the iptables firewall, and I’ve looked for clues in the nginx webserver logs, but there’s nothing I can find. Also, DigitalOcean’s control panel keeps logging throughout the outage, and there is no CPU load spike either (so I doubt it’s a DDoS).

Anyone got any ideas?

I’ve got an NFS connection to another machine, and this is what syslog and kern.log are showing at the times the machine last dropped off the network:

syslog

```
Mar 20 11:17:01 tns2000 CRON[8852]: (root) CMD (   cd / && run-parts --report /etc/cron.hourly)
Mar 20 11:29:00 tns2000 kernel: [79745.504204] nfs: server 46.101.85.XX not responding, timed out
Mar 20 11:32:11 tns2000 kernel: [79935.988165] nfs: server 46.101.85.XX not responding, timed out
Mar 20 11:32:11 tns2000 kernel: [79935.988236] nfs: server 46.101.85.XX not responding, timed out
Mar 20 11:32:33 tns2000 systemd-timesyncd[369]: Timed out waiting for reply from 91.189.94.4:123 (ntp.ubuntu.com).
Mar 20 11:32:43 tns2000 systemd-timesyncd[369]: Timed out waiting for reply from 91.189.89.199:123 (ntp.ubuntu.com).
Mar 20 11:32:53 tns2000 systemd-timesyncd[369]: Timed out waiting for reply from [2001:67c:1560:8003::c7]:123 (ntp.ubuntu.com).
Mar 20 11:35:40 tns2000 kernel: [80145.500142] nfs: server 46.101.85.XX not responding, timed out
```

kern.log

```
Mar 20 11:29:00 tns2000 kernel: [79745.504204] nfs: server 46.101.85.XX not responding, timed out
Mar 20 11:32:11 tns2000 kernel: [79935.988165] nfs: server 46.101.85.XX not responding, timed out
Mar 20 11:32:11 tns2000 kernel: [79935.988236] nfs: server 46.101.85.XX not responding, timed out
Mar 20 11:35:40 tns2000 kernel: [80145.500142] nfs: server 46.101.85.XX not responding, timed out
```

Answer

First, I wanted to let you know that we will have images for 16.04 available very soon after the official release.

As for automatic reboots or failover, you may want to look at https://www.digitalocean.com/community/tutorials/how-to-create-a-high-availability-setup-with-heartbeat-and-floating-ips-on-ubuntu-14-04 or https://www.digitalocean.com/community/tutorials/how-to-set-up-highly-available-web-servers-with-keepalived-and-floating-ips-on-ubuntu-14-04. Either setup can restart or move traffic to another server while your Droplet is having issues.
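For the simpler watchdog asked about in the question (8 failed pings in a row, then reboot), here is a minimal sketch in Python. It is only an illustration under some assumptions: the target host `8.8.8.8`, the 15-second interval, and the names `ping_once` and `watch` are all made up for this example, and the `ping -c 1 -W` flags are the Linux iputils ones. Pick a target you trust to always answer, otherwise an outage on their side will reboot your Droplet.

```python
#!/usr/bin/env python3
"""Reboot after N consecutive failed pings. A sketch only; must run as root."""
import subprocess
import sys
import time

TARGET = "8.8.8.8"       # assumption: a host that should always answer ping
MAX_FAILURES = 8          # from the question: 8 failed pings in a row
INTERVAL_SECONDS = 15     # assumption: pause between probes


def ping_once(host, timeout=5):
    """Return True if a single ICMP echo request to host gets a reply."""
    status = subprocess.call(
        ["ping", "-c", "1", "-W", str(timeout), host],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return status == 0


def reboot():
    """Run the same command the question uses to recover by hand."""
    subprocess.call(["shutdown", "-r", "now"])


def watch(probe, on_failure, max_failures=MAX_FAILURES,
          interval=INTERVAL_SECONDS, sleep=time.sleep):
    """Call probe() forever; after max_failures failures in a row,
    call on_failure() once and return. Any success resets the count."""
    failures = 0
    while True:
        if probe():
            failures = 0
        else:
            failures += 1
            if failures >= max_failures:
                on_failure()
                return
        sleep(interval)


# Require an explicit --run flag so importing or testing never reboots.
if __name__ == "__main__" and sys.argv[1:] == ["--run"]:
    watch(lambda: ping_once(TARGET), reboot)
```

Saved as, say, `net_watchdog.py` (a hypothetical name), it could be started as root with `python3 net_watchdog.py --run` from a systemd unit or an `@reboot` cron entry. Counting consecutive failures, rather than total failures, keeps a single dropped packet from triggering a reboot. Note this only works around the symptom; it does not explain why the Droplet drops off the network.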