Updating network configuration on droplet restored from snapshot?

November 7, 2018 828 views
Networking Ubuntu

So I recently had an issue with networking on a new droplet. I created the droplet from a snapshot, and once it booted I couldn’t connect to it over SSH. I accessed it over the console through the control panel and confirmed it had no network connection whatsoever. After a little digging I discovered that the network interface configuration was still pointing at the previous droplet’s gateway. I checked the networking page for the new droplet, updated the configuration to match, and voilà, it worked.
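For anyone hitting the same symptom, the check described above can be roughly sketched in Python: read the default route the droplet is actually using and compare it by eye with the gateway shown on the droplet's networking page. This is only a sketch. The function names are hypothetical, and it assumes the `ip` tool from iproute2 is available on the droplet.

```python
import subprocess


def parse_default_gateways(route_output):
    """Return (gateway, device) pairs found in `ip route show default` output."""
    gateways = []
    for line in route_output.splitlines():
        fields = line.split()
        if not fields or fields[0] != "default":
            continue
        # Each value follows its keyword, e.g. "via 203.0.113.1 dev eth0".
        gateway = fields[fields.index("via") + 1] if "via" in fields else None
        device = fields[fields.index("dev") + 1] if "dev" in fields else None
        gateways.append((gateway, device))
    return gateways


def read_default_gateways():
    """Run `ip route show default` on the droplet and parse its output."""
    result = subprocess.run(
        ["ip", "route", "show", "default"],
        capture_output=True, text=True, check=True,
    )
    return parse_default_gateways(result.stdout)
```

Calling `read_default_gateways()` from the console session lists the gateway and interface currently in use. If the gateway differs from the one on the networking page, the configuration is probably left over from the old droplet.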

However, while trying to diagnose this I did a little searching (and have done some more since), and I can’t seem to find many people who have had this problem or who had to update this configuration themselves after restoring from a snapshot. It seems like a common operation. In fact, DO themselves have articles on this process (e.g. https://www.digitalocean.com/docs/images/snapshots/how-to/migrate-droplets/), and they don’t mention anything about having to update the networking configuration.

So my question is not really why this happened. I understand what led to it, and it makes sense: I transferred a snapshot that included the old network configuration, so it needed updating. My question is more: how is this not a constant issue for people, to the extent that DO would have to include a note about it in their help articles on this process? I know I didn’t just get unlucky, as all droplets created in that region now seem to use the new gateway. My guess is that this happens every time a subnet fills up and droplets start being created in a new subnet.

So what gives? How is this not a more common issue? Why is it not easier to diagnose / pre-empt?

1 Answer

Hey friend!

Great question. All of our images include cloud-init, which queries our infrastructure for network information to define the configuration at boot time. If cloud-init was removed from a droplet, or damaged in some way, before the snapshot was taken, then you would see the behavior you describe.
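To illustrate the kind of lookup described above, here is a minimal Python sketch that asks a metadata service for the droplet's public network settings. It is not cloud-init's actual code. The endpoint URL and the JSON key names (`interfaces`, `public`, `ipv4`, `ip_address`, `netmask`, `gateway`) are assumptions about the droplet metadata service and should be checked against the current documentation. The function names are hypothetical.

```python
import json
import urllib.request

# Assumed link-local metadata endpoint; verify it against the current docs.
METADATA_URL = "http://169.254.169.254/metadata/v1.json"


def fetch_metadata(url=METADATA_URL, timeout=5):
    """Fetch and decode the droplet metadata document (run on the droplet)."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


def public_ipv4_settings(metadata):
    """Return (address, netmask, gateway) for each public interface.

    The key names below are an assumed layout, not a confirmed schema.
    """
    settings = []
    for interface in metadata.get("interfaces", {}).get("public", []):
        ipv4 = interface.get("ipv4", {})
        settings.append(
            (ipv4.get("ip_address"), ipv4.get("netmask"), ipv4.get("gateway"))
        )
    return settings
```

On a droplet that still has some connectivity to the link-local address, `public_ipv4_settings(fetch_metadata())` would show the gateway the platform expects, which can then be compared with what is configured locally.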

Jarland

  • Ah okay. I can’t tell, because your cloud-init stuff is all .pkl and I haven’t had time to work out how to unpickle them (my Python is rusty). My guess is that it might be the change in network interface naming scheme that came in with systemd v197 (https://www.freedesktop.org/wiki/Software/systemd/PredictableNetworkInterfaceNames/), as the cloud-init scripts specifically target eth0. My question is: do you plan to update this script to guard against this happening in future? For example, by storing the gateway from previous boots and parsing the files in /etc/network/interfaces.d for “gateway” directives that relied on that previous interface, rather than targeting the network interfaces by name.

    It’s not ideal for users to have to hunt down the source of these issues when the setup of a droplet hides how networking is configured, so that they never have to learn how the network interfaces are set up in the first place. If you’re going to define this configuration automatically, that automation should be resilient. Otherwise, you should not do it at all, so that users have control over this.
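The mitigation suggested in this comment, scanning the interface files for “gateway” directives instead of targeting an interface by name, could look roughly like the Python sketch below. It assumes ifupdown-style files (such as those under /etc/network/interfaces.d). The function names and the `expected_gateway` parameter are hypothetical, and nothing here reflects how cloud-init itself works.

```python
from pathlib import Path


def find_gateway_directives(text):
    """Return (interface, gateway) pairs from ifupdown-style configuration text."""
    pairs = []
    current_iface = None
    for raw_line in text.splitlines():
        line = raw_line.strip()
        # Skip blank lines and whole-line comments.
        if not line or line.startswith("#"):
            continue
        fields = line.split()
        if fields[0] == "iface" and len(fields) >= 2:
            # Remember which stanza later options belong to.
            current_iface = fields[1]
        elif fields[0] == "gateway" and len(fields) >= 2:
            pairs.append((current_iface, fields[1]))
    return pairs


def stale_gateways(directory, expected_gateway):
    """List (file, interface, gateway) entries that differ from expected_gateway."""
    stale = []
    for path in sorted(Path(directory).iterdir()):
        if not path.is_file():
            continue
        for iface, gateway in find_gateway_directives(path.read_text()):
            if gateway != expected_gateway:
                stale.append((path.name, iface, gateway))
    return stale
```

A call such as `stale_gateways("/etc/network/interfaces.d", "203.0.113.1")` would report every stanza whose gateway no longer matches, whatever the interface happens to be called.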
