By: spenglert

Creating a Snapshot with the API and identifying it later

January 24, 2017 464 views
API

I want to create a Snapshot from a Droplet and then create a new Droplet from that Snapshot (and delete the old Droplet once I have confirmed that the new one works as expected).

The whole point in doing so is to change the IP address of the Droplet.

Technically this is not a problem, but it involves some horrible looping, and I really can't make it a "while True".

Once I have created the snapshot via /v2/droplets/$DROPLET_ID/actions, the response doesn't tell me how to identify the snapshot once it exists; it only gives me the action ID.

I could of course request /v2/actions/$ACTION_ID until the status is no longer "in-progress", but that alone wouldn't help much, because it still doesn't give me the snapshot ID. So what I do is request /v2/snapshots up to 10 times with a sleep(30) (Python) in between, hoping to identify the snapshot by name.

The drawbacks are obvious. Has anyone done something similar with a better approach?

3 Answers

Why do you need to change the IP address of a Droplet so often?

  • It isn't that often (maybe two times per day). I use the droplet for web scraping (and use the scraped data for scientific purposes - no financial gain, no publishing or selling of the data).

    • If it's for scientific purposes, I recommend getting permission from those sites.

      • Not an option. It's time-based data that I doubt they store. Aggregating that data for me would take them a lot of effort.

        Proxy servers aren't an option either. Their IP addresses are often already blocked.

        Out of curiosity: Do you know a better approach to my problem?

        • The better approach is to take the hint: they don't want you to scrape.

@spenglert

Using a while loop to check the status is most likely going to be the best route, though each action does provide started_at and completed_at times, which could be used to verify when an action started and when it completed.

You could use a date/time comparison to verify completion, though you'd still be using a while loop as a means to keep checking.
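A bounded version of that loop could look roughly like the sketch below. It is only an illustration under a few assumptions: wait_for_action, ActionFailed and the fetch_action callable are hypothetical names, and fetch_action is expected to return the action as a dict with a "status" key (for example the "action" object from GET /v2/actions/$ACTION_ID, fetched with whatever HTTP client you already use). The timeout replaces the "while True" so the loop always ends.

```python
import time


class ActionFailed(Exception):
    """Raised when the action ends in a status other than "completed"."""


def wait_for_action(fetch_action, action_id, timeout=600.0, interval=10.0,
                    sleep=time.sleep, clock=time.monotonic):
    """Poll an action until it completes, fails, or the timeout expires.

    fetch_action is a caller-supplied function (an assumption of this
    sketch) that takes the action ID and returns the action as a dict.
    """
    deadline = clock() + timeout
    while True:
        action = fetch_action(action_id)
        status = action.get("status")
        if status == "completed":
            return action
        if status != "in-progress":
            # Any other status is treated as a failure rather than retried.
            raise ActionFailed(f"action {action_id} ended with status {status!r}")
        if clock() >= deadline:
            raise TimeoutError(f"action {action_id} still in progress after {timeout}s")
        sleep(interval)
```

The sleep and clock parameters exist only so the loop can be exercised without real waiting; in normal use the defaults are enough.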

The only issue to consider here is that if the continuous IP swapping for scraping results in numerous IPs being blocked and added to blacklists, you may risk your account being suspended. This doesn't mean it will be, though it is something I would be cautious of. Since the IPs you're cycling through are reassigned to other users, if they are blocked, either the next user or DigitalOcean has to spend time (and potentially money) to get those IPs cleaned up.

Hey @spenglert,

I had to do something like this for SnapShooter.io

The solution we went for was giving the snapshot a unique name. We start the snapshot and store the action ID. We then keep polling the action via its ID to see if its status changes to completed.

Once we have a completed status, we look through all of the account's snapshots and match up the unique name we gave it. We then know the snapshot's ID for later use.
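The naming and matching steps can be sketched in Python roughly as follows. This is not SnapShooter's actual code; the three helper names are made up for illustration, and the sketch assumes the snapshot list has already been fetched (for example the "snapshots" array returned by /v2/snapshots, across all pages if the account has many) and that each entry carries "id" and "name" keys.

```python
import uuid


def unique_snapshot_name(prefix):
    """Return a name that is very unlikely to collide with an existing one."""
    return f"{prefix}-{uuid.uuid4().hex}"


def snapshot_action_body(name):
    """JSON body for POST /v2/droplets/$DROPLET_ID/actions."""
    return {"type": "snapshot", "name": name}


def find_snapshot_id(snapshots, name):
    """Return the ID of the single snapshot whose name matches exactly.

    snapshots is a list of dicts, assumed to be already fetched by the caller.
    """
    matches = [s["id"] for s in snapshots if s.get("name") == name]
    if len(matches) != 1:
        raise LookupError(f"expected one snapshot named {name!r}, found {len(matches)}")
    return matches[0]
```

Raising when zero or several snapshots match keeps the later "create Droplet from snapshot" step from silently picking the wrong image.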
