How do you handle HA in DigitalOcean?

December 3, 2019 205 views
High Availability

Yes, DO has some great tutorials on HA… but they all focus on HA within a single DO data center.

I may be a bit old school, but building HA within a single data center leaves open the possibility that human error or a natural disaster could cause a failure that affects the entire data center. Are DO users concerned about this, or do you believe DO isolates its assets within a data center well enough that no single failure can take down the whole facility?

Building proxies or load balancers across two data centers without floating IPs that span data centers seems like a significant challenge. I thought I had a solution, but it failed last week, so I’m on the hunt again. Has anybody come up with a viable approach to this challenge?

The situation we ran into last week involved emergency maintenance on a hypervisor that hosted one of our proxies. The hypervisor was brought down in a way that still allowed browsers to connect even though our droplet could not respond. We publish two IP addresses for proxies in two different data centers. Browsers will usually try a second published IP address if they cannot connect to the first, but in this situation they were able to connect and then got no response. A connection failure or connection timeout is handled differently from a response timeout, so the browsers never tried the second IP address. We were able to drop the failing proxy's address from our DNS, but that still resulted in at least 10 minutes of outage while we waited for the TTL to expire. We can shorten the TTL and set up a monitor with a short response timeout, but there really has to be a better solution.
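To make the connect-versus-response distinction concrete, here is a minimal sketch of the kind of monitor described above. It is only an illustration under several assumptions: the proxies answer plain HTTP on the probed port (HTTPS would need an `ssl` wrapper), and the names `probe`, `ProbeResult`, and `addresses_to_publish` are hypothetical helpers, not part of any DO tooling. Actually removing a record from DNS is left out, since that depends on the DNS provider's API.

```python
import socket
from enum import Enum
from typing import Dict, List


class ProbeResult(Enum):
    HEALTHY = "healthy"
    CONNECT_FAILED = "connect_failed"  # browsers normally fall back to the next IP here
    NO_RESPONSE = "no_response"        # connected, then silence: no browser fallback


def probe(host: str, port: int, connect_timeout: float = 2.0,
          response_timeout: float = 2.0, path: str = "/") -> ProbeResult:
    """Probe one proxy, using separate timeouts for connecting and for the reply."""
    try:
        sock = socket.create_connection((host, port), timeout=connect_timeout)
    except OSError:
        # Refused, unreachable, or timed out while connecting.
        return ProbeResult.CONNECT_FAILED
    with sock:
        # From here on the TCP connection exists; only the response can fail.
        sock.settimeout(response_timeout)
        request = f"GET {path} HTTP/1.1\r\nHost: {host}\r\nConnection: close\r\n\r\n"
        try:
            sock.sendall(request.encode("ascii"))
            data = sock.recv(1024)
        except OSError:
            # Includes socket timeouts and connection resets.
            return ProbeResult.NO_RESPONSE
    # An empty or non-HTTP reply is treated the same as no reply at all.
    return ProbeResult.HEALTHY if data.startswith(b"HTTP/") else ProbeResult.NO_RESPONSE


def addresses_to_publish(results: Dict[str, ProbeResult]) -> List[str]:
    """Return the IPs that should stay in DNS, given one probe result per IP."""
    healthy = [ip for ip, result in results.items() if result is ProbeResult.HEALTHY]
    # Assumption: if every proxy looks down, keep publishing all of them
    # rather than publishing nothing (the monitor itself may be at fault).
    return healthy or list(results)
```

A scheduler could call `probe` against each published address every few seconds and push the output of `addresses_to_publish` to the DNS provider whenever it changes. This does not remove the TTL wait; it only shortens the time to detect the failure, so it would have to be paired with a short TTL.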

We would prefer to stick with DO rather than switching to a provider that has virtual IPs across availability centers, so I wanted to ask the question here to see what advice the DO community can provide.

Thanks,

1 Answer

Hey @nusbaum - great question. I think there are multiple ways you can go about setting up multi-region HA, so I’ve asked internally for some more knowledgeable folks to chime in with answers.

But to start, one approach that we’ve taken internally uses Cloudflare and Kubernetes. You can read about it here: https://blog.digitalocean.com/how-we-launched-our-marketplace-using-digitalocean-kubernetes-part-1/
