Can a load balancer help me here?

February 24, 2017 270 views
Clustering Ubuntu 16.04

I'd like to set up, potentially, hundreds of 512 MB droplets to tackle 10,000+ image conversions per hour. Is this a good use-case for DO's new load balancers? If so, what could the basic design/architecture look like?

Our current stack is LAMP on Ubuntu 16.04.

Thank you,

1 Answer

@stevetenuto

Load balancing in general is useful if you need high availability, though whether it's an ideal fit for what you're doing depends on what you actually need.

If you're planning on working with hundreds of Droplets, or potentially more, the only realistic way to go about this is to work with the API, as configuring all of that by hand through the control panel would be very time-consuming.
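
As a rough sketch of what driving this through the API could look like, the snippet below builds request bodies for the Droplet creation endpoint in batches. The region, size, and image slugs, the `image-worker` tag, and the helper names are assumptions for illustration, and the batch size of ten reflects my understanding of the limit on the multi-create `names` field, so check the current API reference before relying on any of it.

```python
import json
import urllib.request

API_URL = "https://api.digitalocean.com/v2/droplets"


def droplet_batches(count, prefix="img", batch_size=10):
    """Yield lists of Droplet names (img01, img02, ...) in batches."""
    names = [f"{prefix}{i:02d}" for i in range(1, count + 1)]
    for start in range(0, len(names), batch_size):
        yield names[start:start + batch_size]


def build_payload(names, region="nyc3", size="512mb",
                  image="ubuntu-16-04-x64", tag="image-worker"):
    """Build one creation request body; all slugs here are assumed values."""
    return {"names": names, "region": region, "size": size,
            "image": image, "tags": [tag]}


def create_droplets(token, payload):
    """POST one batch to the API. Needs a real token; not called below."""
    request = urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json",
                 "Authorization": f"Bearer {token}"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


# Build (but do not send) the requests for 25 workers.
payloads = [build_payload(batch) for batch in droplet_batches(25)]
print(len(payloads), "requests;", "first names:", payloads[0]["names"][:3])
```

Tagging the workers at creation time also makes it easier to manage them as a group later, rather than tracking individual Droplet IDs.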

About Load Balancers

DigitalOcean's Load Balancers accept incoming traffic and route it to the servers that are responding to a health check. By default the health check runs against port 80, though you have the option of using another port, or if an HTTP check isn't applicable, you can use a TCP health check and define the port you wish.
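
To make the idea of a TCP health check concrete, here is a minimal sketch of one: a backend counts as healthy if a TCP connection to the chosen port succeeds within a timeout. This illustrates the concept only and is not how DigitalOcean implements its checks; the function names and the two-second timeout are assumptions.

```python
import socket


def tcp_health_check(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds in time."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


def healthy_backends(backends, port=80, timeout=2.0):
    """Filter a list of hostnames down to those passing the TCP check."""
    return [host for host in backends
            if tcp_health_check(host, port, timeout)]
```

Calling `healthy_backends(["img01.mydomain.com", "img02.mydomain.com"])` would return only the hosts accepting connections on port 80 (the hostnames are placeholders).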

With the default Round Robin option, requests are handed to the servers in turn, in the order you defined/added them, skipping any server that isn't responding to health checks. So if you have 100 Droplets set to work behind a load balancer, each healthy Droplet takes roughly one request in every 100. You also have the option to use "Least Connections", which means that the server with the fewest active connections will be the one to receive the next request.
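
The two selection strategies can be sketched in a few lines. This is a toy model under the usual meaning of the terms (round robin rotates through the healthy servers one request at a time; least connections picks the healthy server with the fewest active connections), not DigitalOcean's implementation, and the class and function names are made up for the example.

```python
class RoundRobin:
    """Rotate through backends in the order given, skipping unhealthy ones."""

    def __init__(self, backends):
        self._backends = list(backends)
        self._next = 0

    def pick(self, healthy):
        # Try each backend at most once, starting from the rotation pointer.
        for _ in range(len(self._backends)):
            candidate = self._backends[self._next]
            self._next = (self._next + 1) % len(self._backends)
            if candidate in healthy:
                return candidate
        raise RuntimeError("no healthy backends")


def least_connections(active, healthy):
    """Pick the healthy backend with the fewest active connections."""
    candidates = {name: count for name, count in active.items()
                  if name in healthy}
    if not candidates:
        raise RuntimeError("no healthy backends")
    return min(candidates, key=candidates.get)


backends = ["img01", "img02", "img03"]
rr = RoundRobin(backends)
print([rr.pick(set(backends)) for _ in range(4)])
print(least_connections({"img01": 10, "img02": 0}, {"img01", "img02"}))
```

For long-running jobs of uneven size, such as image conversions, the least-connections strategy tends to spread work more evenly than strict rotation, since a server stuck on a large job stops receiving new ones.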

Will They Work For You?

They can, but again, it really depends on how you need requests to be processed.

Load Balancers can handle traffic distribution, though how you'd set things up for what you're wanting to do really depends on how the processing is being handled.

Knowing a little more about how you want things to work, or how they work now, would be very helpful in determining whether using a Load Balancer would be ideal, or if there's a better way to manage this.

  • Thank you. We'd definitely be managing the number/size of droplets via API. After doing a bit of research, it'd be a good idea to use a message queue, like Gearman (PHP).

    Instead of relying on Gearman to load balance, could we simply use the DO load balancer's IP address in every message?

    • @stevetenuto

      Here's how a typical LB setup might look (very basic, please keep that in mind). I'll use the domain mydomain.com for the purpose of this example.

      - lb01.mydomain.com (Load Balancer)
      --------------------------------------------------
      -- img01.mydomain.com (Image Processing Server #01)
      -- img02.mydomain.com (Image Processing Server #02)
      -- img03.mydomain.com (Image Processing Server #03)
      -- img04.mydomain.com (Image Processing Server #04)
      -- img05.mydomain.com (Image Processing Server #05)
      -- img06.mydomain.com (Image Processing Server #06)
      

      Incoming Requests would be pointed to lb01.mydomain.com.

      For example, if mydomain.com is your domain, you'd point its DNS record at lb01's IP address, and when a request is received the load balancer will start routing it to the Droplets behind it.

      w/ Round Robin

      Requests are handed out in rotation. If img01 is up and responding, it'll take the first request, img02 will take the next, then img03, and so on through img06 before starting again at img01. Should a Droplet fail a health check, it's skipped and the next one in line takes its turn. This continues until the failed Droplet comes back online or is replaced in the config with a new Droplet. If all Droplets are considered offline, it'd be a complete failure.

      w/ Least Connections

      If img01 is currently handling 10 requests but img02 is handling none, img02 will take the next request, since it's the Droplet currently serving the fewest connections. The choice bounces around based on connection counts: whichever server is handling the fewest will take on the next request. If a server fails to respond, it's skipped, as it would be in Round Robin.

      Sticky Sessions

      If you want the same user making a request to be sent to the same Droplet for a defined period of time (default is 300 seconds, or 5 minutes), you can enable Sticky Sessions. The use-case here would be if you need session persistence or something similar.

      Ideally, if you are concerned about session persistence, you'd be far better off managing sessions using Memcached or Redis, but this is an option.

      What Should You Use?

      If you're familiar with Gearman, you're already using it, and it's capable of routing without any major hurdles, I'd personally use it. This would put configuration in-app as opposed to outside it, which may be more beneficial.

      If Gearman doesn't perform health checks, or can't instantly re-route connections without the dreaded error page (thus requiring a refresh), then load balancing would be a better option.

      In either case, the Droplets would need the same script/setup/config on each of them, so that goes back to deploying with the API and either using a cloud-init script when you do deploy a new Droplet, or using another management tool to ensure that when a server is added to the load balancer, it's capable of handling a request.
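
To illustrate the cloud-init approach mentioned above, here is a minimal sketch that attaches a cloud-config script to a Droplet creation request through the API's `user_data` field. The package list, the restart command, the slugs, and the helper name are assumptions for a LAMP image worker on Ubuntu 16.04; a real setup would also need to deploy the conversion script itself.

```python
# Assumed cloud-config for a worker; adjust packages to your own stack.
USER_DATA = """#cloud-config
package_update: true
packages:
  - apache2
  - php
  - imagemagick
runcmd:
  - systemctl restart apache2
"""


def with_user_data(payload, user_data=USER_DATA):
    """Return a copy of a creation payload with the cloud-init script set."""
    merged = dict(payload)
    merged["user_data"] = user_data
    return merged


# Hypothetical base payload for a single worker Droplet.
base = {"name": "img07", "region": "nyc3", "size": "512mb",
        "image": "ubuntu-16-04-x64"}
request_body = with_user_data(base)
print(sorted(request_body))
```

Because every new Droplet runs the same script on first boot, a worker should only be added to the load balancer once that setup has finished, which the load balancer's health check can confirm.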
