Question

Migrating to a Scalable WordPress Solution

Posted July 11, 2018 6.7k views
Nginx Ubuntu MySQL WordPress Docker Chef Block Storage Terraform

TL;DR: I want to manage and configure multiple LEMP stacks with Terraform and Chef/Ansible to host WordPress sites without losing data when a droplet is destroyed.

Here are the questions that I am looking to have answered:

  • Are Terraform and Chef/Ansible the right direction to move in to help me manage multiple WordPress sites?
  • Is it better to update NGINX configuration by destroying and rebuilding a droplet with Terraform, or to change the file across all servers with Chef/Ansible?
  • If I use Terraform to configure NGINX, PHP, WordPress config files, etc… how do I protect the WordPress database and uploads data so that when a droplet is destroyed by Terraform, the site remains live without interruption?
  • Is Block Storage or GlusterFS a better solution for managing the data of hundreds of different WordPress sites?
  • Is Chef or Ansible better for this purpose?

Here is the context for those questions:
I design and host WordPress websites in my area. For a lack of a deeper understanding of automation, I have been manually deploying the sites on DigitalOcean droplets and configuring them each independently. The only difference between the servers is the WordPress database and Uploads data, which I feel justifies the move to automate deployment through Terraform. I guess what I am looking for is a critical ear to interpret my plan and guide me in the right direction.

The plan is to set up a centralized management server which will host Terraform for infrastructure and Docker automation, Chef (or Ansible) for configuration management, and an OpenVPN server for channeling secure access to the WordPress servers. All servers are built on a LEMP stack and have carefully secured and optimized NGINX configurations.

The goal is to set it up so that when a client requests a website designed for them I can easily spin up a new droplet through Terraform which containerizes the LEMP stack and configures everything (like NGINX files) with Chef or Ansible. I would also like to be able to update server files in one place and have the change propagate across all servers. This is where I could really use some guidance.

Does it make sense to update configuration files (like NGINX or PHP) with Chef/Ansible, or does it make more sense to rebuild the droplets with Terraform so there is no configuration drift? If I do it with Terraform, the next problem is figuring out a way to keep WordPress sites and data live when Terraform destroys a droplet. There are some pretty good tutorials on here about setting up GlusterFS or block storage, so I could potentially host all of the WordPress data on separate servers and use Terraform to manage the processing servers.
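To make the Chef/Ansible side of that concrete, here is a rough sketch of the kind of Ansible playbook I have in mind for pushing a shared NGINX config to every server (the "lemp" group name and the file paths are just placeholders, not my real setup):

    # Sketch only: copy one shared nginx.conf to every droplet in the "lemp"
    # inventory group and reload NGINX only if the file actually changed.
    - hosts: lemp
      become: true
      tasks:
        - name: Push the shared NGINX configuration
          copy:
            src: files/nginx.conf          # placeholder path in my repo
            dest: /etc/nginx/nginx.conf
            owner: root
            group: root
            mode: "0644"
          notify: Reload nginx

      handlers:
        - name: Reload nginx
          service:
            name: nginx
            state: reloaded

Running that against an inventory of droplets would be the "change the file across all servers" path, as opposed to rebuilding the droplets with Terraform.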

I appreciate the community taking the time to read this and any feedback or guidance would be greatly appreciated.


2 answers

(Some mistakes were made in this reply, see comments that follow it)

Hello friend!

That sounds like a lot of fun any way you spin it. I'm going to do my best to answer what I can, and hope that others feel compelled to weigh in as well. My perspective alone won't cover all of the advice that you're looking for, but I still have some thoughts I'd like to share.

My position on Terraform, Chef, and Ansible is “maybe” on all points. I think someone who uses them more intensively than I do might have reasons to lean in a more clear direction. I hope someone will see the opening I’m leaving there and run with it.

What I think I have the most insight on is this:

  • Making individual servers irrelevant, and their destruction non-destructive

Here’s what I’m thinking, just as a rough idea:

  • 2x MySQL with master-master replication behind HAProxy
  • 2x LEMP stacks with data in sync via GlusterFS, both behind HAProxy (or our Load Balancer service)
  • Automation which spins up a new server, adds it to the GlusterFS cluster, connects it to the MySQL HAProxy instance, then adds it to the load balancer serving the LEMP stacks

You can use our block storage for the data stores, but a volume can only be mounted to one droplet at a time. You would end up needing a volume for each stack, with something like GlusterFS keeping the volumes in sync across those droplets. Destroying one droplet would inevitably make its attached volume fall out of sync with the rest, requiring a fresh sync with a new droplet. Unless you need the additional storage, using storage volumes seems to increase complexity for no particular gain.

Now one might imagine having 2x data stores in the cluster and mounting them over the network to multiple servers at once. In my experience, this is more functional in theory than in practice, and I have never personally found a protocol that lives up to how great that could potentially be. The most functional implementation I have had on that path has been a davfs Docker volume.

I hope that at least helps you with some considerations on this :)

Kind Regards,
Jarland

  • Hey Jarland!

    Thanks for taking the time to write such a thoughtful response, I hope it serves to help many others with their questions regarding similar subjects! :D

    I have to admit I’m a bit confused about how the LEMP stacks and GlusterFS play together in this configuration to facilitate, as you really accurately put it, “non-destructive destruction.”

    From what I understand:

    • When a user visits one of the client’s websites, the load balancer serves them from the more available of the two LEMP stacks.
    • The LEMP stacks are then configured in a multi-site setup to serve the WordPress files from the GlusterFS cluster and the more available MySQL database.

    Where I fall short of understanding is how the automated servers play a role in this with GlusterFS and how configuration management for NGINX, PHP, etc would be handled in this environment.

    Do the servers that are deployed through automation each host their own webserver for serving the client’s website through the load balancer, or are they simply nodes which add more storage to the GlusterFS cluster?

    I really appreciate the response you’ve given me. It has put me on a better course than I was on before!

    Cheers,
    Curtis

    • Glad I can at least help some! So about that GlusterFS, here’s kind of a different outline with more focus on that, and a course correction from my previous reply:

      Top level: Load Balancer
      Under the LB:

      • LEMP server with /var/www set up as GlusterFS directory
      • LEMP server with /var/www set up as GlusterFS directory

      Under another LB:

      • MySQL master
      • MySQL secondary master (master-master replication)

      Outside of the LB:

      • Storage server with /sync set up as GlusterFS directory
      • Storage backup server with /sync set up as GlusterFS directory

      The LEMP servers would be connected together with the storage GlusterFS cluster, and then perhaps something like a Chef script would add a new LEMP server into the cluster by pushing your current Nginx configuration to it and then adding it to the GlusterFS storage group. The exact steps for that are a bit relative, but something similar to this:

      https://www.digitalocean.com/community/tutorials/how-to-create-a-redundant-storage-pool-using-glusterfs-on-ubuntu-servers
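
      If you went the Ansible route instead of a Chef script for that step, the mount side of it might look roughly like this sketch (the host names, volume name, and mount options are placeholders, and it assumes the Gluster volume already exists on the storage servers):

        # Sketch: point a freshly created LEMP droplet at the existing Gluster
        # volume hosted on the storage servers (names and paths are placeholders).
        - hosts: new_lemp_server
          become: true
          tasks:
            - name: Install the GlusterFS client
              apt:
                name: glusterfs-client
                state: present
                update_cache: true

            - name: Mount the shared volume at the web root
              mount:
                src: storage01:/wordpress_data
                path: /var/www
                fstype: glusterfs
                opts: defaults,_netdev,backupvolfile-server=storage02
                state: mounted

      The peer probing and volume creation on the storage servers themselves are covered in the tutorial above.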

      Now when I wrote my previous reply, I was thinking in terms of GlusterFS being a multiple-direction sync, not a remote mount with sync. You’ll have to forgive me here, I haven’t used Gluster in some time and my memory failed me for a bit. It is a remote mount. This kind of nullifies my original thought that a block storage volume would only add confusion; it’s not that bad when each LEMP server doesn’t need its own. You were thinking more in the right direction than I was there, apologies. I’m going to make it a point to build a GlusterFS cluster again this weekend as a refresher.

      One key thing to note that my memory does serve me well on: GlusterFS is not fast. I really wouldn’t do geographic redundancy here; Gluster with high latency can be rough. I also would not use it if you expect constant writes and need the sync to have absolutely no delay. A GlusterFS sync should be fine for someone writing a blog post every day with a few pictures, but bad for someone who automates a thousand posts per hour with 1500 pictures added (I’ve seen this, it’s not pretty).

      • I guess what’s really confusing me is whether the 2 LEMP servers you mentioned would be set up as the primary webservers for all of the clients, or whether that was just an example with two clients, and each webserver would be hosted on its own LEMP stack.

        Let’s say there are 50 different websites, each with their own unique content. There would of course be the load-balanced MySQL servers where the WordPress databases are stored for each site. What would the rest of the infrastructure look like in terms of LEMP stacks and storage servers?

        1. Would there be 50 servers which each host their own LEMP stack and serve as a member of the GlusterFS storage cluster?
        2. Or would there be 2 identical LEMP servers, each with 50 server blocks, and a cluster of storage servers which scale to address space needs? (similar to this here)

        In the first example each client/site gets their own dedicated server and the infrastructure is scaled at the individual client level. I’m not exactly sure how storage works in this configuration.

        In the second example there are two primary webservers which handle all of the requests and a cluster of GlusterFS nodes (or two servers with redundant block storage attached) which store the data. In this configuration the infrastructure is divided into compute servers and storage servers which are scaled based on the overall usage of all clients.

        Both of them have their benefits, but I’m curious which one you are referring to.

        Again, thank you so much for the help you have already given me and I would love to hear about how setting up GlusterFS turns out for you :D

        Cheers,
        Curtis

Have you considered Docker at all? It’s pretty awesome and solves your specific use case pretty nicely. I also host quite a few WordPress sites plus a number of more complex WordPress installs that utilize things like decoupled React frontends via the REST API. We also manage hosting for Fortune 500 clients, and having a site go down or losing data is simply not an option. It could cost some of our clients hundreds of thousands if not millions of dollars if their site went down for a day or we lost any data. We set up WordPress to operate as closely to a 12-factor application as possible (https://12factor.net).

We also utilize Docker in our development, which allows us to have 100% parity between our development and production environments. When adding things like plugins/themes we always add them to the filesystem locally and build them into the Docker image rather than uploading via wp-admin. Plugin and theme updates and additions are all handled in our local dev environment and fully tested before pushing to staging and eventually production. We generally disable plugin/theme install and updating inside wp-admin as well.

Images, videos, etc. are offloaded to DigitalOcean’s new Spaces object storage using the iLab Media Cloud plugin. Historically we always used Amazon S3 for this, but that can get pretty expensive. Spaces plus a good CDN like imgix, or DO’s upcoming Spaces CDN functionality, is all you need to make sure uploads aren’t lost as a result of a server or container going down, and you get the added benefit of globally available, super fast image loading.

We currently utilize a set of external MySQL servers set up with master/master replication behind a load balancer, just as Jarland describes above. We also have an internal project that uses a MariaDB cluster within the Docker environment, but we aren’t 100% keen on running production databases within Docker just yet.

Rancher has been our Docker orchestration platform of choice up to this point. It’s a pretty incredible system that handles a lot of the tedious DevOps stuff and is really easy to set up and use. You get the ability to add custom application stacks or templates to the catalog that can be launched quickly by developers without DevOps knowledge. Rancher also has a full REST API which allows you to literally automate everything.

DigitalOcean’s upcoming Kubernetes release will make Docker container management even easier and will be able to fulfill your requirements out of the box, as it will integrate DO’s object storage (Spaces), load balancers, DNS, etc. for a complete managed experience. You’ll simply launch your apps via Kubernetes YAML files and they will handle the management of the Kubernetes cluster. Much easier than messing with Chef, Ansible, GlusterFS, etc. You also get the added benefit of being able to quickly launch highly available apps with Let’s Encrypt SSL certs, monitoring, and more across $5 DO droplets and autoscale as traffic/server load increases. You can literally have a $15 three-node cluster with this method. The ability to both automate everything and manage your costs more granularly are the driving factors behind our agency moving to Docker almost exclusively.
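
Just to give a feel for what “launching your apps via Kubernetes YAML files” means in practice, a bare-bones manifest for a single site might look something like this (the names, image tag, DB host, and secret are placeholders, and it leaves out persistent storage, ingress, and TLS entirely):

    # Minimal illustration: one WordPress Deployment plus a Service in front of
    # it. The image tag, names, DB host, and secret below are placeholders.
    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: client-site
    spec:
      replicas: 2
      selector:
        matchLabels:
          app: client-site
      template:
        metadata:
          labels:
            app: client-site
        spec:
          containers:
            - name: wordpress
              image: wordpress:latest        # placeholder; pin a real tag
              ports:
                - containerPort: 80
              env:
                - name: WORDPRESS_DB_HOST
                  value: external-mysql.example.com   # placeholder external DB
                - name: WORDPRESS_DB_PASSWORD
                  valueFrom:
                    secretKeyRef:
                      name: client-site-db            # placeholder secret
                      key: password
    ---
    apiVersion: v1
    kind: Service
    metadata:
      name: client-site
    spec:
      type: LoadBalancer        # the managed platform provisions a cloud LB here
      selector:
        app: client-site
      ports:
        - port: 80
          targetPort: 80

Obviously stripped down, but that is the shape of the YAML the managed cluster would consume.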

  • Hey blakeevanmoor!

    Thank you for your response.

    I didn’t consider Docker for configuration management, but after you mentioned it I researched it a bit more and it looks like a promising way to at least add a better level of security and assurance to the droplet management process.

    Configuration automation was seeming like the route to go for managing hundreds of different WordPress sites, but the more I think about it, you can’t really do server-wide configuration updates for NGINX, PHP, or WordPress, because each site has its own unique plugins/themes which depend on certain versions of PHP or WordPress to function properly. If there is some way to get around this, by something like containerizing NGINX and PHP separately, I would be super excited to hear it :D

    It’s making more sense to use Terraform to manage the deployment of droplets provisioned with firewall rules, user credentials, Docker, and a LEMP stack, and then use Docker to containerize the LEMP stack for each individual WordPress site.

    In this configuration each website would have its own droplet with a containerized LEMP stack, where MySQL is served by load-balanced master-master MariaDB servers. I would still have to update PHP/NGINX configurations on a per-site basis, but at least the server setup is automated through Terraform and there is 100% parity between environments to minimize the risk of downtime.

    I’m not sure this is the most optimal and secure way to do it, but my limited understanding of Docker is the bottleneck limiting my ability to find a highly available, redundant, and easily scalable version of this.

    • Apologies for not seeing your response sooner. Now, more than a year later, I’m sure you’ve managed to get things worked out, but I wanted to respond anyway, even if only to provide a future visitor additional information. Terraform is absolutely a great solution to the problem as well. We’ve only recently (late 2019) started to trust K8s to handle more stateful workloads, but we still manage our WordPress databases and media uploads using externally managed services (AWS Aurora DB + S3 buckets and/or DO Spaces and DO Managed Databases, depending on the client).

      I don’t think Docker is really the bottleneck if used properly. You can absolutely have a granular Docker setup where PHP and Nginx are separate instances, even separate from the WP container, and every site can have completely different PHP and Nginx versions and configurations (a stripped-down compose sketch of what I mean is at the end of this comment). You can also use things like Helm charts and Kubernetes Secrets to automate configuration much the same way you can with Terraform, though you are orchestrating containers rather than VPS instances. We also leverage Rancher so we can manage multi-cluster applications across various cloud providers easily, while also gaining things like node and cluster templates, really granular user controls, and the ability to automate deployment very similar to how Terraform handles it.

      Either way, as long as your goals for uptime, ease of management, and cost are met, it really doesn’t matter what path you take. Lots of valid ways to run and orchestrate WP sites :)
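
      Here is that stripped-down compose sketch for a single site with Nginx and PHP-FPM split into separate containers (the image tags, credentials, and vhost path are placeholders, and the external database is assumed to already exist):

        # Illustration only: Nginx and the WordPress PHP-FPM runtime as separate
        # containers sharing the code via a named volume. Values are placeholders.
        version: "3.7"
        services:
          wordpress:
            image: wordpress:fpm            # PHP-FPM variant, no web server inside
            environment:
              WORDPRESS_DB_HOST: external-mysql.example.com
              WORDPRESS_DB_NAME: client_site
              WORDPRESS_DB_USER: client_site
              WORDPRESS_DB_PASSWORD: change-me
            volumes:
              - wp_code:/var/www/html

          nginx:
            image: nginx:stable
            ports:
              - "80:80"
            volumes:
              - wp_code:/var/www/html:ro
              # per-site vhost that proxies PHP to fastcgi_pass wordpress:9000
              - ./nginx/site.conf:/etc/nginx/conf.d/default.conf:ro
            depends_on:
              - wordpress

        volumes:
          wp_code:

      Each site can pin its own PHP version via the image tag and carry its own Nginx vhost file, which is how you avoid server-wide config changes breaking individual sites.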

  • Hello! This reply is many months behind the topic, however your comment is exactly what I’m looking for.
    Would you care to share the Kubernetes deployment file with me? Albeit with the secrets etc. removed?
    Do you not use an NFS share at all to keep the webservers in sync? Do all webservers point to the DO Space?
    What happens with the uploaded files? Surely they land on a non-persistent volume first before being uploaded to Spaces?

    • Sorry for the extremely late reply. The DO Community forums don’t seem to be emailing me notices even with “Notify me of replies” checked. I’m actually in the process of a complete overhaul of our Kubernetes infrastructure, as so much has changed over the last year. In the past, we simply leveraged sticky sessions at the load balancer so that when working in wp-admin you’d be targeting a single container instance with that session.

      Because we only use plugins and themes baked into our custom WP Docker image, and we leverage an externally managed database cluster, we only have to worry about media uploads in terms of storage. For that we leveraged the s3-uploads plugin by Humanmade (there are other great ones like Delicious Brains’ Offload Media plugin) to push files to AWS S3 and then delete the local file from the container. This, unfortunately, does sometimes result in weird issues where a file will fail to upload to S3 and fall back to the local upload, and because it’s a clustered environment that file is only available when a user happens to hit the node containing it. While super rare, it’s definitely been a headache and led us to explore other options like shared storage. We’ve tested a half dozen or so options and narrowed our choices down to 3 solutions: StorageOS (free up to a 500GB storage pool), an S3FS sidecar for mounting a shared DO Space/S3 bucket into each WP container, or a Rook Ceph setup.

      Each solution has its strengths and weaknesses. StorageOS is only free to a point and then costs roughly $30/TB of storage from what I’ve heard. For me, that seems reasonable, and it may end up being the way we go, as performance so far seems great and setup was super easy. The S3FS sidecar is great from a cost and ease-of-use perspective. Being able to upload directly to a DO Space/S3 bucket without the middle-man upload to the worker node’s filesystem seems ideal in terms of reducing complexity, and it’s free OSS that’s pretty easy to implement and pretty much a configure-once-and-forget-about-it system when set up properly. Unfortunately, it seems to be pretty slow, especially when trying to upload a lot of media files via the media library in wp-admin. I’d thought skipping the upload to the local disk would improve times, and maybe it does to a point… but the perceived experience is slower. My best guess is that when uploading to local disk first with a plugin, the media library’s progress indicator is based on that initial upload, not the final background upload to the S3 bucket. So while the overall time to get the file to the destination S3 bucket may be quicker with the S3FS-mounted filesystem, it will appear slower, since direct-to-local-disk is faster than s3fs. The last option, Rook Ceph, seems to be a nice balance of power, performance, and cost (free), but we’ve had some issues getting it to function reliably. Weird issues just kept popping up, and our ultimate goal with our infrastructure is automation and reduced complexity.
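
      For what it’s worth, with the StorageOS or Rook Ceph style options the WordPress pods would just consume the result as a volume claim mounted at the uploads path, roughly like this (the storage class name, access mode, and size are placeholders and depend entirely on the backend you pick):

        # Rough idea only: the WP pods reference a claim like this and mount it
        # at wp-content/uploads; class and access mode vary by backend.
        apiVersion: v1
        kind: PersistentVolumeClaim
        metadata:
          name: wp-uploads
        spec:
          accessModes:
            - ReadWriteMany          # needed if every WP replica mounts it
          storageClassName: shared-storage   # placeholder class name
          resources:
            requests:
              storage: 50Gi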

      I’m thinking StorageOS is what we’ll ultimately end up using, in addition to our own custom drop-in S3 offload plugin or our existing S3-Uploads setup. Additionally, we are working on a set of custom Helm charts (production + development) and a custom lightweight caching plugin that doesn’t actually handle caching itself but instead intelligently clears the various caches (Cloudflare, Nginx, Redis), rather than requiring 3 separate plugins to handle that.

      Once we get everything dialed in we plan on sharing a public repo that includes the Helm files and some of our custom plugins. Unfortunately, because much of our work is built to solve things specific to our setup, not everything can be open-sourced, but we fully intend to put something out that will help others get a powerful WP setup going on Kubernetes while simplifying much of the complex stuff like storage and caching at the cluster or container level. I hope to have something ready to share at some point in Q1 of 2020 :)
