Tim Yocum, Lead Operations Engineer
It was early in 2014 when we noticed regular queries coming into support about DigitalOcean. Not just the "When will you support DigitalOcean?" from prospective customers, but "What performance can we expect if we use DigitalOcean with Compose?" (or, as it was before we changed the name, MongoHQ). With DigitalOcean's reputation and mindshare continuing to rise, we set out to answer those questions.
Although we could expect reasonable performance with the existing configuration, Compose is in the business of excellent performance, so we knew the way forward was to bring our auto-scaling MongoDB platform to the DigitalOcean cloud. The first phase was ensuring that the platform was strong enough to support the loads Compose places on it, so we set out to test some Droplets. DigitalOcean arranged for us to test in their Singapore data center, which was then in the process of rolling out, as this assured us access to their latest hypervisor code – that was the only exception to us otherwise running default Droplets.
We were already talking directly with the system administrators at DigitalOcean, the kind of access that is usually very hard to get with cloud vendors. That meant we could compare notes on testing regimes, and we found that not only were they running many of the same tools as us, they actively wanted us to beat up the instances. They asked smart questions too, and pointed us in the right direction to make the tests really push their platform.
Working with the DigitalOcean team was a highlight of this deployment: they were open to doing things our way, and they brought a depth of knowledge to the table that let us ensure, as we planned out the deployment, that we would be running on a diverse range of hardware.
We did our testing with fio to beat up the I/O system. We have scripts that scale I/O by the number of cores, so we hit it hard, and we made sure our I/O testing accounts for caching and other commonly overlooked behavior of a modern Linux system that can cause generic throughput tests to produce inaccurate results. We back up our raw hardware testing with simulated real-world load tests using YCSB. Because we were in touch with the system admins, we could let them know when we were going to really hit a system hard, and they gave us feedback on what they were seeing. Beyond that, once we'd completed our testing we passed the logs over to DigitalOcean's team, who ran the numbers and addressed the few times we'd noticed a sag in performance.
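As a rough illustration of that kind of fio run – a minimal sketch, not our actual test scripts – the snippet below does mixed random reads and writes with one job per core, and uses `--direct=1` to bypass the Linux page cache so the numbers reflect the disk rather than caching. It builds and prints the command instead of executing it, since fio may not be installed where you try this:

```shell
#!/bin/sh
# Sketch only: the job name and sizes are illustrative values,
# not Compose's real test configuration.
NUMJOBS=$(nproc 2>/dev/null || echo 4)   # scale the job count by core count

# --direct=1 bypasses the page cache so throughput reflects the disk itself;
# --rw=randrw mixes random reads and writes at a 4k block size.
FIO_CMD="fio --name=io-test --rw=randrw --bs=4k --ioengine=libaio \
--iodepth=32 --direct=1 --size=1g --runtime=60 --time_based \
--numjobs=$NUMJOBS --group_reporting"

echo "$FIO_CMD"
```

Dropping the `echo` (and having fio installed) would run the test for real; `--group_reporting` aggregates the per-job results into one summary.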
We were happy with the numbers from testing, and the hypervisor code we'd been running with was already rolled out throughout the DigitalOcean cloud. That meant it was time to move on to production deployments.
March 26th, DigitalOcean New York. We provisioned a number of large Droplets at the New York presence, set up our SSH keys, and let our Chef serve up our auto-scaling MongoDB platform. For this deployment we only needed the DigitalOcean UI to provision our Droplets, though with the version 2 release of DigitalOcean's API we can easily integrate provisioning into our deployment systems. The new API is modern, very capable, and simpler than DigitalOcean's competition. We didn't tackle that integration first because the web UI is quite pleasant to use – only a few clicks to bring up systems perfectly tailored to our needs.
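For the curious, provisioning through version 2 of the API looks roughly like the sketch below. This is a hypothetical example, not our deployment code: the token variable, Droplet name, and the region/size/image slugs are illustrative placeholders. The script prints the request rather than sending it, so it runs without credentials:

```shell
#!/bin/sh
# Hypothetical sketch of creating a Droplet via DigitalOcean's v2 API.
# DO_API_TOKEN, the Droplet name, and the slugs below are placeholder values.
TOKEN="${DO_API_TOKEN:-your-api-token}"
PAYLOAD='{"name":"db-node-01","region":"nyc3","size":"8gb","image":"ubuntu-14-04-x64"}'

# Build the request; we echo it instead of invoking curl so the sketch
# is safe to run without an account.
CURL_CMD="curl -X POST https://api.digitalocean.com/v2/droplets \
-H 'Authorization: Bearer $TOKEN' \
-H 'Content-Type: application/json' \
-d '$PAYLOAD'"

echo "$CURL_CMD"
```

Swapping the `echo` for direct execution (with a real token) submits the create request; the API answers with a JSON description of the new Droplet.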
Back at the deployment, it all went smoothly. Once our Chef had done its work, we were ready to take on more internal testing and our first alpha users. We began stressing the system to make sure it would be ready for a beta load, and that came a month later.
April 30 was when we went public with the beta announcement of our auto-scaling MongoDB databases. We opened our doors to paying customers who wanted automatic backups and oplog access for their DigitalOcean-hosted applications. (Oplog access is a MongoDB feature that lets users tap into the database's replication stream. Web frameworks like Meteor.js use the oplog to produce horizontally scaled, super-responsive applications based around a single MongoDB server installation.)
The beta period went really well, with lots of user take-up, as you can see in the graph, and no problems. After nearly two months, we were ready for the next step, and on June 25, we moved into full production mode.
Over the summer we continued to grow our user base in New York. There was one brief stumble when one of the systems started to lock up on boot when a bridge interface was configured, spiking the CPU. That was a problem no one had seen before, and it was isolated to a single system. We got the full details of the platform issue from a systems engineer at DigitalOcean on a conference call, and the system was back in action once we identified what was triggering the spike. The brief episode didn't slow down our growth, and pretty soon a new question was being asked: "When can we have your MongoDB service in our local DigitalOcean data centers?"
Requests for San Francisco and DigitalOcean's newly opened London data center had started coming in, and the metrics we use to select new sites – chiefly, which would be most effective at getting Compose to as many users as possible – were tied. After a poll of users on our blog, and thanks to DigitalOcean, we were able to open in both London and San Francisco. The deployment story was the same as our New York experience – simple and quick – and by August 12, we were ready to announce the new locations. But we weren't finished…
By the start of September, it had become clear that it was time to add more nodes to cater to the expanding user base in New York. Again, the DigitalOcean user interface made the process simple and quick. We decided to double our capacity in New York, as we anticipate continued rapid growth as DigitalOcean customers discover that, by using Compose's database platform, they can focus on their applications rather than on database administration.
Talk to an expert for assistance with large deployments, migration planning, or questions regarding a proof of concept.