Iouri Kostine, Partner
Last year, our team at The Able Few was put on a mission to find a hosting provider that would allow us to keep pushing boundaries for our clients. We needed to store 1.5 billion documents while allowing for growth of around 2 million documents per day. That amounts to 6TB of data, growing by the minute, all of which must be retrievable in under half a second. In this particular case we were tracking social media, consuming messages from the Twitter firehose, Facebook posts and various other providers. Handling massive spikes in volume (imagine the Super Bowl or the Grammys, for example) meant that we needed to be able to instantly add storage and processing power to our cluster.
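To put those figures in perspective, here is a rough back-of-the-envelope sketch. It assumes the 6TB figure is decimal terabytes and covers the 1.5 billion documents evenly; the constant and function names are purely illustrative and not part of our actual tooling.

```python
# Figures quoted in the text; the even-size assumption is ours.
TOTAL_DOCS = 1_500_000_000      # documents stored
TOTAL_BYTES = 6 * 10**12        # 6 TB, assuming decimal terabytes
DOCS_PER_DAY = 2_000_000        # approximate daily growth


def avg_doc_bytes() -> int:
    """Average size of one document, assuming sizes are roughly uniform."""
    return TOTAL_BYTES // TOTAL_DOCS


def daily_growth_bytes() -> int:
    """Bytes added per day at the quoted growth rate."""
    return DOCS_PER_DAY * avg_doc_bytes()


def projected_bytes(days: int) -> int:
    """Total bytes after `days` of steady growth, with nothing expired."""
    return TOTAL_BYTES + days * daily_growth_bytes()


if __name__ == "__main__":
    print(f"average document: {avg_doc_bytes()} bytes")
    print(f"daily growth:     {daily_growth_bytes() / 10**9:.0f} GB")
    print(f"after 365 days:   {projected_bytes(365) / 10**12:.2f} TB")
```

Under these assumptions each document averages about 4KB and the data set grows by roughly 8GB per day, before counting replicas or index overhead.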
At that time we were using Rackspace. Our costs were astronomical, and each new node required a lengthy contract. Because of our storage requirements, their cloud offering was not an option, and the 10k rpm hard drives were unable to keep up with the massive amount of data we were processing and analyzing.
Our initial test at DigitalOcean was a five-node cluster running Storm and Elasticsearch. The results were amazing. The test cluster outperformed our twelve-node cluster at Rackspace and was able to process 23 times as many messages per second because of the I/O performance of SSDs. Not only did we see a bump in performance, but we also had more storage available per node than at Rackspace, at a third of the cost. Switching to DigitalOcean allowed us to keep up with the pace of changing technology, because we could spin up droplets as needed. This was great news for our client and gave us the flexibility to explore technology that would otherwise have been cost-prohibitive and slow to implement.
With such impressive results, Fizziology (our client) and The Able Few launched what we call Centrifuge at DigitalOcean. We maintain a rolling window of 24 months of raw data, which is indexed and processed in real time. This data is the lifeblood of Fizziology, so our primary concerns are data redundancy and failover. What happens if a droplet fails? What happens if five droplets fail? How do we scale our cluster to keep up with the steady increase in social media usage?
If we have issues with a droplet or need to take a few droplets down for maintenance, the cluster repairs itself by moving replicas to the remaining droplets almost instantaneously. Data replicas are distributed across our cluster so that every message exists on at least two droplets at any given time. New nodes can be added and join the cluster within minutes using DigitalOcean Images via the API.
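The idea of keeping every message on at least two droplets, and repairing after a failure, can be illustrated with a small toy model. This is only a sketch under our own assumptions: it is not Elasticsearch's actual shard allocator, and the function names, droplet names and round-robin placement are hypothetical.

```python
def place_shards(shard_ids, nodes, copies=2):
    """Assign each shard to `copies` distinct nodes, round-robin."""
    if copies > len(nodes):
        raise ValueError("need at least as many nodes as copies")
    placement = {}
    for i, shard in enumerate(shard_ids):
        placement[shard] = {nodes[(i + k) % len(nodes)] for k in range(copies)}
    return placement


def handle_failure(placement, failed, nodes, copies=2):
    """Drop failed nodes, then re-replicate onto the least-loaded survivors."""
    failed = set(failed)
    survivors = [n for n in nodes if n not in failed]
    if copies > len(survivors):
        raise ValueError("not enough surviving nodes to restore all copies")

    load = {n: 0 for n in survivors}
    repaired = {}
    for shard, holders in placement.items():
        alive = holders - failed
        if not alive:
            # Every copy lived on a failed node: this toy model cannot recover it.
            raise RuntimeError(f"shard {shard} lost all copies")
        repaired[shard] = set(alive)
        for n in alive:
            load[n] += 1

    for shard, holders in repaired.items():
        while len(holders) < copies:
            # Pick the least-loaded survivor that does not already hold this shard.
            target = min((n for n in survivors if n not in holders),
                         key=lambda n: load[n])
            holders.add(target)
            load[target] += 1
    return repaired


if __name__ == "__main__":
    droplets = ["d1", "d2", "d3", "d4", "d5"]
    before = place_shards(range(10), droplets)
    after = handle_failure(before, {"d3"}, droplets)
    for shard in sorted(after):
        print(shard, sorted(after[shard]))
```

In this model, losing one droplet leaves every shard with a surviving copy, so the second copy can be rebuilt elsewhere. Losing both holders of a shard at once raises an error, which is why the number of copies matters when planning for multiple simultaneous failures.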
We use a host of open-source and custom technology to give Fizziology a backbone that supports the impressive skill set they bring to the entertainment industry.
Our vision at The Able Few is to push ourselves to create products that seem impossible. It's one of the things we take great pride in, and it allows us to keep learning and growing as individuals and as a company. We take our clients' challenges and treat them as if they were our own. That often means being budget-conscious and optimizing the performance of technology to its maximum potential, tenets that DigitalOcean's offerings align with perfectly. The team at DigitalOcean is responsive and seems to genuinely care about the challenges we face. DigitalOcean is providing us with infrastructure from which we can build dreams and ideas.