At DigitalOcean, we’ve been rapidly adding new products and features on our mission to simplify cloud computing, and today we're happy to announce our latest enhancement.
Over the first half of 2018, we've improved performance for Block Storage Volumes with backend upgrades that reduce cluster latency by 50%, and we've added burst support for spiky workloads that need short periods of higher performance.
Block Storage Volumes have a wide variety of use cases, like database reads and writes as well as storing logs, static assets, backups, and more. The performance expectations from a particular volume will depend on how it's used.
Database workloads, for example, need single-digit millisecond latency. Most workloads in the cloud today are bursty, however, and don't require sustained high performance at all times. Use cases like web servers, backups, and data warehousing may only need higher performance during short traffic spikes or while temporarily moving large amounts of data.
To meet the need for very low latency, we upgraded Ceph to its latest version, Luminous v12.2.2, in all regions containing Block Storage. This reduced our cluster latency by 50% and provides the infrastructure you need to manage databases with Block Storage Volumes.
To support spiky workloads, we added burst support, which automatically increases Block Storage Volumes' IOPS and bandwidth rates for short periods of time (60 seconds) before returning to baseline performance to cool off (60 seconds).
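To make the burst behavior concrete, here's a minimal Python sketch of how a burst/cool-off cycle could be modeled. The constants, function name, and the assumption that the cycle repeats under sustained load are all illustrative; this is a conceptual model, not DigitalOcean's actual QoS implementation.

```python
# Conceptual model of burst support (illustrative only, not the real QoS code):
# under sustained load, a volume runs at its burst rate for up to
# BURST_SECONDS, then drops back to baseline for COOLOFF_SECONDS.
BURST_SECONDS = 60
COOLOFF_SECONDS = 60

def allowed_rate(elapsed_seconds, baseline_iops, burst_iops):
    """Permitted IOPS at a given point in a sustained, repeating load."""
    position_in_cycle = elapsed_seconds % (BURST_SECONDS + COOLOFF_SECONDS)
    if position_in_cycle < BURST_SECONDS:
        return burst_iops   # still within the burst window
    return baseline_iops    # cooling off at baseline performance
```

In this model, a workload that only spikes occasionally effectively always runs at the burst rate, because each spike fits inside a burst window.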
Here's a summary of the burst performance characteristics, which compares a Standard Droplet (SD) plan and an Optimized Droplet (OD) plan:
We don't scale performance with the size of the volume you create; every Block Storage Volume is configured to provide the same level of performance for your applications. However, your application needs to be designed to take advantage of these limits, and the performance you actually see will depend on your app's configuration and a number of other parameters.
To learn more about the performance you're getting, we wrote How To Benchmark DigitalOcean Volumes, which explains not only how to benchmark your volumes but also how to interpret the results.
We then ran some of these tests internally to share the numbers and performance of our offering. You can find all the details in the tutorial, but here's a sample of results, which shows typical performance based on the queue depth (QD) of the application and the block size (on the x-axis) versus IOPS (on the y-axis).
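If you'd like to run a similar test yourself, a tool like fio can drive this kind of workload. Here's a hypothetical fio job file for a 4K random-read test at queue depth 4; the filename and mount point are placeholders for your own volume's mount path:

```ini
; Hypothetical fio job file: 4K random reads at queue depth 4.
; /mnt/myvolume is a placeholder for your volume's mount point.
[global]
ioengine=libaio
direct=1
time_based
runtime=60

[randread-4k-qd4]
filename=/mnt/myvolume/fio-test
rw=randread
bs=4k
iodepth=4
size=1G
```

Varying `bs` and `iodepth` across runs reproduces the axes shown in the graphs: block size against achieved IOPS at each queue depth.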
These graphs show that the IOPS rate increases as queue depth increases until we hit our practical IOPS cap. Smaller block sizes tend to be IOPS limited, while larger block sizes tend to be bandwidth limited.
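The relationship between the two limits is simple arithmetic: achieved throughput equals IOPS times block size, so whichever cap a workload hits first is the one that governs it. The caps in this sketch are hypothetical placeholders, not published DigitalOcean figures:

```python
# Illustrative arithmetic: small blocks hit the IOPS cap first,
# large blocks hit the bandwidth cap first.
# Both caps below are hypothetical, not actual product limits.
IOPS_CAP = 5_000            # hypothetical IOPS ceiling
BANDWIDTH_CAP = 200e6       # hypothetical bandwidth ceiling, bytes/sec

def effective_iops(block_size_bytes):
    """IOPS achievable at a given block size under both caps."""
    return min(IOPS_CAP, BANDWIDTH_CAP / block_size_bytes)

for bs_kib in (4, 16, 64, 256):
    bs = bs_kib * 1024
    limit = "IOPS-limited" if effective_iops(bs) == IOPS_CAP else "bandwidth-limited"
    print(f"{bs_kib:>4} KiB: {effective_iops(bs):>8.0f} IOPS ({limit})")
```

With these example caps, 4K and 16K blocks are held at the IOPS ceiling, while 64K and larger blocks run out of bandwidth before reaching it.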
What about latency? Most real-world customer applications won't run the same kind of workload often used as a baseline (QD = 1 4K I/O), so these graphs show latency in µsec (or microseconds) as we add load to the cluster.
We see the same behavior in reads and writes. Because of how the backend stores data, our results show that 16K block sizes achieve better latency at high queue depths, so we recommend tuning for 16K workloads where possible.
The performance improvements aren’t the only thing we have in store. There are several QoS features and infrastructure investments in the pipeline to further improve your experience with Block Storage Volumes. (Ready to get started? Create a Volume now.)
We'd love to hear your thoughts, questions, and feedback. Feel free to leave a comment here or reach out to us through our UserVoice.