DigitalOcean Monitoring uses a variety of metrics to track system health. We will go through the different resources, the units used to measure them, and the way they can be used by DigitalOcean Monitoring.
CPU utilization measures how much of the processor's capacity is in use at a given time. It is expressed as a percentage.
On DigitalOcean, 100% represents full use of all processors combined. This differs from some CPU usage tools, which report 100% per CPU or core. For example, other tools might express metrics out of 200% on a machine with two CPUs, or 400% for a quad-core processor.
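To make the difference between the two scales concrete, here is a minimal sketch that converts a per-core style reading to the combined 0-100% scale described above. The function name and arguments are hypothetical and are not part of any DigitalOcean tool.

```python
def normalize_cpu_percent(per_core_total: float, core_count: int) -> float:
    """Convert a summed per-core percentage to a 0-100 scale.

    per_core_total is a reading such as 150 (out of 200) on a two-core
    machine. The result is 100 only when every core is fully busy.
    """
    if core_count <= 0:
        raise ValueError("core_count must be positive")
    return per_core_total / core_count


print(normalize_cpu_percent(150.0, 2))  # 75.0
print(normalize_cpu_percent(400.0, 4))  # 100.0
```

A reading of 150% from a per-core tool on a two-CPU machine therefore corresponds to 75% on the combined scale.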
In the Droplet graphs, CPU usage is broken down according to Linux’s notions of system time and user time. System time is time spent executing kernel-level instructions, while user time is time spent executing “userland” instructions, meaning anything that runs outside of the kernel.
Alert policies do not distinguish between user and system time.
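The split between user and system time can be illustrated with the aggregate "cpu" line that Linux exposes in /proc/stat. The sketch below is an assumption about how such a breakdown could be computed, not a description of DigitalOcean's agent: it counts "nice" time as user time, leaves interrupt and steal time out of both categories, and uses a hypothetical function name. The "combined" value is the single figure that a policy ignoring the user/system split would compare against.

```python
def cpu_breakdown(before: str, after: str) -> dict:
    """Return user, system, and combined CPU percentages between two
    samples of the aggregate 'cpu' line from /proc/stat."""

    def counters(line: str) -> list:
        # Skip the leading "cpu" label and keep the first eight counters:
        # user, nice, system, idle, iowait, irq, softirq, steal.
        return [int(value) for value in line.split()[1:9]]

    deltas = [b - a for a, b in zip(counters(before), counters(after))]
    elapsed = sum(deltas)
    if elapsed <= 0:
        raise ValueError("samples must be taken at different times")

    # Assumption: "nice" time is reported as part of user time.
    user = 100.0 * (deltas[0] + deltas[1]) / elapsed
    system = 100.0 * deltas[2] / elapsed
    return {"user": user, "system": system, "combined": user + system}


first = "cpu 100 0 50 850 0 0 0 0"
second = "cpu 160 0 70 970 0 0 0 0"
print(cpu_breakdown(first, second))
# {'user': 30.0, 'system': 10.0, 'combined': 40.0}
```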
Memory utilization is a measurement of the memory being consumed on the server. This is expressed as a percentage of the total available physical memory.
DigitalOcean calculates memory consumption by evaluating memory information exposed in /proc/meminfo. Memory usage is calculated by subtracting free memory and memory used for caching from the total memory amount.
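The calculation above can be sketched as follows. This is an illustration under stated assumptions, not DigitalOcean's actual code: it treats the Cached field as the "memory used for caching" (whether other fields such as Buffers are also subtracted is not stated here), and the function name is hypothetical. The function takes the file contents as text so it can be tried without a Linux host.

```python
def memory_used_percent(meminfo_text: str) -> float:
    """Compute used memory as a percentage of total physical memory
    from text in the format of /proc/meminfo."""
    values = {}
    for line in meminfo_text.splitlines():
        key, _, rest = line.partition(":")
        parts = rest.split()
        if parts:
            values[key.strip()] = int(parts[0])  # values are in kB

    total = values["MemTotal"]
    free = values["MemFree"]
    cached = values.get("Cached", 0)  # assumption: only Cached counts as cache
    used = total - free - cached
    return 100.0 * used / total


SAMPLE = """MemTotal:        1000000 kB
MemFree:          200000 kB
Cached:           300000 kB
"""
print(memory_used_percent(SAMPLE))  # 50.0
```

On a Linux machine, the same function could be called with the contents of /proc/meminfo read from disk.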
Disk I/O, or input/output, is a measure of how much read and write activity the server’s disks are experiencing. This is expressed in terms of MB/s, or megabytes per second.
DigitalOcean breaks disk I/O down into read and write operations, which are handled separately. Droplet graphs show these as two separate lines within the Disk I/O graph.
Separate alert policies can be created to monitor disk read operations and disk write operations.
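As a rough illustration of how a MB/s figure is obtained, the sketch below derives separate read and write rates from two samples of cumulative byte counters. The sample structure, the names, and the choice of 1 MB = 1,000,000 bytes are all assumptions made for the example.

```python
from dataclasses import dataclass

BYTES_PER_MB = 1_000_000  # assumption: decimal megabytes


@dataclass
class DiskSample:
    """Hypothetical snapshot of cumulative disk counters."""
    timestamp: float   # seconds
    read_bytes: int    # total bytes read so far
    write_bytes: int   # total bytes written so far


def disk_rates(first: DiskSample, second: DiskSample) -> tuple:
    """Return (read MB/s, write MB/s) between two samples."""
    seconds = second.timestamp - first.timestamp
    if seconds <= 0:
        raise ValueError("second sample must be later than the first")
    read = (second.read_bytes - first.read_bytes) / seconds / BYTES_PER_MB
    write = (second.write_bytes - first.write_bytes) / seconds / BYTES_PER_MB
    return read, write


a = DiskSample(timestamp=0.0, read_bytes=0, write_bytes=5_000_000)
b = DiskSample(timestamp=10.0, read_bytes=30_000_000, write_bytes=15_000_000)
print(disk_rates(a, b))  # (3.0, 1.0)
```

Because the two rates are returned separately, each one can be compared against its own threshold, which mirrors the separate read and write policies described above.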
Disk usage is a measurement of how much disk space is currently being used. This is expressed as a percentage of the total disk space available on the server.
This value takes into account the Droplet’s root storage and any additional attached block storage devices. The values of each storage device are rolled up into a single value that represents the total storage space of the server.
Alert policies are also interpreted in terms of total disk space.
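A small worked example may help show what rolling several devices into one value means in practice. The sketch below assumes the roll-up is total used space divided by total capacity across all devices; the function name and the figures are hypothetical.

```python
def total_disk_used_percent(devices: list) -> float:
    """Roll several (used, capacity) pairs into one usage percentage.

    Each pair uses the same unit, for example gigabytes.
    """
    used = sum(device_used for device_used, _ in devices)
    capacity = sum(device_capacity for _, device_capacity in devices)
    if capacity <= 0:
        raise ValueError("total capacity must be positive")
    return 100.0 * used / capacity


root_disk = (20, 50)       # 40% full on its own
block_volume = (30, 150)   # 20% full on its own
print(total_disk_used_percent([root_disk, block_volume]))  # 25.0
```

Under this assumption, a small device that is nearly full can be masked by a large, mostly empty one, so a policy on the combined figure may fire later than expected for an individual device.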
Bandwidth is a measurement of the amount of incoming or outgoing traffic passing through the Droplet’s network interfaces. This is expressed in terms of MBps, or megabytes per second.
In Droplet graphs, bandwidth is broken down between public and private traffic. Public bandwidth is bandwidth over the public interface that connects to the internet. Incoming traffic is represented by one line, and outgoing traffic by another.
Private bandwidth is a measure of the traffic on the private interface that allows for communication within a datacenter. This graph will only be displayed if private networking is enabled and the interface has experienced traffic. Again, there are separate lines for incoming and outgoing traffic.
In alert policies, there is no distinction between public and private interfaces, but the separation of inbound and outbound traffic remains. An alert policy can track incoming traffic or outgoing traffic. Alert policies are also defined in terms of MBps.
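One plausible reading of the paragraph above is that an alert policy sees the traffic of both interfaces added together, with inbound and outbound kept apart. The sketch below illustrates that reading only; the summing behavior, the data layout, and the names are assumptions rather than documented behavior.

```python
def combined_bandwidth(interfaces: dict) -> dict:
    """Add per-interface rates (in MBps) into one inbound and one
    outbound figure, ignoring which interface carried the traffic."""
    inbound = sum(rates["inbound"] for rates in interfaces.values())
    outbound = sum(rates["outbound"] for rates in interfaces.values())
    return {"inbound": inbound, "outbound": outbound}


rates = {
    "public": {"inbound": 1.5, "outbound": 0.5},
    "private": {"inbound": 0.25, "outbound": 2.0},
}
totals = combined_bandwidth(rates)
print(totals)  # {'inbound': 1.75, 'outbound': 2.5}

# A hypothetical policy on outgoing traffic with a 2 MBps threshold.
print(totals["outbound"] > 2.0)  # True
```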
DigitalOcean also reports the highest consumers of CPU and memory as a chart within Droplet graphs. The processes are sorted with the highest consumer of the selected resource first. Each process is accompanied by a usage percentage out of the total available resources.
The top CPU users:
The top memory users:
These charts have little bearing on alert policies, though they can help identify which processes may have contributed to triggering an alert.
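The ordering used in these charts, highest consumer of the selected resource first, can be sketched in a few lines. The process records and the function name below are made up for illustration.

```python
def top_consumers(processes: list, resource: str, limit: int = 3) -> list:
    """Return process names sorted by the selected resource, highest first.

    resource is the key to sort by, for example "cpu" or "memory".
    """
    ranked = sorted(processes, key=lambda proc: proc[resource], reverse=True)
    return [proc["name"] for proc in ranked[:limit]]


processes = [
    {"name": "nginx", "cpu": 12.5, "memory": 3.0},
    {"name": "mysqld", "cpu": 40.0, "memory": 35.5},
    {"name": "sshd", "cpu": 0.5, "memory": 0.8},
    {"name": "backup", "cpu": 25.0, "memory": 1.2},
]
print(top_consumers(processes, "cpu"))     # ['mysqld', 'backup', 'nginx']
print(top_consumers(processes, "memory"))  # ['mysqld', 'nginx', 'backup']
```

After an alert fires, a ranking like this is a quick way to see which processes were using the most of the resource in question.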
When working with monitoring technology, some familiarity with common terminology is often helpful. Below, we will cover some of the most frequently used concepts that are relevant to DigitalOcean Monitoring: