So I’ve had quite a long back and forth with DO support about this. The final answer I got from them was:
“Based on the follow-up, it looks like this is a result of the consideration for what is being computed. The specific example outlined:”
cat /proc/meminfo
MemTotal: 4046440 kB
MemFree: 311132 kB
MemAvailable: 1705164 kB
Buffers: 500192 kB
Cached: 612276 kB
...
So given the output of /proc/meminfo above:
# (total - free - cached) / total
(4046440 - 311132 - 612276) / 4046440 = ~78%
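For anyone who wants to reproduce that figure, here is a minimal Python sketch of the formula quoted above. The names parse_meminfo and do_style_used_pct are my own and not anything from DO's agent, and the sample text is just the fields from the output above. Computed exactly, the result comes out nearer 77.2%, so the ~78% in the reply looks like a loose rounding.

```python
# Sketch only: reproduces the (total - free - cached) / total formula quoted above.
# parse_meminfo and do_style_used_pct are hypothetical names, not part of do-agent.

SAMPLE = """\
MemTotal: 4046440 kB
MemFree: 311132 kB
MemAvailable: 1705164 kB
Buffers: 500192 kB
Cached: 612276 kB
"""


def parse_meminfo(text):
    """Turn /proc/meminfo-style text into a {field: kB} dict."""
    fields = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        parts = rest.split()
        # Keep only "Name: <number> ..." lines; skip anything else (e.g. "...").
        if sep and parts and parts[0].isdigit():
            fields[name.strip()] = int(parts[0])
    return fields


def do_style_used_pct(fields):
    """Percent used according to (total - free - cached) / total."""
    total = fields["MemTotal"]
    return (total - fields["MemFree"] - fields["Cached"]) / total * 100


if __name__ == "__main__":
    # On a real droplet you could pass open("/proc/meminfo").read() instead of SAMPLE.
    print(f"{do_style_used_pct(parse_meminfo(SAMPLE)):.1f}% used")  # 77.2% used
```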
Here is what they say about how they measure memory on the docs page:
https://www.digitalocean.com/docs/monitoring/resources/glossary-of-terms/#memory
This still differs quite significantly from the value of MemAvailable
(which, I think, is what most of us are looking at; it’s also what the "available" column shows if you run free -m):
(1705164/4046440) = ~42% avail = 58% used.
I don’t know what accounts for this difference, nor do I know which is the more reliable metric. From what I can gather, though, most people look at memory available rather than doing the maths.
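To put the two numbers side by side, here is a rough Python sketch using only the values from the output above. One thing it makes visible: the formula support quoted subtracts Cached but not Buffers, and Buffers alone is about 12 of the roughly 19 percentage points of difference. As far as I understand, the kernel's MemAvailable estimate also counts other reclaimable memory, but the rest of the output is cut off above, so I can't confirm where the remainder goes. The helper name compare_used_pct is made up for illustration.

```python
# Sketch only: compares the two "used" percentages for the /proc/meminfo values above.
# compare_used_pct is a hypothetical helper, not part of any DO tooling.

MEMINFO_KB = {
    "MemTotal": 4046440,
    "MemFree": 311132,
    "MemAvailable": 1705164,
    "Buffers": 500192,
    "Cached": 612276,
}


def compare_used_pct(m):
    """Return both 'used' percentages plus the share of RAM held in Buffers."""
    total = m["MemTotal"]
    formula_used = (total - m["MemFree"] - m["Cached"]) / total * 100
    available_used = (total - m["MemAvailable"]) / total * 100
    return {
        "formula_used": formula_used,      # (total - free - cached) / total
        "available_used": available_used,  # 100% minus MemAvailable / total
        "gap": formula_used - available_used,
        "buffers": m["Buffers"] / total * 100,
    }


if __name__ == "__main__":
    for key, value in compare_used_pct(MEMINFO_KB).items():
        print(f"{key}: {value:.1f}%")
```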
You can also interact with the DO agent directly at /opt/digitalocean/bin/do-agent
For example, if you run:
/opt/digitalocean/bin/do-agent --stdout-only
on a droplet with monitoring enabled, it will dump the stats in Prometheus format to stdout.
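If you only care about the memory-related lines in that dump, a small filter helps. This is a rough Python sketch, assuming the output follows the usual Prometheus text format (comment lines starting with #, then one sample per line with the value after the last space and no trailing timestamp). I haven't confirmed the agent's exact metric names, so the sample lines below are invented and the filter just matches on the substring "memory"; memory_samples is a hypothetical helper.

```python
# Sketch only: pick memory-related samples out of Prometheus text-format lines.
# The metric names in SAMPLE are invented for illustration; check the real
# do-agent output for the names it actually emits.

SAMPLE = """\
# HELP example_memory_free_bytes made-up metric for illustration
# TYPE example_memory_free_bytes gauge
example_memory_free_bytes 3.18599168e+08
example_cpu_seconds_total{mode="idle"} 12345.6
example_memory_total_bytes 4.14355456e+09
"""


def memory_samples(lines, needle="memory"):
    """Yield (metric, value) for non-comment lines whose metric contains needle."""
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        # Assumes "metric value" with no timestamp, so the value is the last token.
        metric, _, value = line.rpartition(" ")
        if needle not in metric:
            continue
        try:
            yield metric, float(value)
        except ValueError:
            continue  # ignore anything that is not a plain number


if __name__ == "__main__":
    # Swap SAMPLE.splitlines() for sys.stdin (after "import sys") to filter piped output.
    for metric, value in memory_samples(SAMPLE.splitlines()):
        print(metric, value)
```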
In conclusion:
I am not sure that I trust the results that I am getting from DigitalOcean. Furthermore, I’m rather frustrated because these (potentially) misleading results have led me to purchase more droplets than perhaps I needed.
Finally, I wrote an ansible script that might be useful to get the memory available across your droplets:
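---
- hosts: ...
  tasks:
    # Raw output of free -m, for eyeballing the "available" column
    - name: Get memory usage
      shell: free -m
      register: result
    - debug: var=result
    # Memory facts gathered by Ansible (values in MB)
    - debug: var=ansible_memory_mb
    # Percent used, excluding buffers/cache; msg is used because this is a
    # templated expression rather than a variable name
    - debug:
        msg: "{{ ansible_memory_mb.nocache.used / ansible_memory_mb.real.total * 100 }}"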
At the end of the day, I ended up more confused than when I started. I can’t understand why there is such a significant difference between the calculation that they are doing and the results that I am getting. Nor do I know which metrics to trust. I’d greatly appreciate it if someone smarter than me could explain this to me and help me determine an actionable metric!
Hello, is there any update on this? At my company we often get alerts from DigitalOcean about memory running high, but when we check on the server we find usage is much lower and within the normal range. Because of the alerts we keep receiving, we set up a script to restart the server automatically whenever used memory is high, but unfortunately we are still getting the alerts, so there is definitely a mismatch. Please let us know if there is any update!