Question

Is DigitalOcean calculating the memory graph in monitoring wrong?

Posted February 7, 2020 331 views
Linux Basics, Monitoring, Linux Commands

Hello, I’m trying to fully understand the way DigitalOcean is calculating memory usage.

From the docs:
DigitalOcean calculates memory consumption by evaluating memory information exposed in /proc/meminfo.
On DigitalOcean, used memory is calculated by subtracting free memory and memory used for caching from the total memory amount.

If I expose the numbers on my droplet:

MemTotal:        3880364 kB
MemFree:          259960 kB
MemAvailable:    2647216 kB
Buffers:               0 kB
Cached:          1432548 kB
....
Slab:            1457128 kB
SReclaimable:    1371832 kB
....

The result is 3880364 (MemTotal) - (259960 MemFree + 1432548 Cached) = 2187856 kB.
That is about 56% (the same number shown on the monitoring graph).
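
Just to make that explicit, here is the documented formula as a tiny Python check, plugging in the /proc/meminfo values above (purely illustrative, not the do-agent's actual code):

mem_total, mem_free, cached = 3880364, 259960, 1432548  # kB, values from /proc/meminfo above
used_kb = mem_total - (mem_free + cached)                # 2187856 kB
used_pct = used_kb / mem_total * 100                     # ~56.4%, matching the graph
print(used_kb, round(used_pct, 2))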

On the other hand, if I compare this output with the free command:

              total        used        free      shared  buff/cache   available
Mem:        3880364      847548      297784      129948     2735032     2615692
Swap:             0           0           0

I see that the used memory is 847548 kB, which is around 21%.
You can also see that the buff/cache value in the free output is different from the Cached value in /proc/meminfo.

I think that DigitalOcean also needs to add the SReclaimable value from /proc/meminfo to the cache part of the sum.

SReclaimable: The part of the Slab that might be reclaimed (such as caches)

If we sum the SReclaimable and Cached values we get 2804380 kB. That’s pretty close to the buff/cache value in the free output.

After looking at some comparison charts in the Red Hat article "Interpreting /proc/meminfo and free output for Red Hat Enterprise Linux 5, 6 and 7", I found the mapping between the free output and the /proc/meminfo fields:

free output          corresponding /proc/meminfo fields
Mem: buff/cache      Buffers + Cached + Slab

In summary, do you think that DigitalOcean needs to add the SReclaimable value?
And calculate memory usage by subtracting free memory and memory used for caching (Cached & SReclaimable) from the total memory amount?
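
For illustration, here is a rough Python sketch of that proposed calculation using my numbers from above (Buffers happens to be 0 on this droplet; the small gap from free's used column is probably just because the two snapshots were taken at slightly different moments):

mem_total, mem_free = 3880364, 259960                    # kB, values from /proc/meminfo above
cached, sreclaimable, buffers = 1432548, 1371832, 0
used_kb = mem_total - (mem_free + cached + sreclaimable + buffers)
used_pct = used_kb / mem_total * 100                     # ~21%, roughly in line with free's "used"
print(used_kb, round(used_pct, 2))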

Maybe I’m way off, I’m no Linux expert… What do you think?


4 answers

Hi Daniel,

This is an interesting question!

I think that this is already the case. For example, here’s my current memory usage according to the graphs:

Memory graph - 61%

It is currently at around 61.41%. It is a small Droplet and the total RAM is 1006756 KB.

So if we do 1006756 KB - 61.42%, we would get about 388406 KB of RAM left over. And then if I run the free command, I get 382476 KB available, which is pretty close to the 388406 KB value:

                 total        used        free      shared  buff/cache   available
Mem:        1006756      423508       80004        9936      503244      382476
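
To spell out the arithmetic, here is a back-of-the-envelope check in Python (just the numbers from this Droplet, nothing more):

total_kb = 1006756           # MemTotal on this Droplet
graph_used_pct = 61.42       # value read from the monitoring graph
remaining_kb = total_kb * (1 - graph_used_pct / 100)
print(round(remaining_kb))   # ~388406 kB, close to the 382476 kB "available" column above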

Memory Current Usage

What is the current version of your do-agent? I would suggest upgrading it to the latest version; maybe there are some improvements in how the available RAM is calculated.

Also, the do-agent is open source; if you are interested, you could take a look at the source code here:

https://github.com/digitalocean/do-agent

If you believe that there is still something wrong with the calculation, I would suggest creating an issue on GitHub.

Regards,
Bobby

  • Hi @bobbyiliev, I’m using the latest version of the agent (do-agent-3.5.6-1.x86_64), so I don’t think that is the issue.

    If you look at my values from yesterday and follow the same calculation you did above:

    The graph was indicating memory usage at 56%, with total memory 3880364 kB.
    If we do 3880364 - 56% (2173003 kB), we would get about 1707361 kB of available memory. But when I ran the free command, I got 2615692 kB of available memory.

    In my case those numbers are not close…

                  total        used        free      shared  buff/cache   available
    Mem:        3880364      847548      297784      129948     2735032     2615692
    

    I have restarted the server now for another purpose, and if I run those calculations my values get pretty close to yours, but I also see that the SReclaimable value is really small today. This is a very different scenario from yesterday, so I still think there’s something we are missing here.

    Any other ideas?

    I think I’ll open an issue on GitHub when I see this mismatch in values again.

    Thank you for your time and cooperation.

    • Hi @danielpcharrua,

      Yes indeed, I agree, this is an interesting finding! As you mentioned, I think it is a good idea to open an issue on GitHub in case you see the same behavior again.

      In case this happens, if you could share the link to the GitHub issue here, it would be highly appreciated and it would help the community :)

      Regards,
      Bobby

Hi @bobbyiliev,

Same issue is happening again:

droplet graph

The graph is indicating 49.11% memory usage, so following your calculation (total - usage):

3880364 kB - 1905646 kB (49.11%) = 1974718 kB (supposed available memory)

On the other hand, /proc/meminfo reports MemAvailable 2512568 kB and free reports available 2512288 kB. So this makes no sense to me.

When trying to open an issue on GitHub, this text appears:
Please only create a Github issue for bugs related to the code itself. If you are
experiencing an issue with sending metrics, display graphs, errors from the agent,
etc, please contact https://cloudsupport.digitalocean.com/s/ so we can provide support

So, I have opened a support ticket, the third one…

I also found an old question (from 2017) talking about the same issue with calculating memory usage. At first it was a bug, but there are people in 2019 still commenting on that thread.

I’ll post here any updates.
Thank you.

Got news: this time the answer from support includes the actual calculation they use:

Our memory calculation is as follows: (MemTotal - MemFree - Cache) / MemTotal

If I do that math, I get exactly the number the graph is showing. But I don’t think the calculation is done correctly. Here is some information I found in the Linux kernel source tree (the commit message that introduced MemAvailable):

Many load balancing and workload placing programs check /proc/meminfo to
estimate how much free memory is available.  They generally do this by
adding up "free" and "cached", which was fine ten years ago, but is
pretty much guaranteed to be wrong today.

It is wrong because Cached includes memory that is not freeable as page
cache, for example shared memory segments, tmpfs, and ramfs, and it does
not include reclaimable slab memory, which can take up a large fraction
of system memory on mostly idle systems with lots of files.

Currently, the amount of memory that is available for a new workload,
without pushing the system into swap, can be estimated from MemFree,
Active(file), Inactive(file), and SReclaimable, as well as the "low"
watermarks from /proc/zoneinfo.

However, this may change in the future, and user space really should not
be expected to know kernel internals to come up with an estimate for the
amount of free memory.

It is more convenient to provide such an estimate in /proc/meminfo.  If
things change in the future, we only have to change it in one place.

MemAvailable was born

I think the calculation should be: (MemTotal - MemAvailable) / MemTotal
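
As a sketch of the difference, here is a small Python comparison of the two formulas against a live /proc/meminfo (the parsing helper is mine and purely illustrative; it is not how the do-agent actually does it):

# Compare the current formula with the MemAvailable-based one.
# Assumes a Linux system whose /proc/meminfo exposes MemTotal, MemFree, Cached and MemAvailable.
def read_meminfo(path="/proc/meminfo"):
    values = {}
    with open(path) as f:
        for line in f:
            key, rest = line.split(":", 1)
            values[key] = int(rest.strip().split()[0])   # values are reported in kB
    return values

m = read_meminfo()
current  = (m["MemTotal"] - m["MemFree"] - m["Cached"]) / m["MemTotal"] * 100
proposed = (m["MemTotal"] - m["MemAvailable"]) / m["MemTotal"] * 100
print(f"current formula : {current:.2f}% used")
print(f"proposed formula: {proposed:.2f}% used")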

What do you think?

Folks, if you agree with this please upvote the idea here:

https://ideas.digitalocean.com/ideas/CPX-I-17
