Every so often I get fooled by the scary-looking, ever-rising memory graphs on my puny minimum-sized VM running on Digital Ocean. The same thing seems to happen to my colleagues, and we end up having the same discussion about memory usage on virtual machines and work laptops. I'm not an expert on the topic, but I have gathered here a couple of things to demystify the issue.
The first question I ask myself or a fellow developer is: what do you mean by running out of memory? You are most likely talking about memory pressure, right? What are the symptoms? Is it just a gut feeling or is there some verifiable performance degradation?
After that, we can talk about a few seemingly counter-intuitive points:
- The "raw" memory usage should be high. Especially on a busy VM, it is good that frequently used files are mapped to memory.
- The OS does not know when it is out of memory. The Linux kernel allocates memory until it can't.
- You probably should have swap on
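A quick way to sanity-check those last two points; a minimal sketch, and note that reading the kernel log may require root on some systems:

# overcommit policy: 0 = heuristic (the default), 1 = always allow, 2 = strict accounting
cat /proc/sys/vm/overcommit_memory
# is any swap configured, and how much of it is in use
swapon --show
# has the OOM killer already had to step in
dmesg | grep -i "out of memory"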
Before digging into the system, a few concepts should be at least somewhat familiar (a quick way to see some of them side by side follows the list).
- RSS (resident set size)
- Virtual memory
- Cache and buffers, i.e. the page cache
- Memory pages
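To get a first feel for the difference between virtual memory and RSS, ps can print both for a given process; <pid> is a placeholder for a real process id:

# VSZ = virtual size, RSS = resident set size, both in kilobytes
ps -o pid,vsz,rss,comm -p <pid>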
Do the usual thing with free -h and check the buffered and cached sizes. Note that you usually want a lot of files cached; under high memory pressure much of that cache gets dropped, which in itself can cause performance issues. A better option is to look at vmstat -a 1 and check the ratio of inactive to active memory. If there is little inactive memory compared to active, that can be a signal of memory pressure.
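In the vmstat output below, these are the parts worth paying attention to (this assumes the vmstat that ships with procps, which is what most distributions have):

# -a splits memory into "inact" and "active" columns instead of buff/cache,
# -w widens the output, and the trailing 1 prints a new line every second;
# si/so show swap-in/swap-out activity, another sign of memory pressure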
vmstat -aw 1
A more detailed breakdown can be read with cat /proc/meminfo.
What is interesting here is AnonPages (or Active(anon)). As a gross oversimplification, this is all the memory your applications are using in their heaps. Anonymous means the memory is not backed by a file; it was, for example, allocated with malloc.
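To pull out just the anonymous memory counters without reading through the whole file, something like this does the trick:

# AnonPages, Active(anon) and Inactive(anon) are the counters of interest here
grep -i anon /proc/meminfo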
You can also look at individual running processes with pmap -x <pid> and check the heap size and the files mapped into memory.
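Because the full listing can be long, it helps to sort it by resident size. A small sketch; in the pmap -x output from procps the RSS is the third column, and anonymous mappings show up as [ anon ]:

# show the mappings with the largest resident set sizes last
pmap -x <pid> | sort -n -k3 | tail -n 15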
For example, in the pmap output of a Java application, the heap is typically the dominating part, mapped to anonymous memory. To dig deeper into the heap, a more specialized tool is needed, such as a heap dump analyzer.
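For a Java process specifically, the usual way to get such a heap dump is with the JDK's own tooling; a sketch, assuming a JDK that ships jcmd and using a made-up dump path:

# write a heap dump that can be opened in a heap dump analyzer such as Eclipse MAT
jcmd <pid> GC.heap_dump /tmp/heap.hprof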
Using these few tools will give you the data, but the hard part, obviously, is interpreting it. I don't have an easy metric to offer for determining whether you are running out of memory, or rather whether the lack of memory is causing performance problems. What I can tell you is that it is a harder question than one might think.
Much of this blog post is based on Brendan Gregg's Systems Performance book: http://www.brendangregg.com/sysperfbook.html