In a previous blog post, I looked at how to enumerate all the syscalls, and even their arguments, using tools such as eBPF. That left me wanting to learn more about how memory is mapped and what simple variables look like in memory. What is behind all those memory addresses you see in stack traces?
I do have an intuitive sense of that. Sure, I have seen blog posts and talks about the topic and looked at heap dumps while hunting for memory leaks, but I wonder whether it makes any sense to look at memory in a language- and runtime-agnostic manner. Probably not, but hey, it could be exciting.
To find out, I created a simple program that prints out the contents of a few variables.
I make the outputs depend on the runtime environment so that the compiler cannot optimize them away. I want to make sure the memory is allocated at runtime.
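A minimal sketch of such a program could look like the following. It is not the exact code behind the output discussed in this post: the names `snapshot` and `collect`, the choice of values, and the optional sleep-duration argument are assumptions made for illustration.

```go
package main

import (
	"fmt"
	"math/rand"
	"os"
	"time"
)

// snapshot holds values that are only known at runtime, so the compiler
// cannot fold them into constants.
type snapshot struct {
	pwd    string // value of the PWD environment variable
	dir    string // working directory reported by the OS
	number int    // random integer in [0, 100)
}

// collect gathers the runtime-dependent values.
func collect() *snapshot {
	dir, err := os.Getwd()
	if err != nil {
		dir = "unknown"
	}
	rng := rand.New(rand.NewSource(time.Now().UnixNano()))
	return &snapshot{
		pwd:    os.Getenv("PWD"),
		dir:    dir,
		number: rng.Intn(100),
	}
}

func main() {
	s := collect()
	// %p prints the address of each variable, not the address of the
	// string bytes it may refer to.
	fmt.Printf("pwd    at %p = %q\n", &s.pwd, s.pwd)
	fmt.Printf("dir    at %p = %q\n", &s.dir, s.dir)
	fmt.Printf("number at %p = %d (0x%x)\n", &s.number, s.number, s.number)

	// Optional argument: how long to stay alive so a debugger can attach,
	// for example "10m". Without it the program exits right away.
	if len(os.Args) > 1 {
		d, err := time.ParseDuration(os.Args[1])
		if err != nil {
			fmt.Fprintln(os.Stderr, "bad duration:", err)
			os.Exit(1)
		}
		time.Sleep(d)
	}
}
```

Printing the integer in hex as well makes it easier to spot the value in a dump later on.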
I run the code in my trusty DigitalOcean VM with "no hang-up" (nohup) and attach the GNU Project Debugger (GDB) while the program is asleep to take a core dump. It seems to be one of the easiest ways to read the program state from memory. Sure, I could skip the tools and copy and parse the virtual memory files under /proc/PID myself (maps lists the mapped address ranges, mem exposes their contents), yet GDB is more straightforward and is not really tied to any particular runtime. It's no pprof.
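For the tool-free route, each line of /proc/PID/maps describes one mapping: an address range in hex, permissions, offset, device, inode, and an optional path. A rough sketch of parsing one such line follows. The `mapping` type and `parseMapsLine` function are hypothetical names, and the sample line is made up for illustration rather than copied from the real process.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// mapping is one parsed line of /proc/PID/maps.
type mapping struct {
	start, end uint64 // address range, end is exclusive
	perms      string // for example "rw-p"
	path       string // backing file or label, empty for anonymous memory
}

// parseMapsLine parses a line such as
// "c000000000-c000400000 rw-p 00000000 00:00 0".
func parseMapsLine(line string) (mapping, error) {
	fields := strings.Fields(line)
	if len(fields) < 5 {
		return mapping{}, fmt.Errorf("unexpected maps line: %q", line)
	}
	lo, hi, ok := strings.Cut(fields[0], "-")
	if !ok {
		return mapping{}, fmt.Errorf("bad address range: %q", fields[0])
	}
	start, err := strconv.ParseUint(lo, 16, 64)
	if err != nil {
		return mapping{}, err
	}
	end, err := strconv.ParseUint(hi, 16, 64)
	if err != nil {
		return mapping{}, err
	}
	m := mapping{start: start, end: end, perms: fields[1]}
	if len(fields) > 5 {
		// Rejoin the tail; runs of spaces inside a path are collapsed.
		m.path = strings.Join(fields[5:], " ")
	}
	return m, nil
}

func main() {
	// Illustrative line, not taken from the real process.
	m, err := parseMapsLine("c000000000-c000400000 rw-p 00000000 00:00 0")
	if err != nil {
		fmt.Println("parse error:", err)
		return
	}
	fmt.Printf("%#x-%#x %s %q (%d KiB)\n",
		m.start, m.end, m.perms, m.path, (m.end-m.start)/1024)
}
```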
While the gomemory app is waiting in the background, I run GDB to obtain a core dump. The Go program outputs the variables and their addresses, so I can try to find them in the core dump.
I can examine the core file with the GDB CLI tools or alternatively I could hexdump it. The CLI way is easier though and I can reliably find the memory contents. I don't have to wrap my head around any endianness confusion.
Figure: List of files inside the core dump and the executed binary
I can examine a specific memory address with the x <address> command. The Go program output shows the addresses, so it is easy to look up their contents. Let's take a look at the memory locations the program printed to the console.
Figure: Memory address and its contents
Cool! At first glance, the first value does not make any sense as data, so it must be a reference to some other location. The same goes for the second one, which was supposed to contain the string /home/juho/memexp. 1a044 is not sensible ASCII and not a text string, so once again it is a pointer to another location. The last one is the actual contents of the variable: 51 in hex is 81 in decimal, so the memory location indeed contains the random integer generated by the program.
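GDB prints whole words, which hides the byte order. With a raw hex dump the same values appear byte by byte, least significant byte first on little-endian machines such as x86-64. The sketch below shows that decoding step. The `decodeWord` helper is a hypothetical name, and the full pointer value `0xc00001a044` is an assumed example that merely ends in the 1a044 seen above.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// decodeWord interprets the first 8 bytes of raw as a little-endian
// 64-bit word. It panics if raw is shorter than 8 bytes.
func decodeWord(raw []byte) uint64 {
	return binary.LittleEndian.Uint64(raw)
}

func main() {
	// The integer variable: 0x51 followed by zero bytes.
	number := []byte{0x51, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00}
	fmt.Println(decodeWord(number)) // prints 81

	// An assumed pointer value as it would appear in a raw dump.
	pointer := []byte{0x44, 0xa0, 0x01, 0x00, 0xc0, 0x00, 0x00, 0x00}
	fmt.Printf("%#x\n", decodeWord(pointer)) // prints 0xc00001a044
}
```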
So what are those referenced addresses? I can use GDB to inspect the memory addresses and even read a range of addresses, but I decided, after all, to take a hex dump of the virtual memory itself. I ran the app again and took a hex dump of the 0xc0 to 0xc4 range of the process address space (the mapped ranges are listed in /proc/PID/maps and the bytes are readable through /proc/PID/mem). There we can find the 1a040 address.
So it refers to the PWD env var, interesting!
How about that second one?
Indeed, that referenced address contains the printed string. It looks like it does not live in stack memory but on the heap. The Go compiler must have decided it belongs there.
What did this exercise help me understand? Well, firstly, I would not try to debug any real-life Go application like this. If the only thing I had from a Go app were a core dump, I would look into something like Delve. That was something I knew already, though. Secondly, it is actually pretty easy to read virtual memory. After all, everything is just a file.