
Dive into a Go program's memory with GDB

In a previous blog post, I took a look at how to enumerate all the syscalls and even their arguments using tools such as eBPF. That left me pondering and craving to learn more about how memory is mapped and what simple variables look like in memory. What is behind all those memory addresses you can see in stack traces?

I do have an intuitive sense of that. Sure, I have seen blog posts and talks about the topic and taken a look at heap dumps in a hunt for memory leaks, but I wonder whether it makes any sense to look at memory in a language- and runtime-agnostic manner. Probably not, but hey, it could be exciting.

To find out, I created a simple program that prints out the contents of a few variables.

I try to make the outputs depend on the runtime environment to avoid any unexpected compiler optimizations. I want to make sure the memory will be allocated at runtime.
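A minimal sketch of what such a program could look like. This is not the original code: the function name `snapshot`, the `-sleep` flag, and the choice of `os.Getenv("PWD")`, `os.Getwd()` and a random integer below 100 are all assumptions made for illustration. Every value is only known at run time, so the compiler cannot fold any of them into a constant.

```go
package main

import (
	"flag"
	"fmt"
	"math/rand"
	"os"
	"time"
)

// snapshot collects three values that are only known at run time:
// the PWD environment variable, the working directory and a random integer.
// (Hypothetical helper; the original program is not shown in the post.)
func snapshot(seed int64) (pwd string, wd string, n int) {
	wd, err := os.Getwd()
	if err != nil {
		wd = "unknown"
	}
	return os.Getenv("PWD"), wd, rand.New(rand.NewSource(seed)).Intn(100)
}

func main() {
	// Assumed flag: how long to stay alive so a debugger can attach.
	sleep := flag.Duration("sleep", 0, "how long to wait before exiting")
	flag.Parse()

	pwd, wd, n := snapshot(time.Now().UnixNano())

	// Print the address of each variable next to its value so the
	// addresses can be looked up later in a core dump.
	fmt.Printf("%p %s\n", &pwd, pwd)
	fmt.Printf("%p %s\n", &wd, wd)
	fmt.Printf("%p %d\n", &n, n)

	time.Sleep(*sleep)
}
```

Running it with something like `-sleep=10m` would leave enough time to attach a debugger while the process is asleep.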

I run the code in my trusty DigitalOcean VM with "no hang-up" (nohup) and attach the GNU Project Debugger (GDB) while the program is asleep to take a core dump. It seems to be one of the easiest ways to read the program state from memory. Sure, I could skip the tools and read the process memory myself through the virtual files under /proc/PID (maps lists the mapped regions, mem exposes their contents), yet GDB is more straightforward and is not tied to any particular runtime. It's no pprof.

While the gomemory app is waiting in the background, I run GDB to obtain a core dump. The Go program outputs the variables and their addresses, so I can try to find those in the core dump.

I can examine the core file with the GDB CLI tools or alternatively I could hexdump it. The CLI way is easier though and I can reliably find the memory contents. I don't have to wrap my head around any endianness confusion.
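To give an idea of what a debugger does when it looks up an address in a core file, here is a rough sketch using Go's standard `debug/elf` package. It assumes the core dump is an ELF file whose `PT_LOAD` program headers describe the saved memory regions, which is the usual layout on Linux. The `segment` type and `fileOffset` helper are hypothetical names introduced for this example.

```go
package main

import (
	"debug/elf"
	"fmt"
	"os"
	"strconv"
)

// segment describes one saved memory region: where it lived in the process
// (vaddr), how many bytes were written to the dump (size) and where those
// bytes start inside the core file (off).
type segment struct{ vaddr, size, off uint64 }

// fileOffset translates a virtual address into an offset in the core file.
// It reports false when no saved region covers the address.
func fileOffset(segs []segment, addr uint64) (uint64, bool) {
	for _, s := range segs {
		if addr >= s.vaddr && addr-s.vaddr < s.size {
			return s.off + (addr - s.vaddr), true
		}
	}
	return 0, false
}

func main() {
	if len(os.Args) != 3 {
		fmt.Println("usage: coreread <core-file> <address, e.g. 0xc000012345>")
		return
	}
	addr, err := strconv.ParseUint(os.Args[2], 0, 64)
	if err != nil {
		fmt.Println("bad address:", err)
		return
	}
	core, err := os.Open(os.Args[1])
	if err != nil {
		fmt.Println("cannot open core file:", err)
		return
	}
	defer core.Close()

	f, err := elf.NewFile(core)
	if err != nil {
		fmt.Println("not an ELF file:", err)
		return
	}
	// Collect the loadable segments, which hold the dumped memory.
	var segs []segment
	for _, p := range f.Progs {
		if p.Type == elf.PT_LOAD {
			segs = append(segs, segment{p.Vaddr, p.Filesz, p.Off})
		}
	}
	off, ok := fileOffset(segs, addr)
	if !ok {
		fmt.Println("address is not part of the dump")
		return
	}
	buf := make([]byte, 8)
	if _, err := core.ReadAt(buf, int64(off)); err != nil {
		fmt.Println("read failed:", err)
		return
	}
	fmt.Printf("0x%x: % x\n", addr, buf) // raw bytes, in file order
}
```

The bytes come out in file order, so on a little-endian machine multi-byte values read "backwards", which is exactly the confusion GDB spares me.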

List of files inside the core dump and the executed binary

I can access a specific memory address with the x <address> command (GDB commands are case-sensitive, so it is a lowercase x). The Go program output shows the addresses, so it is easy to see their contents. Let's take a look at the memory locations which the program printed to the console.

Memory address and its contents

Cool! At first glance, the first variable does not make any sense, so it must be a reference to some other location. The same goes for the second one, which was supposed to contain the string /home/juho/memexp. 1a044 is not a sensible ASCII code and it is not a text string, so once again it is a pointer to another location. The last one is the actual contents of the variable. 51 in hex is 81 in decimal, so the memory location indeed contains the random integer generated by the program.
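The hex-to-decimal step can be checked with a few lines of Go. The sketch below assumes a little-endian machine (x86-64, which is typical for cloud VMs but not confirmed for this one) and a 64-bit integer. `decodeWord` is a hypothetical helper name.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// decodeWord interprets 8 raw bytes, in the order a byte-wise hex dump
// lists them, as a little-endian 64-bit value.
// It panics if fewer than 8 bytes are passed.
func decodeWord(raw []byte) uint64 {
	return binary.LittleEndian.Uint64(raw)
}

func main() {
	// The integer variable as it would sit in memory on a little-endian
	// machine: least significant byte first.
	raw := []byte{0x51, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00}
	v := decodeWord(raw)
	fmt.Printf("hex 0x%x = decimal %d\n", v, v) // prints: hex 0x51 = decimal 81
}
```

GDB's x command does this decoding for me when asked for word-sized units, whereas a plain byte-wise hex dump shows the bytes in memory order.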

So what are those referenced addresses? I can use GDB to inspect the memory addresses and even read a range of addresses, but I decided, after all, to take a hex dump straight from the process's virtual memory. I ran the app again, looked up the mapped regions in /proc/PID/maps, and took a hex dump of the address space file /proc/PID/mem from the 0xc0 to 0xc4 range. There we can find the 1a040 address.
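The same lookup can be sketched in Go. On Linux, /proc/PID/maps is a text file listing the mapped address ranges, and /proc/PID/mem exposes the memory contents at those addresses. The example below reads its own process (`self`) to stay self-contained. Reading another process generally needs suitable permissions. The names `region`, `parseMapsLine`, `readMem` and `sample` are hypothetical.

```go
package main

import (
	"bufio"
	"fmt"
	"os"
	"strconv"
	"strings"
	"unsafe"
)

// region is one line of /proc/PID/maps: an address range and its permissions.
type region struct {
	start, end uint64
	perms      string
}

// parseMapsLine parses a line shaped like
// "start-end perms offset dev inode [path]" with hexadecimal addresses.
func parseMapsLine(line string) (region, error) {
	fields := strings.Fields(line)
	if len(fields) < 2 {
		return region{}, fmt.Errorf("unexpected maps line: %q", line)
	}
	bounds := strings.SplitN(fields[0], "-", 2)
	if len(bounds) != 2 {
		return region{}, fmt.Errorf("unexpected address range: %q", fields[0])
	}
	start, err := strconv.ParseUint(bounds[0], 16, 64)
	if err != nil {
		return region{}, err
	}
	end, err := strconv.ParseUint(bounds[1], 16, 64)
	if err != nil {
		return region{}, err
	}
	return region{start: start, end: end, perms: fields[1]}, nil
}

// readMem reads n bytes at virtual address addr from /proc/<pid>/mem.
func readMem(pid string, addr uint64, n int) ([]byte, error) {
	f, err := os.Open("/proc/" + pid + "/mem")
	if err != nil {
		return nil, err
	}
	defer f.Close()
	buf := make([]byte, n)
	if _, err := f.ReadAt(buf, int64(addr)); err != nil {
		return nil, err
	}
	return buf, nil
}

// sample is a package-level variable, so its address stays fixed.
var sample uint64 = 81

func main() {
	maps, err := os.Open("/proc/self/maps")
	if err != nil {
		fmt.Println("cannot open maps:", err)
		return
	}
	defer maps.Close()

	// List the readable and writable regions, where variables can live.
	sc := bufio.NewScanner(maps)
	for sc.Scan() {
		if r, err := parseMapsLine(sc.Text()); err == nil && strings.HasPrefix(r.perms, "rw") {
			fmt.Printf("%x-%x %s\n", r.start, r.end, r.perms)
		}
	}

	addr := uint64(uintptr(unsafe.Pointer(&sample)))
	raw, err := readMem("self", addr, 8)
	if err != nil {
		fmt.Println("cannot read memory:", err)
		return
	}
	fmt.Printf("0x%x: % x\n", addr, raw)
}
```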


So it refers to the PWD env var, interesting!

How about that second one?


Indeed, that referenced address contains the printed string. It looks like it is not in the stack frame memory but resides in the heap. The Go compiler must have decided it is supposed to be there.
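Why does a string variable hold an address instead of text? In the standard Go toolchain, a string value is a small header made of a pointer to the bytes and a length, and the bytes themselves live elsewhere. The sketch below peeks at that header with `unsafe`. The `stringHeader` layout is an assumption about the current gc implementation rather than something the language specification guarantees, and `inspect` is a hypothetical helper.

```go
package main

import (
	"fmt"
	"os"
	"unsafe"
)

// stringHeader mirrors how the gc toolchain lays out a string value:
// a pointer to the bytes followed by the length.
// Assumption: this is an implementation detail, not a language guarantee.
type stringHeader struct {
	data unsafe.Pointer
	len  int
}

// inspect returns the address stored inside the string variable and the
// length recorded next to it.
func inspect(s *string) (uintptr, int) {
	h := (*stringHeader)(unsafe.Pointer(s))
	return uintptr(h.data), h.len
}

func main() {
	wd, err := os.Getwd()
	if err != nil {
		wd = "unknown"
	}
	data, n := inspect(&wd)
	// The variable itself only holds the pointer and the length;
	// the characters are found at the address the pointer refers to.
	fmt.Printf("variable at %p holds pointer 0x%x and length %d\n", &wd, data, n)
}
```

Examining the variable's own address therefore shows the pointer, and only following that pointer leads to the text.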

What did this exercise help me understand? Well, firstly, I would not try to debug any real-life Go application like this. If the only thing I had from a Go app was a core dump, I would look into something like Delve. That was something I knew already, though. Secondly, it is actually pretty easy to read virtual memory. After all, everything is just a file.


