Dive into a Go program's memory with GDB

In a previous blog post, I took a look at how to enumerate all the syscalls and even their arguments using tools such as eBPF. That left me pondering and craving to learn more about how memory is mapped and what simple variables look like in memory. What is behind all those memory addresses you can see in the stack traces?

I do have an intuitive sense of that. Sure, I have seen blog posts and talks about the topic and taken a look at heap dumps in a hunt for memory leaks, but I wonder whether it makes any sense to look at memory in a language- and runtime-agnostic manner. Probably not, but hey, could be exciting.

To find out, I wrote a small program that prints the contents and addresses of a few variables.

I make the values depend on the runtime environment so that the compiler cannot fold them into constants or optimize them away. I want to make sure the memory is allocated at runtime.
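A minimal sketch of such a program, under a few assumptions: the variable names, the use of `PWD`, `os.Getwd` and a random integer below 100, and the optional sleep argument are illustrative choices based on the values discussed later in this post, not a definitive listing.

```go
package main

import (
	"fmt"
	"math/rand"
	"os"
	"time"
)

// values collects inputs that are only known at run time, so the compiler
// cannot fold them into constants.
func values() (pwd, cwd string, n int) {
	pwd = os.Getenv("PWD") // environment variable; may be empty
	cwd, _ = os.Getwd()    // working directory; error ignored in this sketch
	rng := rand.New(rand.NewSource(time.Now().UnixNano()))
	n = rng.Intn(100) // pseudo-random integer in [0, 100)
	return pwd, cwd, n
}

// napFor parses an optional duration argument such as "10m".
// With no argument, or an invalid one, it returns zero.
func napFor(args []string) time.Duration {
	if len(args) == 0 {
		return 0
	}
	d, err := time.ParseDuration(args[0])
	if err != nil {
		return 0
	}
	return d
}

func main() {
	pwd, cwd, n := values()
	// %p prints the address of each variable so it can be looked up later.
	fmt.Printf("pwd %p %q\n", &pwd, pwd)
	fmt.Printf("cwd %p %q\n", &cwd, cwd)
	fmt.Printf("n   %p %d\n", &n, n)
	// Stay alive long enough to attach a debugger, e.g. `./gomemory 10m`.
	time.Sleep(napFor(os.Args[1:]))
}
```

Taking the address of each variable and handing it to `fmt.Printf` also makes the values escape, which is one more reason they end up in memory that can be inspected afterwards.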

I run the code on my trusty DigitalOcean VM with `nohup` ("no hang-up") and attach the GNU Project Debugger (GDB) while the program is asleep to take a core dump. It seems to be one of the easiest ways to read the program state from memory. Sure, I could skip the tools and copy and parse the virtual memory files under /proc/PID myself (/proc/PID/maps lists the mapped regions, /proc/PID/mem exposes their contents), yet GDB is more straightforward and is not really runtime specific. It's no pprof.

While the gomemory app is waiting in the background, I run GDB to obtain a core dump. The Go program prints the variables and their addresses, so I can try to find them in the core dump.
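For the do-it-yourself route, each line of /proc/PID/maps starts with an address range and a permission string (`start-end perms offset dev inode path`). A rough sketch of parsing that much in Go; the `region` type, the `parseMapsLine` helper and the sample line are made up for illustration and are not taken from the gomemory process.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// region holds the part of a maps line this sketch cares about.
type region struct {
	start, end uint64
	perms      string
}

// parseMapsLine extracts the address range and permissions from one line.
func parseMapsLine(line string) (region, error) {
	fields := strings.Fields(line)
	if len(fields) < 2 {
		return region{}, fmt.Errorf("too few fields in %q", line)
	}
	bounds := strings.SplitN(fields[0], "-", 2)
	if len(bounds) != 2 {
		return region{}, fmt.Errorf("bad address range %q", fields[0])
	}
	// Addresses are written in hexadecimal without a 0x prefix.
	start, err := strconv.ParseUint(bounds[0], 16, 64)
	if err != nil {
		return region{}, err
	}
	end, err := strconv.ParseUint(bounds[1], 16, 64)
	if err != nil {
		return region{}, err
	}
	return region{start: start, end: end, perms: fields[1]}, nil
}

func main() {
	// Hypothetical sample line, chosen only to show the format.
	r, err := parseMapsLine("c000000000-c000400000 rw-p 00000000 00:00 0")
	if err != nil {
		fmt.Println("parse error:", err)
		return
	}
	fmt.Printf("%#x-%#x %s (%d KiB)\n", r.start, r.end, r.perms, (r.end-r.start)/1024)
}
```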

I can examine the core file with the GDB CLI tools or alternatively I could hexdump it. The CLI way is easier though and I can reliably find the memory contents. I don't have to wrap my head around any endianness confusion.
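To illustrate the endianness point: assuming the VM is a little-endian machine such as x86-64 (an assumption on my part), a raw hex dump shows multi-byte values least significant byte first. A small sketch of decoding one 8-byte word; the pointer-like value is hypothetical, and `decodeWord` is just a helper name.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// decodeWord interprets 8 raw bytes as a little-endian 64-bit value,
// which is how they would appear in a hex dump on x86-64.
func decodeWord(b []byte) uint64 {
	return binary.LittleEndian.Uint64(b)
}

func main() {
	// A small integer: only the first byte is non-zero.
	small := []byte{0x51, 0, 0, 0, 0, 0, 0, 0}
	// A hypothetical pointer-like value, bytes in dump order.
	ptr := []byte{0x45, 0x23, 0x01, 0x00, 0xc0, 0x00, 0x00, 0x00}

	fmt.Println(decodeWord(small))
	fmt.Printf("%#x\n", decodeWord(ptr))
}
```

GDB's `x` command does this reassembly for you when asked for word-sized units, which is the convenience mentioned above.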

List of files inside the core dump and the executed binary

I can access a specific memory address with the `x <address>` command. The Go program output shows the addresses, so it is easy to see their contents. Let's take a look at the memory locations that the program printed to the console.

Memory address and its contents

Cool! At first glance, the first value does not make any sense as data, so it must be a reference to some other location. The same goes for the second one, which was supposed to contain the string /home/juho/memexp. 1a044 is no sensible ASCII code and not a text string, so once again it is a pointer to another location. The last one is the actual contents of the variable: 51 in hex is 81 in decimal, so the memory location indeed contains the random integer generated by the program.
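A likely explanation for the pointer: with the standard Go toolchain, a string variable is a small header of two machine words, a pointer to the bytes followed by the length, so the address of the variable holds a pointer and not the text. The sketch below peeks at that header with `unsafe`. It relies on an implementation detail rather than a language guarantee, and `header` is a helper name made up for this example.

```go
package main

import (
	"fmt"
	"os"
	"unsafe"
)

// header returns the two machine words stored at the address of s.
// This leans on the gc toolchain's string layout (data pointer, length),
// which is an implementation detail.
func header(s *string) (data uintptr, length int) {
	words := (*[2]uintptr)(unsafe.Pointer(s))
	return words[0], int(words[1])
}

func main() {
	cwd, _ := os.Getwd() // a string that is only known at run time
	data, n := header(&cwd)
	fmt.Printf("variable at %p holds data=%#x len=%d\n", &cwd, data, n)
	fmt.Printf("size of the string variable itself: %d bytes\n", unsafe.Sizeof(cwd))
}
```

An `int`, by contrast, is stored inline, which matches the third variable showing its value directly.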

So what are those referenced addresses? I can use GDB to inspect the memory addresses and even read a range of addresses, but I decided, after all, to take a hex dump straight from the process's virtual memory. I ran the app again and took a hex dump of the address space file, /proc/PID/mem, from the 0xc0 to 0xc4 range (/proc/PID/maps only lists which regions are mapped). There we can find the 1a040 address.


So it refers to the PWD env var, interesting!

How about that second one?


Indeed, that referenced address contains the printed string. It looks like it is not in the stack frame memory but resides in the heap. The Go compiler must have decided it is supposed to be there.

What did this exercise help me understand? Well, firstly, I would not try to debug any real-life Go application like this. If the only thing I had from a Go app was a core dump, I would look into something like Delve. That was something I knew already, though. Secondly, it is actually pretty easy to read virtual memory. After all, everything is just a file.
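To make the "just a file" point concrete, here is a small Linux-only sketch in which a Go program reads its own memory back through /proc/self/mem. The names `peek`, `addrOf` and `sample` are made up for this example; reading another process's /proc/PID/mem works along the same lines but needs the appropriate ptrace permissions.

```go
package main

import (
	"fmt"
	"os"
	"unsafe"
)

// sample lives in a package-level variable so its backing array stays at a
// fixed address for the lifetime of the program.
var sample = []byte("hello from my own address space")

// addrOf returns the virtual address of the first byte of b.
func addrOf(b []byte) uintptr {
	return uintptr(unsafe.Pointer(&b[0]))
}

// peek reads n bytes at virtual address addr from this process's memory
// by treating /proc/self/mem as an ordinary file (Linux only).
func peek(addr uintptr, n int) ([]byte, error) {
	f, err := os.Open("/proc/self/mem")
	if err != nil {
		return nil, err
	}
	defer f.Close()
	buf := make([]byte, n)
	// The file offset is simply the virtual address.
	if _, err := f.ReadAt(buf, int64(addr)); err != nil {
		return nil, err
	}
	return buf, nil
}

func main() {
	got, err := peek(addrOf(sample), len(sample))
	if err != nil {
		fmt.Println("could not read /proc/self/mem:", err)
		return
	}
	fmt.Printf("%#x: %q\n", addrOf(sample), got)
}
```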


