
Pragmatic developer's guide to river crossing

TODOs, FIXMEs, issue links, and hacks are terms in a codebase that get the blood boiling for some, but nowadays I am almost always glad to see them. They have saved me from ghost chases more times than I can count. My thoughts on the matter are best explained by this little tale.

Let's imagine a dev team tasked with figuring out a way to transport a wolf, a goat, and a cabbage across a river. They have been contracted by a client who is an expert in transport systems.

After a discussion with the domain specialists, the developers conclude that renting a boat for the river crossing is the way to go, since it is the most cost-effective option.

The transportation design is abstracted with an open-source library. Perfect! Such an easy job, right? Provided, of course, that the library API is decent.

It turns out the library, although simple, could be better. It is imperative, stateful, and cumbersome: very outdated stuff. Fortunately, there are only three public functions: init, load, and moveToNextBank. The library is initialized with an array of the transported items, while load takes an index of an element in that array. One transported thing at a time. Weird!
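The library in the tale is fictional, so this is only a guess at its shape, but going by the description, its public surface might look roughly like this Python stand-in (the camelCase moveToNextBank is the library's naming; everything else here is an assumption):

```python
class TransportLib:
    """Minimal stand-in for the fictional transport library.
    The method names come from the tale; the rest is assumed."""

    def init(self, items):
        # The whole payload is registered up front...
        self.items = list(items)
        self.boat = []
        self.bank = 0  # 0 = starting bank, 1 = far bank

    def load(self, index):
        # ...but only one item, addressed by index, fits in the boat.
        # An out-of-range index raises IndexError.
        self.boat = [self.items[index]]

    def moveToNextBank(self):
        # Row to the opposite bank.
        self.bank ^= 1
```

Stateful, index-based, and one item per trip: exactly the kind of API that invites ordering bugs.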

They start with this code.
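The original snippet is not preserved here, so this is a sketch of what the team's first, naive attempt might have looked like, with a minimal stand-in for the fictional library inlined so it runs:

```python
class TransportLib:  # minimal stand-in for the fictional library
    def init(self, items): self.items, self.boat = list(items), []
    def load(self, index): self.boat = [self.items[index]]
    def moveToNextBank(self): pass  # row to the opposite bank

lib = TransportLib()
lib.init(["wolf", "goat", "cabbage"])

# Ferry everything over, one index at a time. In the tale, this is
# where the trouble starts: moving the wolf (index 0) first makes the
# buggy library silently drop the last index, so a later load(2)
# blows up with an index out-of-bounds error.
for i in range(3):
    lib.load(i)
    lib.moveToNextBank()  # across
    lib.moveToNextBank()  # and back, empty, for the next item
```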

But when they run the program, they get an index out-of-bounds error. The baffled developers tear their hair out trying to figure out what is happening.

After some trial and error, they realize that the payload array loses the last index if the wolf is moved first. Bizarrely, they notice that if they carry the cabbage first, the middle index (goat) disappears from the boat state. 

After hours and hours of brute-force hacking, they figure out a bunch of edge cases that corrupt the library's initial state. Working around the bugs, they write the following code, successfully transporting the payload.
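Again, the actual snippet is lost, but the brute-forced workaround might have looked something like this: an exact incantation of moves that happens to dodge the state-corrupting orderings, with no explanation of why (stand-in library inlined so the sketch runs):

```python
class TransportLib:  # minimal stand-in for the fictional library
    def init(self, items): self.items, self.boat = list(items), []
    def load(self, index): self.boat = [self.items[index]]
    def moveToNextBank(self): pass

lib = TransportLib()
lib.init(["wolf", "goat", "cabbage"])  # indices: 0 wolf, 1 goat, 2 cabbage

# Don't touch the ordering below. It works.
lib.load(1); lib.moveToNextBank()   # goat across
lib.moveToNextBank()                # row back empty
lib.load(0); lib.moveToNextBank()   # wolf across
lib.load(1); lib.moveToNextBank()   # bring the goat back
lib.load(2); lib.moveToNextBank()   # cabbage across
lib.moveToNextBank()                # row back empty
lib.load(1); lib.moveToNextBank()   # goat across again
```

Note the only comment: "Don't touch the ordering below." Whoever reads this next has no idea which moves are puzzle logic and which are bug workarounds.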

Job done! They check in the code and move on to the next assignment.

🤔

We can do better! Let's rewind to the beginning.

In an alternate timeline, the team taking on the task notices the library's quirks, searches the internet for alternatives to the transport library, and unfortunately discovers none. Accepting this, and given the time pressure, they reluctantly stick with it.

When they run into bugs in the library, they write issues and pull requests to the open-source repository. Their code ends up like this.
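A sketch of what the alternate-timeline version might look like: the same workaround moves, but every quirk is flagged inline and tied to an upstream report (the issue references are placeholders, since the library is fictional):

```python
class TransportLib:  # minimal stand-in for the fictional library
    def init(self, items): self.items, self.boat = list(items), []
    def load(self, index): self.boat = [self.items[index]]
    def moveToNextBank(self): pass

lib = TransportLib()
lib.init(["wolf", "goat", "cabbage"])  # indices: 0 wolf, 1 goat, 2 cabbage

# FIXME: loading index 0 first corrupts the payload array (the last
# index is dropped). Reported upstream with a pull request; remove
# this ordering constraint once the fix is released.
# FIXME: loading index 2 first drops index 1 from the boat state.
# Also reported upstream.
lib.load(1); lib.moveToNextBank()   # goat first: the only safe opener,
lib.moveToNextBank()                # which also satisfies the puzzle
lib.load(0); lib.moveToNextBank()   # wolf across
lib.load(1); lib.moveToNextBank()   # bring the goat back
lib.load(2); lib.moveToNextBank()   # cabbage across
lib.moveToNextBank()                # row back empty
lib.load(1); lib.moveToNextBank()   # goat across again
```

Same moves, same clutter, but now the next reader can tell the puzzle logic from the bug workarounds, and knows exactly when the hacks can be deleted.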

The team understands the code could be better and that they could isolate the unclean API from the rest of the codebase, but since they contributed the fixes upstream, they decide to live with the clutter until a fixed version is released. They also write a unit test that fails if the library is updated, enforcing a rewrite.
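One way such a tripwire test might look (the version string and package name are made up, since the library is fictional): pin the version the workarounds were written against, so an upgrade fails loudly and forces someone to revisit the hacks.

```python
# Version of the transport library the ordering hacks were written
# against. Bump this ONLY after re-checking the FIXMEs.
EXPECTED_VERSION = "0.3.1"  # assumed version, for illustration

def get_installed_version():
    # Stand-in for reading the real installed version, e.g. via
    # importlib.metadata.version("transport-lib").
    return "0.3.1"

def test_transport_lib_not_upgraded():
    assert get_installed_version() == EXPECTED_VERSION, (
        "Transport lib was updated: check whether the upstream fixes "
        "landed, and if so, delete the ordering hacks."
    )
```

The assertion message does the real work: it tells whoever breaks the build what to do next.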

Once you have to work with something confusing, clarify in your code (not just in the issue tracker) what is illogical and why. Of course, in an ideal world, one creates abstractions that hide the mess, but sometimes that is simply not possible or feasible.


They also could have just asked OpenAI!
