The most valuable comments I find in any given codebase look like this:
These comments do not explain what the code does. They explain why the code looks the way it does. They bring to light historical context, failed attempts, and external constraints that are otherwise invisible.
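To make this concrete, here is a hypothetical sketch in Python. The function `serialize_order`, the billing service, and the double-charge incident are all invented for illustration; only the shape of the comment matters.

```python
import json


def serialize_order(order: dict) -> str:
    # HACK (intentional): keys are sorted and separators are fixed on purpose.
    # A (hypothetical) downstream billing service hashes this exact string to
    # deduplicate requests. An earlier version without sort_keys produced
    # different hashes for identical orders, which led to double charges.
    # Do not "simplify" this to a plain json.dumps(order).
    return json.dumps(order, sort_keys=True, separators=(",", ":"))
```

Nothing in the comment restates what `json.dumps` does. It records the constraint (a consumer that hashes the string), the failed attempt (unsorted keys), and a warning for whoever touches the line next.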
We all occasionally fail to communicate our intent to the next developer. That is normal and unavoidable. What matters is leaving a clear mark when something non-obvious or hacky is done on purpose.
Increasingly, the “next developer” is a metal-headed clanker: an LLM. LLMs are actually quite good at reading and understanding these comments. The problem is not comprehension. The problem is behavior. They tend to remove these comments, and when they introduce new hacks, whether on instruction or on their own initiative, they fail to leave a “hack warning” comment behind.
Anyone who works with LLM-generated code has seen the opposite failure mode as well.
By default, LLMs are extremely verbose with comments. They happily pollute a codebase with low-value noise: restating what the code already says, line by line. Writing comments appears to be a behavior that is surprisingly hard to disable.
To add insult to injury, they often overwrite or delete the few comments that actually matter, the why comments, while adding commentary explaining trivial stuff that is self-evident from the code itself. The result is strictly worse than either a well-commented or an uncommented codebase: the signal is gone, the noise remains.
This led me to wonder whether hack documentation should live somewhere else entirely. Should there be a HACKS.md? Or would it be enough to give the agent explicit instructions about how to treat this class of comments?
Since there is no HACKS.md, I added a short, explicit rule to AGENTS.md:
NB; (nota bene).
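A written rule can also be backed by a mechanical check. The sketch below is one possible approach, not something from my setup: it assumes intentional hacks are tagged with a marker such as `NB:` or `HACK:` (the exact marker is an assumption), and it reports tagged comments that exist in the old version of a file but are missing from the new one. The names `tagged_comments` and `dropped_comments` are hypothetical.

```python
import re

# Assumed convention: why-comments start with "NB:" or "HACK:".
# Only single-line "#" comments are handled in this sketch.
MARKER = re.compile(r"#\s*(NB|HACK):\s*(.+)")


def tagged_comments(source: str) -> set:
    """Return the text of every marker-tagged comment in the source."""
    found = set()
    for line in source.splitlines():
        match = MARKER.search(line)
        if match:
            found.add(match.group(2).strip())
    return found


def dropped_comments(before: str, after: str) -> set:
    """Return tagged comments present in `before` but missing from `after`."""
    return tagged_comments(before) - tagged_comments(after)


BEFORE = """\
def total(items):
    # NB: prices are integer cents; floats caused rounding drift in invoices
    return sum(items)
"""

AFTER = """\
def total(items):
    # Sum the items
    return sum(items)
"""

if __name__ == "__main__":
    for comment in sorted(dropped_comments(BEFORE, AFTER)):
        print(f"dropped why-comment: {comment}")
```

Run against the two sample strings, this flags the `NB:` comment that was replaced by a noise comment. A check like this could run in a pre-commit hook or in CI on agent-authored changes. It would also flag a comment that was merely reworded, so it is a prompt for human review rather than a hard gate.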