Jeremy Friesen

Originally published at takeonrules.com

Commit Messages Do Matter

Responding to Those Who Say They Don’t

In the Irreal post Red Meat Friday: Commit Messages Don’t Matter, the author references Matt Rickard’s post Commit Messages Don’t Matter. Below is the Irreal quote:

Matt Rickard makes the case that commit messages don’t matter and that our time is better spent elsewhere. My personal policy is to give a hint as to what the commit does and leave the details to diff or other appropriate commands. Once you get out of code police mode, it’s pretty clear that Rickard is right. But, of course, none of us want to get condemned for apostasy so we pretend to agree with the received wisdom.

On the surface, I disagree with the above.

Commit messages do matter; they are your opportunity to create one more artifact for the future archaeological efforts of reconstructing context. Your chance to say the why, to paraphrase the what, and to connect to prior work.

Let’s dig into where I agree and build out from there.

From Matt Rickard: “pull requests are the de facto reviewable unit and workflows should probably be designed around them.”

The above assumes pull requests made through something like GitHub. But to generalize, let’s talk about them as change sets. The conversation around the change set is critical.

The change set is what we are accepting into the code base. I like to think of the commit message as the cover letter for that change.

My personal philosophy of commit messages is to include the following:

  1. Short descriptive title.
  2. A statement of why the change is needed.
  3. References to issues or other changes.
  4. Optionally, a before-and-after description.
  5. Optionally, the exception or call trace that this commit addresses.
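
To make the list concrete, here is a minimal sketch that assembles a message from those five parts. The `CommitMessage` class, its field names, and the "Related to:" label are hypothetical and chosen only for illustration; they are not part of Git or of any particular tool.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class CommitMessage:
    """Hypothetical container for the five parts listed above."""

    title: str                                            # 1. short descriptive title
    why: str                                              # 2. why the change is needed
    references: List[str] = field(default_factory=list)   # 3. issues or other changes
    before_after: Optional[str] = None                    # 4. optional before/after description
    trace: Optional[str] = None                           # 5. optional exception or call trace

    def render(self) -> str:
        # The title comes first; every later section is separated by a blank line.
        sections = [self.title, self.why]
        if self.references:
            # "Related to:" is an illustrative label, not a Git convention.
            sections.append("\n".join(f"Related to: {ref}" for ref in self.references))
        if self.before_after:
            sections.append(self.before_after)
        if self.trace:
            sections.append(self.trace)
        return "\n\n".join(sections) + "\n"
```

Calling `CommitMessage(title="...", why="...").render()` returns text that could be saved to a file and passed to `git commit -F <file>`, or pasted as the opening message of a pull request. The optional sections are simply omitted when they are not set.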

I then use that message to populate the pull request’s initial message.
I wrote about that in Adding Emacs Function for my Forem Pull Requests.
Ideally, once we merge the change set, we’ll also have a reference to the pull request as part of the commit. At a minimum we have the SHA and can look that up. If you use GitHub’s squash and merge strategy, it will append the pull request number to the commit title.
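
As a small illustration of following that reference back, the sketch below pulls the pull request number out of a commit subject line. It assumes the subject ends with the number in the form `(#123)`, which is how GitHub's default squash-merge title looks. The function name `pull_request_number` is hypothetical, and a team that edits its squash titles would need a different pattern.

```python
import re
from typing import Optional

# Assumption: the subject ends with "(#<digits>)", as in GitHub's default squash title.
PR_SUFFIX = re.compile(r"\(#(\d+)\)\s*$")


def pull_request_number(subject: str) -> Optional[int]:
    """Return the trailing pull request number, or None when the subject has none."""
    match = PR_SUFFIX.search(subject)
    return int(match.group(1)) if match else None
```

The subject lines themselves can come from `git log --format=%s`.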

The goal in all of the above is to provide wayfinding for those reviewing code as well as those later trying to understand context. Those references are an effort to connect into the potential “knowledge graph” that is being built through the development of the software.

All of this follows the principle in software development that you will likely spend more time reading code than writing it.

Top comments (3)

Rense Bakker

I don't think the "spend more time reading than writing" was meant to be interpreted as: "you should write novellas in your commit message so other devs spend more time reading" :B In all seriousness though... I've never read anyone's commit message when doing a code review and I don't think anyone ever read mine... Commit messages probably rank somewhere in the top 10 of places where you can safely store secrets 😅

I think other principles should also be considered, like: "don't waste time on things that do not add value". In the odd chance (1 in a trillion?) that your commit history is going to be used as forensic evidence to solve a murder, that's going to be valuable info for the detective, but in most other cases, the commit hash is all you need to revert your branch to a point where it was hopefully not broken.

Jeremy Friesen

I write my commit messages as the primary body for my pull requests, following the principle of don't repeat yourself.

but in most other cases, the commit hash is all you need to revert your branch to a point where it was hopefully not broken.

Yes, the SHA is what you need to revert, but haven't you ever read code and wondered, "Well, what is that there for?" A code comment is the quickest to find, followed closely by a git annotate.

Rense Bakker

If I see code that is not self-explanatory, I use git blame to see who wrote it and then I ask that person to do a better job 😁

In the event that someone makes changes to legacy code that is not self-explanatory, I ask them to refactor it. If nobody understands the legacy code well enough to do a full refactor, but the code works in production, I ignore its existence as long as possible. When that strategy fails, I start pushing for time to do a full rewrite of the part of the program that has bad legacy code, and if that also fails and they still try to make me responsible for crap code that I did not write, I make sure I'm out of there ASAP.