DEV Community

Oliver Jumpertz


You should consider adding more detail to your commit messages

Whether you're working on your own or with a team, I'm pretty sure that every one of us has written commit messages like this:

$ git commit -m "Fixed some stuff"

or maybe you at least write your commit messages like this:

$ git commit -m "Added a new endpoint for trucks"

At some point, your git log might end up like this:

commit ae43284df673cdd3fa14b96766fd3c6fe2a06051 (HEAD -> master)
Author: Oliver Jumpertz <you@me.com>
Date:   Sat Aug 31 13:45:37 2019 +0200

    Added a new method to calculate a cat's weight

commit 28773ac45fda5111d7258789492fc4a7836c9411
Author: Oliver Jumpertz <you@me.com>
Date:   Tue Aug 13 11:31:43 2019 +0200

    Fixed a bug

commit b5ee1d9c09545b03470bb96f64cbad220bc30754
Author: Oliver Jumpertz <you@me.com>
Date:   Thu Aug 1 13:43:37 2019 +0200

    Fixed some stuff

commit 573af7670d7bcb19c77fbf323e34fbfb536037ee
Author: Oliver Jumpertz <you@me.com>
Date:   Mon Jul 29 15:22:45 2019 +0200

    Initial commit

Some basics

A reminder about what git is

Git, first of all, is a decentralized version control system. Its purpose is to enable you and others to work on code together without having to copy files around and overwrite each other's changes in the process (yes, this is heavily simplified). It also enables you to push your code to one or more remote locations (git is meant to be decentralized, remember?) where others can access your changes and push theirs in turn.

Commits

Every change you record in the repository is a commit, and you can give every commit a description.
This is the commit message.
Whenever you look at a change to the repository, the commit message is what is supposed to give you context for that change.

Git also comes with a neat feature, $ git blame <file>, which annotates each line of a file with information about who last changed that line, when, and in which commit.
Most editors and IDEs nowadays can show these annotations inline, including the commit message, and let you diff versions directly within the editor. You do not have to switch to your terminal and back again, and you can reason about why a line or block of code is the way it is.
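
To make the annotation idea more concrete, here is a rough sketch of how a tool might read blame data. It parses text in the shape that git blame --line-porcelain <file> emits: a header line with the commit hash and line numbers, some key/value lines, and the source line prefixed with a tab. The helper name parseBlame and the sample input are made up for illustration, the sample is abbreviated (real output carries more fields, such as committer and filename), and 40-character SHA-1 hashes are assumed.

```javascript
// Hypothetical helper: turns `git blame --line-porcelain` text into one
// record per source line. Only a few header fields are picked up here.
// Assumes 40-character SHA-1 commit hashes.
function parseBlame(porcelain) {
  const records = [];
  let current = null;

  for (const line of porcelain.split('\n')) {
    const header = /^([0-9a-f]{40}) \d+ (\d+)/.exec(line);
    if (header) {
      // A new entry starts with "<hash> <original line> <final line>".
      current = { commit: header[1], lineNumber: Number(header[2]) };
    } else if (line.startsWith('\t') && current) {
      // The source line itself is prefixed with a tab and closes the entry.
      current.code = line.slice(1);
      records.push(current);
      current = null;
    } else if (current) {
      // Everything in between is "<key> <value>", e.g. "author Mark".
      const space = line.indexOf(' ');
      if (space === -1) continue; // flags such as "boundary" carry no value
      const key = line.slice(0, space);
      if (key === 'author' || key === 'summary') {
        current[key] = line.slice(space + 1);
      }
    }
  }
  return records;
}

// Abbreviated sample input, not real repository output.
const sample = [
  'f7fb3f38aa67699bf815a8e28455ac541a8a0e52 1 1 1',
  'author Mark Turboprop',
  'summary Split up getCats into house cats and wild cats.',
  '\tasync function getCats() {',
].join('\n');

console.log(parseBlame(sample));
```

Note that the summary field only carries the first line of the commit message, which is one more reason to make that first line count.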

What can we use that for?

Reason about why code you see is as it is

A possible scenario

As projects grow and time goes on, we tend to forget about certain parts of them. Maybe we're switching between different projects regularly, or our code base is just so huge that we never touch everything within a single task. Most of the time we work in a team, and everyone is contributing to the code base.

At some point we might stumble upon some code that we or someone else wrote some time ago. It may happen by accident, or it may happen on purpose, because we debugged that REST API that does not return all the cats we would expect and landed at the exact function we now try to find our bug in:

async function getCats() {
  const cats = await fetchHouseCats();
  return cats.map(cat => Object.assign({}, cat, {type: 'house cat'}));
}

Why are we only fetching house cats? What about the wild cats? At that moment, we can only speculate why that function is implemented as it is. When we look further, we see another function:

async function getWildCats() {
  const cats = await fetchWildCats();
  return cats.map(cat => Object.assign({}, cat, {type: 'wild cat'}));
}

This function seems better named, as its name reflects what it actually does: it fetches our wild cats, gives them the appropriate type, and returns them for further use.

Let's take a look at our git history and try to find out if this is the first iteration of that code or if it looked different in the past.
And as we do so, we get to see this old version of the code:

async function getCats() {
  const houseCats = await fetchHouseCats();
  const typedHouseCats = houseCats.map(cat => Object.assign({}, cat, {type: 'house cat'}));

  const wildCats = await fetchWildCats();
  const typedWildCats = wildCats.map(cat => Object.assign({}, cat, {type: 'wild cat'}));

  return typedHouseCats.concat(typedWildCats);
}

By using $ git blame cats.js (or our editor equivalent) we find out that Mark changed that portion of the code last Thursday. It's Monday now, and he is on vacation and can't be reached.
The commit message reads:

commit f7fb3f38aa67699bf815a8e28455ac541a8a0e52
Author: Mark Turboprop <mark@yours.com>
Date:   Thu Aug 22 15:57:12 2019 +0200

    Split up getCats into house cats and wild cats.

The change was made on purpose, that's for sure. It seems, however, that he neither renamed the old getCats function nor changed its usage in our /cats REST endpoint.
By following the usages of getWildCats we find out that there are indeed two new APIs, /housecats and /wildcats, each using one of the functions. But Mark has not created a new function that combines the two so that /cats still works.

As we do not have any further information and Mark isn't reachable for the next three weeks, we have to dig further. We try to find the ticket/issue Mark worked on, which takes us some time, and find out that there was indeed a feature request for individual APIs while keeping the functionality of the old one.

To fix our bug, we rename getCats to getHouseCats and implement a new function getCats that combines the two to make /cats work again. Then we make sure that all three APIs are now working correctly and call it a day.
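
The fix described above could look roughly like the sketch below. The two fetch functions are stand-ins with made-up data, since the real ones are not shown here, and running both lookups concurrently with Promise.all is a choice of this sketch that assumes they do not depend on each other.

```javascript
// Stand-ins for the real data access functions, which are not shown in the
// article. They return fixed, made-up data so the sketch can run on its own.
async function fetchHouseCats() {
  return [{ name: 'Tom' }, { name: 'Garfield' }];
}

async function fetchWildCats() {
  return [{ name: 'Simba' }];
}

// Formerly getCats: renamed so the name says what it returns.
async function getHouseCats() {
  const cats = await fetchHouseCats();
  return cats.map(cat => Object.assign({}, cat, { type: 'house cat' }));
}

async function getWildCats() {
  const cats = await fetchWildCats();
  return cats.map(cat => Object.assign({}, cat, { type: 'wild cat' }));
}

// New getCats: combines both lists so that /cats returns all cats again.
async function getCats() {
  const [houseCats, wildCats] = await Promise.all([getHouseCats(), getWildCats()]);
  return houseCats.concat(wildCats);
}

getCats().then(cats => console.log(cats));
```

House cats still come first in the result, as they did in the old version of getCats.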

How we could have saved a lot of time

We were able to fix that bug and make our users happy again, but it took us more time than it should have.
First we had to make assumptions about why things were changed, and then we had to switch to the browser and make a lengthy search through our tickets until we finally found the actual reason for why that particular code was changed.

A first improvement to make our life easier in the future would be to include a reference to the ticket in whatever system we use to track our tasks. The commit could then have looked like this:

commit f7fb3f38aa67699bf815a8e28455ac541a8a0e52
Author: Mark Turboprop <mark@yours.com>
Date:   Thu Aug 22 15:57:12 2019 +0200

    CATS-1234: Split up getCats into house cats and wild cats.

With that reference at the start of the commit message, we would have been able to go directly to our system of choice, open the story, take a look at what was requested, and then compare it to what was implemented.
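
A reference in a fixed position also makes it easy for tooling to pick it up. As a minimal sketch, assuming ticket keys look like CATS-1234 and are followed by a colon, a helper could extract them like this. The function name and the pattern are assumptions for illustration, so adjust them to whatever your tracker uses.

```javascript
// Hypothetical helper: pulls a leading ticket reference such as "CATS-1234"
// out of a commit headline. The KEY-123 shape is an assumption of this sketch.
function extractTicketRef(headline) {
  const match = /^([A-Z][A-Z0-9]*-\d+):\s/.exec(headline);
  return match ? match[1] : null;
}

console.log(extractTicketRef('CATS-1234: Split up getCats into house cats and wild cats.'));
// prints CATS-1234
console.log(extractTicketRef('Fixed some stuff'));
// prints null
```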

Depending on the type of developer you are, this might be enough for you, and that's okay. It just has to work for you and the ones you work with.

I, personally, prefer to have as much information as possible directly at my hands, without the need to switch between different applications.

Git also lets you write commit messages that span more than one line.
If you are using bash, you can do it like this:

$ git commit -m 'This is the first line
this is the second line
this is the third line
...
'

or, if your command line does not allow multi-line strings in single quotes, you can do it like this:

$ git commit -m "Head line" -m "Content"

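to add more detail to your commit message. Git turns each -m value into its own paragraph, so the second one ends up in the body of the message. We'll get to why and what in a second.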

A third option is to configure your editor of choice for git commit messages with

$ git config --global core.editor "vim"

while replacing vim with whatever editor suits you best.

From then on you can just

$ git commit

and edit the actual commit message within that editor.

Now that we can add more information, we can give ourselves and others more insight to draw on in the future, when necessary.

A scheme I like to use is the following:

commit f7fb3f38aa67699bf815a8e28455ac541a8a0e52
Author: Oliver Jumpertz <you@me.com>
Date:   Thu Aug 22 15:57:12 2019 +0200

    <ticket-ref>: <head-line describing shortly and as best as possible what was done>
    <blank-line>
    <what-you-actually-did> because of <reason-why-you-did-it> which leads to <outcome>
    <blank-line>
    <possible-drawbacks-or-implications>


Here is an example:

commit f7fb3f38aa67699bf815a8e28455ac541a8a0e52
Author: Oliver Jumpertz <you@me.com>
Date:   Thu Aug 22 15:57:12 2019 +0200

    PROJ-9876: Changed the sorting implementation of getTrucks to a stable one.

    Changed the sorting implementation of /trucks from QuickSort to MergeSort
    because trucks with identical horse power but different ids (added later)
    could change position in the resulting JSON between calls, and some legacy
    clients frequently run into errors with their response caching when the
    positions of already known trucks change. The new sorting should produce
    stable results and thus fix the problem with legacy clients.

    Memory usage in peak times might go up a little and should be monitored.

What we now have is three types of information.
First, we have a headline that states which issue we worked on and briefly describes what was done.
If that's not enough, there is a more detailed description block that readers can turn to when necessary.
The third block shows that whoever made the change was aware of its implications and thought about them. It can also be used as a note for things we may have to do afterwards.
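
If you want to keep messages consistent across a team, the three blocks can be assembled by a small script. The following is only a sketch of that idea: buildCommitMessage and its parameter names are made up, and treating the third block as optional is an assumption of this sketch.

```javascript
// Hypothetical helper that assembles a commit message following the scheme
// described above. All parameter names are made up for this sketch.
function buildCommitMessage({ ticketRef, headline, what, reason, outcome, drawbacks }) {
  const blocks = [
    `${ticketRef}: ${headline}`,
    `${what} because of ${reason} which leads to ${outcome}`,
  ];
  // Assumption: the third block is optional, as not every change has
  // known drawbacks or implications.
  if (drawbacks) {
    blocks.push(drawbacks);
  }
  // Blocks are separated by one blank line.
  return blocks.join('\n\n') + '\n';
}

const message = buildCommitMessage({
  ticketRef: 'PROJ-9876',
  headline: 'Changed the sorting implementation of getTrucks to a stable one.',
  what: 'Switched /trucks from QuickSort to MergeSort',
  reason: 'legacy clients failing when known trucks change position',
  outcome: 'a stable order between calls.',
  drawbacks: 'Memory usage in peak times might go up a little and should be monitored.',
});

console.log(message);
```

The resulting text could be written to a file and handed to git commit -F <file>, or simply pasted into the editor that git commit opens.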

Create a change log for your next release

As we have now greatly improved the quantity and quality of the information within our commit messages, it's relatively easy to generate a changelog just from the commit messages.
If you use the template up there, just extract the middle section:

[...]
<what-you-actually-did> because of <reason-why-you-did-it> which leads to <outcome>
[...]

and you can pretty accurately say what was done since the repository was last tagged for a release.
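
Extracting the middle section can be sketched in a few lines. The helper below assumes you already have the full commit messages as strings (for example collected from git log for the range since your last release tag) and that blocks are separated by blank lines, as in the template. The name changelogEntries and the sample messages are made up for illustration.

```javascript
// Hypothetical helper: given full commit messages written in the
// three-block scheme, returns the middle block of each as a changelog line.
function changelogEntries(messages) {
  return messages
    .map(message => message.trim().split(/\n\s*\n/))
    // Skip commits that only have a headline.
    .filter(blocks => blocks.length >= 2)
    // Join wrapped lines of the middle block into a single line.
    .map(blocks => blocks[1].split('\n').map(line => line.trim()).join(' '));
}

// Made-up sample messages: one follows the scheme, one does not.
const messages = [
  'PROJ-9876: Changed the sorting implementation of getTrucks to a stable one.\n' +
    '\n' +
    'Switched /trucks from QuickSort to MergeSort because of caching errors\n' +
    'in legacy clients which leads to a stable order between calls.\n' +
    '\n' +
    'Memory usage in peak times might go up a little and should be monitored.',
  'Fixed some stuff',
];

for (const entry of changelogEntries(messages)) {
  console.log(`- ${entry}`);
}
```

Commits like "Fixed some stuff" contribute nothing to the changelog here, which shows once more how little they tell anyone.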

Conclusion

Whether you are doing development for a living or just as a hobby, always think of your future self and the people you work with. No one wants to spend too much time on trying to find the information they need or making assumptions about something when someone could just have told them.

Top comments (4)

Mark Adamson • Edited

I personally like commits to be fairly cheap to do. At the same time, I like people to generally use PRs to make changes as it is then clear exactly which commits made up a particular change. There's a balance and personal preference element as always.

Something that would really help me is if there was an easy way to find out what PR(s) a particular commit was part of. Then the blame could lead to the PR with the larger context. Any idea if that's possible in GitHub or other? I suppose if you have a git tool that shows you the commit in a tree then you could probably find the merge commit which then has the PR link in its name

Oliver Jumpertz • Edited

If you're using GitHub, it's pretty easy to do.

With a context switch to the browser:
github.com/<username>/<projectname>/commit/<hash>
(as an example take this: github.com/expressjs/express/commi...)
If you go to the commit within the express js repository, you will notice that it says master (#3778) 4.17.1 4.17.0 in the header of that commit. #3778 is the Pull Request through which this commit found its way onto the master branch. :-)

If you want to stay in your local environment it gets a little trickier but is still doable!
Disclaimer: I do not remember exactly where I got this from. I found it somewhere some time ago, so note that it's not my own.
These are two of the aliases I use (~/.gitconfig):

[alias]
    find-merge = "!sh -c 'commit=$0 && branch=${1:-HEAD} && (git rev-list $commit..$branch --ancestry-path | cat -n; git rev-list $commit..$branch --first-parent | cat -n) | sort -k2 -s | uniq -f1 -d | sort -n | tail -1 | cut -f2'"
    show-merge = "!sh -c 'merge=$(git find-merge $0 $1) && [ -n \"$merge\" ] && git show $merge'"

git find-merge <commit-hash-you-want-to-track> will give you the hash of the merge commit that the commit you are searching for was part of, and git show-merge <commit-hash> will directly show all information about that merge commit.
Is this something that could help you?

Mark Adamson

That's great, thanks for those tips, I shall try both next time

Oliver Jumpertz • Edited

Great!
Just msg if there's something not working. :-)