DEV Community

Stacie Taylor

Git: Are you an over-committer? Squash those commits!

Photo by Christin Hume on Unsplash

Are you an over-committer? Me too. Basically, anytime I complete some kind of task, I commit it with a message that details what I’ve done. I do this for a few reasons:

  • I’ve read over and over again: “Commit a lot, commit often”
  • I have kids and work at home. As I’m working, my kids could need me immediately at any second, and I want to make sure my work is saved.
  • Those crazy kids also sometimes get access to the computer and get happy fingers while my projects are open, so I want to make sure I’ve got it all backed up.
  • I often find myself working in short stints and having lots of commit messages helps me retrace my steps and offers me a sort of progress report.

I was recently told by someone who was reviewing my code that I commit too frequently. He suggested, “If you feel that it is helpful for you [to make incremental commits], I would recommend you check out squashing commits with Git.”

After researching what squashing commits even meant, I realized it was the perfect way for me to have my cake and eat it too! It allows you to make as many smaller commits locally as you feel necessary, based on your own preference and workflow, and then squash them down into one clean commit so that your remote repo’s commit history is nice and tidy!

Below, I’ve laid out a step-by-step guide on how to squash commits that can be followed by any Code Newbie!

Steps to squash your commits:

WARNING! Before you start, keep in mind that you should squash your commits BEFORE you ever push your changes to a remote repository. If you rewrite your history once others have made changes to it, you’re asking for trouble… or conflicts. Potentially lots of them!

  • Create and check out a new branch where you will write your code: git checkout -b [your_branch_here]
  • As you work on your project, make as many commits as you’d like!
  • Find out how many commits you want to squash! You will need the number of commits so that you can tell Git how far to go back during your rebase. The command below lists the commits on your branch, excluding any that are already reachable from master, so you can count how many you’d like to squash: git log [your_branch_here] --not master
  • Run an interactive rebase. In the sample below, replace X with the number of commits you want to squash. This will rebase you X commits back: git rebase -i HEAD~X
  • The text editor will pop up and list your commits, oldest first. In Vim, press i (or Shift + i) to enter insert mode. Replace pick with squash on every commit except the first one in the list. Make sure the first commit is still preceded by the pick command. This tells Git to squash all of your more recent commits into your first commit.
pick first_commit
squash second_commit
squash third_commit
squash fourth_commit

# Your text editor will give you a bunch of other instructions and options here. Check them out. There is some really helpful information.
  • esc to escape insert mode.
  • :wq to save your changes. (VIM)
  • Another text editor screen will pop up and allow you to write a new commit message that will sum up all of your previous commit messages. Comment out all of the previous commit messages and write your new one.
# This is a combination of X commits.
# The first commit’s message is:
# First commit message

# This is the 2nd commit message:
# Second commit message

# This is the 3rd commit message:
# Third commit message

# This is the 4th commit message:
# Fourth commit message

New commit message here

# Git will list some other instructions and the changes you’re committing here.
  • :wq to save and quit. (VIM)
  • You can double check that it worked by running git log again. You should see only your new single commit!
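
The steps above can also be run end to end in a throwaway repository. What follows is a minimal sketch, not part of the original walkthrough. It assumes a POSIX shell with Git and sed installed; the branch name feature, the file notes.txt, and the commit messages are made up for illustration. Instead of editing the todo list by hand, it points GIT_SEQUENCE_EDITOR at a sed command that changes pick to squash on every line after the first, and GIT_EDITOR=true accepts the combined commit message unchanged.

```shell
set -eu

# Throwaway repository so nothing real is touched.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.email "demo@example.com"   # placeholder identity for the demo
git config user.name "Demo User"
git commit -q --allow-empty -m "Base commit"
base=$(git symbolic-ref --short HEAD)      # "master" or "main", depending on your Git setup

# Create and check out a working branch.
git checkout -q -b feature

# Make several small commits.
for n in 1 2 3 4; do
  echo "step $n" >> notes.txt
  git add notes.txt
  git commit -q -m "Small commit $n"
done

# Count the commits that are on the branch but not on the base branch.
count=$(git rev-list --count feature --not "$base")
echo "Commits to squash: $count"

# Interactive rebase, scripted: keep "pick" on the first todo line and
# turn every later "pick" into "squash". GIT_EDITOR=true keeps the
# combined commit message as Git proposes it.
GIT_SEQUENCE_EDITOR="sed -i.bak '2,\$s/^pick/squash/'" \
GIT_EDITOR=true \
git rebase -i "HEAD~$count"

# Only one commit should now sit on top of the base branch.
git log --oneline feature --not "$base"
```

In day-to-day use you would drop the two editor variables and edit the todo list and the commit message yourself, exactly as described in the steps.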

There are multiple variations of this process and countless workflows and commit preferences among us. I am really interested in how you commit and how you squash (if you do). Please share your experiences so that we can all become more productive and efficient developers and team players!

Discussion (18)

Dave

This depends heavily on working practices for the rest of the group, but we also have cake and eat it.

A feature branch is a single developer's responsibility. I might create a feature branch, ask someone else for help with something, and they make a commit by way of assisting, but the branch belongs to me.

The last commit before I raise an MR/PR is important, because we use GitLab, and the last commit becomes the MR's default title and description. So we have a template for that, but no one needs to use a template for a commit unless they're sharing the work with others.

Then, the reviewer(s) do the review, sometimes the commit history is useful, other times it's simply ignored. Just before hitting "merge" the reviewer makes sure the "squash commits" box is ticked.

Then once in a while, each developer runs git gc --prune (or as I alias it, git gcp).

I would posit that if a reviewer cares primarily about the number of commits, they're looking in the wrong place.

Also, we have a hard rule for legacy code - format the section of code first, and commit that, then fix the bug/write the feature, and commit that separately. This way, if you've changed 1 line in a 3000 line file, the change is easier to see by comparing commits.

Michiel Hendriks

I was recently told by someone who was reviewing my code that I commit too frequently.

And what exactly is the problem with that?

I think squashing commits is generally a really bad idea. You are destroying history.

Cooper Riehl

I don't disagree with you, but for some teams, there is merit to keeping the commit history "clean". Perhaps their workflow involves looking at the commit history often, in order to diagnose issues or find information.

Ideally, they would redesign their workflow to avoid these issues. However, sometimes we run up against deadlines that require us to fit solutions into the current workflow, rather than spending the time to train the team on a new workflow + rewrite the existing tooling to use the new workflow.

It's not an ideal solution, but it is certainly a solution. One that may actually be effective for some teams.

Michiel Hendriks

Perhaps their workflow involves looking at the commit history often, in order to diagnose issues or find information.

Which is exactly the reason not to squash commits. Being able to dig through the smaller increments exposes a wealth of information.

A good book on this subject is Your Code as a Crime Scene and the followup Software Design X-Rays.

There is only one place where I use squash commits, and that is for work on CI/CD pipelines, as a lot of these systems (e.g. GitLab) provide no way to test before committing and pushing.

Cooper Riehl

Thank you for the links, they seem very useful and I'll take a look at them tomorrow!

I guess I should clarify my original point with some context. I once worked at a company where everyone pushed directly to prod, and the primary workflow for diagnosing issues was "look at the commit history until you find the most relevant description". For some developers, having a long commit history was a major issue, because they couldn't find what they were looking for in a timely manner.

Would it have been easier for us to resolve problems with better source control management and training? Absolutely. But the majority of developers there had very little git experience, and some didn't even understand the concept of "local branches" or "pull requests". And yet, our product was successful, and was making money.

In this scenario, it was (sadly) more efficient for us to consolidate all of our changes into a single "feature commit". Changing our workflow, and teaching the new workflow to everyone at the company, simply required more overhead than was deemed worthwhile. I hated that process, and there's a reason I no longer work there, but the fact remains that the company is still successful despite poor source control practices. For companies like that, who value functionality over best practices, squashing commits can be an effective shortcut to improve their workflow.

I agree with you, that having a detailed commit history provides significant value and should always be the go-to. But some teams continue to thrive without following best practices, and articles like this can still be useful to them, even though they would be better served by improving the root issues in their workflow.

Michiel Hendriks

Nothing wrong with pushing directly to main line. In fact, that's what I prefer and is basically required if you want to practice Continuous Integration.

Looking at the commit history to find the most relevant description isn't really the best approach to diagnosing issues. It's the second or third step. You should locate where the problem occurs, and only then look at the history. Just because somebody changed a part of the code that now causes a problem doesn't mean that this change introduced the problem; it might just have uncovered a new one.

With small, frequent, ACID-style commits and properly descriptive commit messages, you can see why each change was made and possibly the reasoning behind it. That gets lost with a squash commit, because then you only learn which major thing was changed, not why the smaller parts changed. What was changed is visible from the changed code anyway.

In software development, recording the why is more important than recording the what, because the latter is already solved.

To summarize your and my point:

If too many commits is a problem, then you might be working wrong.

Squashing commits does not solve this problem; it just tries to hide it under a rug.

Jesse Phillips

I recently put this cheatsheet together exactly to emphasize this type of workflow.

It is nice that you provide some of the vi commands as that would be the default editor.

One recommendation I have though is to consider branches as owned by a developer. These can be handed off or collaborated on but it should be expected that without explicit handoff the owner may rewrite history.

Vim tip, using ciw will "Change inside word" which is good to use for changing "pick"

Julien Maury

edit: love rebase -i
I have quite the same point of view as you, but sometimes I see a bad practice with that. Some developers keep the code locally and squash it over and over to make it perfect. It's not as safe as it seems, your computer can crash and you might lose your work.

Valts Liepiņš

A merge-request-oriented workflow even allows you to push the untidy commits to the remote repository, on a separate branch dedicated to the MR. That means your commits will be backed up on the remote repository, and the branch can still be safely rebased, as long as you have an agreement with coworkers not to work on other people's branches. Once the MR is ready to be merged, it can be squashed and remain as a clean, single commit on the main branch!

Nicolò Rebughini

One thing I do to avoid counting commits is to use git rebase -i main (replace main with the name of your main branch). It also helps keep a feature branch in sync on top of other people's work.
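
A minimal sketch of that approach, assuming a POSIX shell with Git and sed installed. The branch name feature, the file names, and the commit messages are invented for the demo, and the two editor variables only stand in for the manual editing you would normally do. The main branch moves forward while the feature branch is in progress, and a single git rebase -i against it both squashes the feature commits and replays them on top of the newer work.

```shell
set -eu

repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.email "demo@example.com"   # placeholder identity for the demo
git config user.name "Demo User"
git commit -q --allow-empty -m "Base commit"
main_branch=$(git symbolic-ref --short HEAD)   # "main" or "master"

# Three small commits on a feature branch.
git checkout -q -b feature
for n in 1 2 3; do
  echo "feature step $n" >> feature.txt
  git add feature.txt
  git commit -q -m "Feature commit $n"
done

# Meanwhile, someone else's work lands on the main branch.
git checkout -q "$main_branch"
echo "other work" > other.txt
git add other.txt
git commit -q -m "Other people's work"

# No counting needed: the todo list contains every commit that is on
# feature but not on the main branch.
git checkout -q feature
GIT_SEQUENCE_EDITOR="sed -i.bak '2,\$s/^pick/squash/'" \
GIT_EDITOR=true \
git rebase -i "$main_branch"

git log --oneline --graph --all
```

After the rebase, the feature branch holds one commit that sits directly on top of the main branch's latest commit.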

Melvyn Sopacua

We in the industry as a whole should really stop advocating one approach as the "new hot stuff" (and your title implies this), especially knowing that there will be people using this without understanding what they are actually doing. And then suddenly you are looking at the git history for a file and see the same change popping up twice with other commits in between, because a long-lived branch was merged into the feature branch between two squashes.

Now, me - the bug hunter - can't make sense any more of what happened to the file and what was missed. Combine this with commit messages that only describe WHAT was done, not WHY (I can read the what in git log -p thank you) and you're at a loss why Jack "corrected an error" that looked to be totally legit and is now causing issues downstream in different parts of the code. In that squash are tons of other files and changes, that I now need to consider as part of the problem, instead of the more isolated commits the bug was part of. The timeline is also not as a real rebase would be, especially when people are made afraid of rebase, so they merge development into feature, but then squash as well "cause it makes history more concise" or whatever the reasoning is. Very simply put: if you squash 40 commits, how useful is git bisect to isolate a problem?
Again - it boils down to knowing how git works, how version control works, what tools it has. Next to your IDE, it's the number one tool you use, yet you'll be hard pressed to find devs who can use it for more than just a way to share and back up code.

Another - totally different thing to consider:
If you're a remote working freelancer, have the courtesy to not squash. It hides commit times. So there's no way for me to verify if you actually worked the time you invoiced and a good CFO will ask me that question.

Lucas Perez

Also, just to add a tiny info, instead of going HEAD~n to get the n last commits, you could pick the hash of the commit you want to go to.
Let's imagine that I have these 4 commits:

abc123    Last commit I made
def456    Yet another message
ghi789    Another message
jkl012    A commit message

If I want to rebase the last 3 commits, I could pick the hash ghi789 and execute the rebase command like this:

git rebase -i ghi789^

We have to put the ^, otherwise git will pick all the commits but the one indicated by the hash.
Alternatively, you could pick the hash of the commit that came immediately before the one I want, in this case, jkl012, so this would also work:

git rebase -i jkl012

In the end it's all the same and we should do whatever we like and are used to, but I like knowing multiple ways of doing the same thing! 😅
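
A quick way to convince yourself that the two forms name the same starting point is git rev-parse. This is only a sketch, assuming a POSIX shell with Git installed; the repository is a throwaway one, and the variables third_from_top and oldest stand in for the hashes ghi789 and jkl012 from the example above.

```shell
set -eu

repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.email "demo@example.com"   # placeholder identity for the demo
git config user.name "Demo User"

# Four empty commits, oldest first, mirroring the list above.
for msg in "A commit message" "Another message" "Yet another message" "Last commit I made"; do
  git commit -q --allow-empty -m "$msg"
done

third_from_top=$(git rev-parse HEAD~2)   # plays the role of ghi789
oldest=$(git rev-parse HEAD~3)           # plays the role of jkl012

# All three lines print the same hash: the parent of the oldest commit
# you want to rebase, which is what git rebase -i expects as its argument.
git rev-parse "$third_from_top^"
git rev-parse "$oldest"
git rev-parse HEAD~3
```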

Lucas Perez

I don't think I like squashing all of the commits. In fact, I like many small commits, but if for any reason this is an issue for your team, you could squash just a few of them and keep a smaller but still descriptive history. The rebase -i is extremely flexible for that (:

Se-ok Jeon

Thx for this! This is really what I wanted. Helped A LOT.
Can I translate this post into Korean? If you don't mind, I want to share this awesome post in Korean. Of course, there will be a link pointing to this original post.

Raphaël Pinson

A simple approach is to use pull requests and choose to squash them instead of merging. This way, you get one commit per feature without having to squash manually.
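
Hosting platforms expose this as a button or checkbox on the pull request, and their exact behavior may differ. A rough local equivalent is git merge --squash, which stages the combined changes of a branch without creating a commit, so that you commit once yourself. The sketch below assumes a POSIX shell with Git installed; the branch name feature, the file name, and the commit messages are invented for illustration.

```shell
set -eu

repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.email "demo@example.com"   # placeholder identity for the demo
git config user.name "Demo User"
git commit -q --allow-empty -m "Base commit"
main_branch=$(git symbolic-ref --short HEAD)   # "main" or "master"

# Three small commits on a feature branch.
git checkout -q -b feature
for n in 1 2 3; do
  echo "feature step $n" >> feature.txt
  git add feature.txt
  git commit -q -m "Feature commit $n"
done

# Back on the main branch, bring in the feature as one commit.
git checkout -q "$main_branch"
git merge --squash feature                 # stages the changes, does not commit
git commit -q -m "Add feature (squashed)"

# The main branch gains exactly one commit; the feature branch keeps
# its three original commits untouched.
git log --oneline
```

Because the feature branch itself is not rewritten, the detailed history stays available there for as long as the branch is kept.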

MiguelMJ

Thank you! Really useful for an over-committer like me 😄

Comment marked as low quality/non-constructive by the community. View Code of Conduct
Victor Yarema

This is just the worst recommendation ever.

Florent Bonamis

This is the worst argumentation I have ever read ;)