Shubham Jain

Clean Git History is Kinda Overrated

An important thing I learned early in my career was that best practices should be followed with a pinch of salt. There are so many nuances to software development that a rule followed religiously doesn't take long to bite you in the back. It's not easy to realize that. Often, you read about an idea, you are sold on it (without considering its wider implications), and the next thing you know, you're married to the idea.

Utility-first CSS is an example of this. Web developers are tied to the idea of 'separation of concerns' so much that a framework like TailwindCSS is generally perceived as the worst possible way to organize CSS. But if you set that concern aside for a moment, it doesn't take long to see the benefits. It clicked for me because I had seen the struggle to modularize CSS and to avoid repeating the same styles again and again.

I am always on the lookout for things we accept similarly and clean Git history appears to be one of them.

How do I define a clean Git history? It's one where each commit is carefully separated and each change is meticulously described, where you try to make sure that each change has its proper place (often using interactive rebase).

Don't get me wrong. Clean Git history is useful and sometimes a necessity. In an open source project, if not for clearly scoped commits, releases would be a pain and so would collaboration.

I am not advocating that your commit should be a massive change with a message "bug fixes."

And I am specifically talking about proprietary software development which has fundamentally different dynamics than open-source.

What I am bothered by is the lengths developers sometimes go to in order to make sure commits are 'perfect'. I see them interactive rebasing, editing commits, and moving them one above the other. Or, more wastefully, taking lines of code out of one commit and adding them to another, often leading to conflicts. The next thing you know, you have wasted thirty minutes fixing your rebase. I see them fretting over how they should commit code so that the history appears like an elegant evolution of the feature.

Such efforts don't pass my rule of thumb for judging any best practice: did the time it saved exceed the time it took? A clean Git history, while it sounds great in theory, is not really that useful for most teams, simply because you don't look at it that often.

It has never been a big factor for me in trying to diagnose an issue or understanding part of the codebase. What has invariably mattered more is the current state of the codebase. When the code is clean and properly commented (though only where it needs to be), I have never felt the need to go through the history.

Yes, it's useful to be able to just revert a commit if you break the build, and it's easier to do something like git bisect, but a) it happens more rarely than you think, and b) it's not a deal breaker if that option isn't available.
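For anyone who hasn't used it, here is a minimal sketch of the idea behind git bisect: a binary search over the history for the first commit that introduced a problem. This is not Git's actual implementation. It assumes a linear history, ordered oldest to newest, in which the newest commit is bad and every commit after the first bad one is also bad. The function name, the commit names, and the `is_bad` check are all hypothetical.

```python
from typing import Callable, Sequence, Tuple


def first_bad_commit(
    commits: Sequence[str], is_bad: Callable[[str], bool]
) -> Tuple[str, int]:
    """Return the first bad commit and how many commits had to be tested.

    Assumes `commits` is ordered oldest to newest, the newest commit is
    bad, and every commit after the first bad one is bad as well.
    """
    lo, hi = 0, len(commits) - 1  # commits[hi] is assumed to be bad
    tested = 0
    while lo < hi:
        mid = (lo + hi) // 2
        tested += 1
        if is_bad(commits[mid]):
            hi = mid  # the first bad commit is at mid or earlier
        else:
            lo = mid + 1  # the first bad commit is after mid
    return commits[lo], tested


# Hypothetical history of 100 commits; the bug arrived in commit c037.
history = [f"c{i:03d}" for i in range(100)]
culprit, tested = first_bad_commit(history, lambda commit: commit >= "c037")
print(culprit, tested)
```

With a hundred commits, the search needs at most seven checks. It only ever narrows things down to a single commit, though, so how much it tells you depends on how large that commit is.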

I have no reason to believe that the time I have spent abiding by strict rules about committing, and solving the Git-related problems those rules caused, was anything but a pointless exercise.

If, let's say, I have to completely redesign a website, it's vastly better that I finish it off in one go than throw away my time thinking about how to make my commits smaller.

Granted, this might be truer for rapidly moving software teams than, perhaps, Google or Amazon. But, having worked with a software team of 75+ people, I don't think it'll ever be a cause for a crisis unless the team size is exponentially larger. Even then, just a few rules around committing are more than enough.

To reiterate, I am not disregarding everything related to committing changes. My point is only this: don't go overboard. Make meaningful commits with good messages, but you can do without unnecessary effort towards perfection.

  1. If you've created a big change that's too hard to break into individual changes, you're not committing a crime by creating a single commit and describing the changes in the description.
  2. Moving lines of code from one commit to another is time-consuming. It should be avoided as much as possible.
  3. If you need to make changes suggested in a code review, it's okay to create a fresh commit rather than trying to edit the old one using rebase.
  4. Commit description is a great feature to describe a change in more detail, but you should rarely be writing a whole thesis there.

What engineering teams need isn't strict rules, but guidelines. When the team is forgiving around what needs to be followed, it can move at a better pace without compromising too much.

Top comments (3)

Mark Erikson • Edited

I'll have to disagree strongly here.

I am firmly convinced that having a readable semantic Git history, with commits broken into relatively small well-named pieces instead of "big bang" everything-at-once commits, is vital to the long-term maintainability of a project.

Being able to quickly identify when, why, and how a given chunk of code was changed, and see how that code has evolved over time, has been critical as I've tried to track down bugs and understand the design decisions and context behind codebases.

A good commit message should include an issue tracker ID tag, a good short description, and a much longer descriptive section if appropriate.

Devin Rhode

I don't think the author would disagree with any of this. I think the author is more opposed to what I term "improving history".

I think everyone should know how to rebase, resolve conflicts, and use git absorb --and-rebase to erase trivial typos from history.

It may be best to spend the extra time keeping history clean in open source projects. But then again, devs in open source can, in theory, always be reached later on with questions about a line of code they wrote.

I have to agree that the actual code's readability is more important.

What we really want is arbitrary comments on any commit sha/line, and complete and utter freedom to edit commit messages before merging to master. (The git notes subcommand almost works, but should be unified with GitHub line comments.)

Les Orchard • Edited

Yeah, I'd really disagree with this. Clean git commits have come in handy for me over the years for any project that lasts more than a year, has more than 100 commits, or is touched by more than 3 people.

Multiple times even just this past week, I've used history and blame (really unfortunate name, btw) on a work project to see when & why things happened in a particular file and who I can ask about it. Commits where unrelated changes are tangled or where a unit of related changes is spread across several commits make it harder to do that detective work months after the fact. Comments in code are great, but deleted comments tell no tales.