
Up your Git game and clean up your history

Christopher Kade on June 18, 2019

This post is aimed at people who want to learn how to use commands such as rebase and pick up a few tricks for a nicer Git experience. Al...
 

Rewriting history on solo projects is fine, but as soon as you work with another person, you should consider it extremely harmful to do so on non-local branches.

From git-scm.com/book/en/v2/Git-Tools-R...

Don’t push your work until you’re happy with it
One of the cardinal rules of Git is that, since so much work is local within your clone, you have a great deal of freedom to rewrite your history locally. However, once you push your work, it is a different story entirely, and you should consider pushed work as final unless you have good reason to change it. In short, you should avoid pushing your work until you’re happy with it and ready to share it with the rest of the world.

 

Correct me if I'm wrong, but I didn't get the impression from reading this article that the author suggests we should be editing the history of the master branch or any other shared branch. I'm more under the impression that it's about tidying up your own fix or feature branch shortly before it goes into master, and I do not see any harm or problem with that, even in a team.

On the contrary, a cleaner, more readable history that removes clutter, such as quick and dirty in-between commits and iterations that didn't end up in the final result, brings all contributions into a sensible timeline that is easier to follow.

Talking about big teams, here's the stance from Phabricator's development team, a development and collaboration tool that originated from Facebook and turned into an Open Source project:

A strategy where one idea is one commit has no real advantage over any other strategy until your repository hits a velocity where it becomes critical. In particular:

Essentially all operations against the master/remote repository are about ideas, not commits. When one idea is many commits, everything you do is more complicated because you need to figure out which commits represent an idea ("the foo widget is broken, what do I need to revert?") or what idea is ultimately represented by a commit ("commit af3291029 makes no sense, what goal is this change trying to accomplish?").

"One idea = one commit" means that the entirety of the fix or feature branch should be squashed into a single commit in the mainline branch: a single commit that is easy to apply and to revert if necessary. All in-between steps should be removed. The proposed solution to this issue is:

...squashing checkpoint commits as you go (with git commit --amend) or before pushing (with git rebase -i or git merge --squash), or having a strict policy where your master/trunk contains only merge commits and each is a merge between the old master and a branch which represents a single idea

Source
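
As a rough illustration of the "before pushing" option quoted above, here is a minimal sketch of `git merge --squash` in a throwaway repository. The branch name `feature/foo-widget`, the file name, and the commit messages are made up for the demo; it only assumes `git` is installed.

```shell
#!/bin/sh
set -eu

# Scratch repository; every name below is hypothetical.
repo=$(mktemp -d)
cd "$repo"
git init -q .
git symbolic-ref HEAD refs/heads/master   # use the branch name from this thread
git config user.name "Demo User"
git config user.email "demo@example.com"

echo "base" > app.txt
git add app.txt
git commit -q -m "Initial commit"

# A feature branch with two checkpoint commits.
git checkout -q -b feature/foo-widget
echo "widget, first try" >> app.txt
git commit -q -am "WIP: foo widget"
echo "widget, fixed" >> app.txt
git commit -q -am "Fix typo in foo widget"

# Back on master: stage the combined result of the branch, then commit it once.
git checkout -q master
git merge --squash feature/foo-widget >/dev/null
git commit -q -m "Add foo widget"

# master now holds the initial commit plus one commit for the whole idea.
git log --oneline
```

Reverting the idea later is then a single `git revert` of that one commit.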

Several other articles online make the same point: squashing and rebasing are the way to go, especially when you work in teams.

Editing the history is only problematic when the editor doesn't 100% know what they are doing, and even then, destructive changes can be prevented by simply setting up the git repository server to reject destructive changes to the shared mainline / development branches.

On GitHub you can set up branch protection rules to prevent any push --force to any branch of your choice from being accepted. Phabricator doesn't allow "destructive changes" to any branch by default. Bitbucket offers the same protection under "branch permissions".
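
For a self-hosted repository, one way to get a similar effect is Git's own `receive.denyNonFastForwards` and `receive.denyDeletes` settings on the server side. The sketch below uses a local bare repository as a stand-in for the server; the directory names and commit messages are assumptions for the demo, and hosted services expose their own settings screens instead.

```shell
#!/bin/sh
set -eu

root=$(mktemp -d)
cd "$root"

# A bare repository standing in for the shared server.
git init -q --bare server.git
git -C server.git config receive.denyNonFastForwards true   # refuse history rewrites
git -C server.git config receive.denyDeletes true           # refuse branch deletion

# A developer clone (hypothetical names throughout).
git clone -q server.git work 2>/dev/null
cd work
git symbolic-ref HEAD refs/heads/master
git config user.name "Demo User"
git config user.email "demo@example.com"

echo "one" > file.txt
git add file.txt
git commit -q -m "First commit"
git push -q origin master

# Rewrite the already-pushed commit, then try to force it onto the server.
git commit -q --amend -m "First commit, reworded"
if git push -q --force origin master 2>/dev/null; then
  force_push="accepted"
else
  force_push="rejected"
fi
echo "force push was $force_push"
```

Note that these two settings apply to every branch in the repository, which is coarser than the per-branch rules the hosted services offer.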

Practically speaking, there is nothing to worry about with proper repository configuration and the right education. And articles like this one are helpful.

 

I didn't get the impression from reading this article that the author suggests we should be editing the history of the master branch or any other shared branch

Absolutely, that was not my intention. If it wasn't clear, that's my bad.

On GitHub you can set up branch protection rules to prevent any push --force to any branch of your choice from being accepted.

Yeah, I don't see an instance where force pushing to master would be recommended, glad you mentioned these protections for others to know about.

And thanks for sharing some resources, I'm glad you commented 😄

 

Hi, yes, I agree there's nothing in the article that says rewriting master is a good idea. However, I felt it was important to state that there are problems. Rewriting history and rebasing are not silver bullets and can cause issues without proper procedure.

Admittedly, reading my comment back, it sounds like I've dismissed the whole article on that basis, but that was absolutely not my intention!

 

I totally agree with your statement, all these manipulations should be done locally before pushing your work. 😀

 

"Fixing" history is a misguided (and dishonest) effort--a waste of time. Rebase is a powerful git tool and should be learned and used appropriately, but maintaining a useful git log should be accomplished with forethought and disciplined adherence to good commit practices. That said, I'd rather have an imperfect but unmolested git history. A "clean" history is just a lie.

 

BS. A clean history represents the application growing step by step, not how you or your team work, so it isn't a lie. There's no sense in blaming anybody; it is only the log of the build from the first line to the current state.

 

Thanks for the article! It was a nice read/refresher. Since the audience is people wanting to up their Git game, I would suggest adding a note about the dangers of "force push" and maybe referencing "force push with care", which relies on --force-with-lease so people don't accidentally overwrite teammates' pushes. :)

Another noteworthy thing might be to git rebase --abort if things go unexpectedly sideways during a rebase (conflicts or other strange/unexpected behavior). It's nice to know, especially when getting started, how to back out of a command safely.
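
To make the "back out safely" part concrete, here is a small sketch in a scratch repository: two branches change the same line, the rebase stops on the conflict, and `git rebase --abort` puts the branch back where it started. Branch and file names are invented for the demo.

```shell
#!/bin/sh
set -eu

repo=$(mktemp -d)
cd "$repo"
git init -q .
git symbolic-ref HEAD refs/heads/master
git config user.name "Demo User"
git config user.email "demo@example.com"

echo "original" > notes.txt
git add notes.txt
git commit -q -m "Initial commit"

# Both branches rewrite the same line, which guarantees a conflict.
git checkout -q -b feature
echo "feature version" > notes.txt
git commit -q -am "Change notes on feature"

git checkout -q master
echo "master version" > notes.txt
git commit -q -am "Change notes on master"

git checkout -q feature
before=$(git rev-parse HEAD)

# The rebase stops on the conflict; abort restores the pre-rebase state.
if ! git rebase master >/dev/null 2>&1; then
  git rebase --abort
fi
after=$(git rev-parse HEAD)

if [ "$before" = "$after" ]; then
  echo "feature is back where it started"
fi

# If a rewritten branch does have to be pushed, the lease variant refuses
# to overwrite remote commits that you have not fetched yet:
#   git push --force-with-lease origin feature
```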

Thanks again!

 

Great points Nick, thank you for sharing them. I did not know of --force-with-lease.

I've added a section called "On the dangers of force pushing & other things to note" which mentions your comment.

Thanks again 🙂

 

Nice walkthrough. Rebasing is often painted as advanced, but I think it's best to play with it early on, and not fear it later.

Git rebase lets you remodel your history to your will. See it as a way to manipulate your list of commits on a given branch.
...
To do so, we'll use the interactive mode of git rebase, which lets us apply the rebasing with a nice interface.

Technically true, but I wouldn't phrase it like that. IMO, taken together these risk implying untrue things — that a branch's history can be arbitrarily rewritten with a plain rebase, that it's why rebase exists, and that --interactive just changes the UI.

I'd instead say:

git rebase allows you to replay the changes introduced in a set of commits on top of a specified base. It works by repeated cherry-picking (i.e. applying changes introduced by a commit on top of a different one). --interactive allows you to edit the changes before they're applied.

This leads to a special form of rebase — interactive rebase where the source branch is also the target — where you can arbitrarily rewrite a branch's history.

git-scm.com: Rewriting History
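
A minimal sketch of both forms described above, in a scratch repository with made-up names: first a plain rebase that replays a branch's commits onto a new base, then an interactive rebase that squashes them. To keep the sketch non-interactive, it scripts the todo list through `GIT_SEQUENCE_EDITOR` with `sed`, which is only a stand-in for editing the list by hand.

```shell
#!/bin/sh
set -eu

repo=$(mktemp -d)
cd "$repo"
git init -q .
git symbolic-ref HEAD refs/heads/master
git config user.name "Demo User"
git config user.email "demo@example.com"

echo "base" > base.txt
git add base.txt
git commit -q -m "Initial commit"

# feature: two commits that touch their own file.
git checkout -q -b feature
echo "one" > feature.txt
git add feature.txt
git commit -q -m "Add feature file"
echo "two" >> feature.txt
git commit -q -am "Extend feature file"

# Meanwhile master moves on.
git checkout -q master
echo "more" >> base.txt
git commit -q -am "Update base"

# 1. Plain rebase: replay feature's commits on top of master.
git checkout -q feature
old_tip=$(git rev-parse HEAD)
git rebase -q master
new_tip=$(git rev-parse HEAD)
base_after=$(git merge-base master feature)
echo "tip changed from $old_tip to $new_tip"

# 2. Interactive rebase, scripted: turn the second "pick" into "squash".
#    GIT_EDITOR=true accepts the proposed combined commit message.
GIT_SEQUENCE_EDITOR='sed -i.bak -e "2s/^pick/squash/"' GIT_EDITOR=true \
  git rebase -q -i master

# feature now carries a single commit on top of master.
git log --oneline master..feature
```

In everyday use you would run `git rebase -i master` and edit the todo list in your editor instead of scripting it.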


This very frequent scenario will have us rebase our second PR on master so that it gets the new code merged from the first PR.

Just to add context: rebasing works here, but merging master is likely better for PRs on a team.

  • if there are conflicts, you'll only have to fix them once, rather than repeatedly across commits
  • rebasing can cause reviewers to lose work
  • you can't assume that others haven't checked out a PR branch
  • it's the truth: a merge commit records what actually happened

I'd say only the first two are potentially serious, but I can't think of an upside that'd make it worth dealing with them.

 

Thanks for the post, very good practical examples to try rebasing.

I'm just curious how did you achieve this lonely commit? 😀
[image: lonely commit]

Example 3: rewording a commit

I think interactive rebase in this example is overkill.

git commit --amend -m "new message" will do the job.
But changing the message of an older commit such as HEAD~2 is impossible with amend, so interactive rebase with reword handles those cases perfectly.
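
A quick sketch of the amend route in a scratch repository (file name and messages are made up), with the interactive-rebase route for older commits noted in the comments:

```shell
#!/bin/sh
set -eu

repo=$(mktemp -d)
cd "$repo"
git init -q .
git config user.name "Demo User"
git config user.email "demo@example.com"

echo "hello" > readme.txt
git add readme.txt
git commit -q -m "Add raedme"            # typo in the message

# Replace the message of the most recent commit only.
git commit -q --amend -m "Add readme"

message=$(git log -1 --format=%s)
echo "$message"

# amend cannot reach older commits. To reword the commit at HEAD~2, run
#   git rebase -i HEAD~3
# and change "pick" to "reword" on that commit's line in the todo list.
```

Amending creates a new commit with a new hash, so the usual caution about already-pushed work applies here too.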

 

With pleasure, super glad it helped.

This last commit is just an automatic commit made to deploy to GitHub Pages.

[image: github pages]

A whole different thing, so don't worry about it!

Thanks for mentioning amend 😄

 

More often than not I've got one of those commit histories that look like the "OH LORD WHAT HAVE I DONE" graph, and even rebase is hard to use to clean up -- this happens most often when I have something that fixes one thing, but needed two or more changes. (I really should get into the habit of committing after each file change, even if it leaves HEAD broken.)

In those cases, I use a process of checking out master into a cleanup branch, git cherry-pick -n (i.e., don't commit) the changes from the branch I'm cleaning up onto the cleanup branch, and then git reset. Now all of the commits are squashed on a per-file basis. I then repeatedly use git add -p to build a new set of per-file commits that catch me up by replying y until the next change would go into a new file. Then commit and repeat until all the changes have been built into commits.
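
The process above could look roughly like this in a scratch repository. The branch names `messy` and `cleanup` and the file names are invented, and plain `git add <file>` stands in for the interactive `git add -p` so the sketch can run unattended.

```shell
#!/bin/sh
set -eu

repo=$(mktemp -d)
cd "$repo"
git init -q .
git symbolic-ref HEAD refs/heads/master
git config user.name "Demo User"
git config user.email "demo@example.com"

echo "a" > a.txt
echo "b" > b.txt
git add a.txt b.txt
git commit -q -m "Initial commit"

# A messy branch whose commits interleave changes to both files.
git checkout -q -b messy
echo "a1" >> a.txt
echo "b1" >> b.txt
git commit -q -am "Fix thing, part 1"
echo "a2" >> a.txt
git commit -q -am "oops"
echo "b2" >> b.txt
git commit -q -am "Fix thing, part 2"

# Start a cleanup branch from master and apply all of messy's changes
# to the index and working tree without creating commits.
git checkout -q -b cleanup master
git cherry-pick -n master..messy
git reset -q              # unstage everything; the changes stay in the working tree

# Rebuild the history one file at a time
# (interactive equivalent: git add -p, answering y until the next file).
git add a.txt
git commit -q -m "Update a.txt"
git add b.txt
git commit -q -m "Update b.txt"

git log --oneline master..cleanup
```

The end result has the same content as the messy branch, just grouped into one commit per file.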

It loses the commit order, but very nicely collects the changes into per-file changes. Hm. This might have been better as a post instead of a comment!

 

Bro, thank you. This article has been a great help for me today.
I messed things up with git reset HEAD~. Luckily, GitLens (a VS Code extension) had a copy of the files that vanished. After making the necessary changes, I did a rebase to fix up the previous commit as you mentioned. 👍

 

Hi Christopher,
Nice article, many thx!

If I may suggest, instead of fetching the remote parent branch, I usually use the git pull --rebase command.
For example, if I need to rebase the 'develop' branch onto master, I'll do git pull --rebase origin master.
This keeps me from forgetting to fetch the HEAD of master every time I need to rebase.

Another point: if you really need to rewrite the history of a project, I would suggest learning the git reflog command as well. It's really convenient for recovering a deleted commit, for example, or for undoing a botched rebase...
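
As a small illustration of the reflog suggestion, this sketch drops a commit with a hard reset in a scratch repository and then gets it back. File name and messages are made up.

```shell
#!/bin/sh
set -eu

repo=$(mktemp -d)
cd "$repo"
git init -q .
git config user.name "Demo User"
git config user.email "demo@example.com"

for n in 1 2 3; do
  echo "change $n" >> log.txt
  git add log.txt
  git commit -q -m "Commit $n"
done
lost=$(git rev-parse HEAD)

# "Accidentally" throw away the latest commit.
git reset -q --hard HEAD~1

# The reflog still remembers every position HEAD has had.
git reflog -n 2

# HEAD@{1} is where HEAD pointed before the last move, i.e. the lost commit.
git reset -q --hard "HEAD@{1}"
recovered=$(git rev-parse HEAD)

if [ "$lost" = "$recovered" ]; then
  echo "commit recovered"
fi
```

The reflog is local to your clone and its entries expire eventually, so it is a safety net rather than a backup.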

Thx again

 

Thanks for the great article! Is rebasing/squashing only useful for auditing your git history and/or making it easier to read?
I guess what I'm asking is - if I recommend doing this to my team, what are some of the practical applications I can mention to get them on board?

 

Hey man, great article and especially great title choice! I came from a dev.to mail and upon reading it I immediately thought: this is about rebasing.
The Table of Contents anchors don't work though, so I had to scroll manually.

 

Isn't the combo of git fetch and git rebase origin/master equivalent to git pull --rebase origin master?
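
One way to check this yourself is a side-by-side experiment. The sketch below uses a local bare repository as the remote and two clones with the same local commit: one runs `git fetch` followed by `git rebase origin/master`, the other runs `git pull --rebase origin master` (remote and branch as separate arguments). All directory names and the `setup_clone` helper are made up for the demo.

```shell
#!/bin/sh
set -eu

root=$(mktemp -d)
cd "$root"

git init -q --bare origin.git
git -C origin.git symbolic-ref HEAD refs/heads/master

# Hypothetical helper: clone origin.git into $1 and give it a demo identity.
setup_clone() {
  git clone -q origin.git "$1" 2>/dev/null
  git -C "$1" symbolic-ref HEAD refs/heads/master
  git -C "$1" config user.name "Demo User"
  git -C "$1" config user.email "demo@example.com"
}

setup_clone seed
(cd seed && echo "v1" > shared.txt && git add shared.txt \
  && git commit -q -m "Initial commit" && git push -q origin master)

setup_clone two_step
setup_clone one_step

# Upstream moves on after both clones were made.
(cd seed && echo "v2" >> shared.txt \
  && git commit -q -am "Upstream change" && git push -q origin master)

# Each clone gets the same kind of local commit.
for dir in two_step one_step; do
  (cd "$dir" && echo "local" > local.txt && git add local.txt \
    && git commit -q -m "Local work")
done

# Variant 1: fetch, then rebase onto the remote-tracking branch.
(cd two_step && git fetch -q origin && git rebase -q origin/master)

# Variant 2: a single pull with --rebase.
(cd one_step && git pull -q --rebase origin master)

upstream=$(git -C origin.git rev-parse master)
parent_two=$(git -C two_step rev-parse HEAD~1)
parent_one=$(git -C one_step rev-parse HEAD~1)
echo "upstream tip:        $upstream"
echo "fetch + rebase base: $parent_two"
echo "pull --rebase base:  $parent_one"
```

In this simple case both clones end up with their local commit sitting directly on top of the upstream tip.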

 
 
 

Many thanks for this excellent article, something I needed to get more comfortable with Rebase. Much appreciated!
