
and the second top voted question on StackOverflow is...

Tomer Ben David ・7 min read

If you open https://stackoverflow.com/questions?sort=votes you will see the top voted questions on StackOverflow.

https://thepracticaldev.s3.amazonaws.com/i/9gg2c7r1wphf84q24hhq.png

It's not hard to notice that 3 of the top 4 voted questions on StackOverflow are about git!

This may mean one of two things: git is hard and unintuitive, or git is one of the most commonly used technologies. Most likely it's a combination of both.

When you try to understand git, you notice that to get what it wants out of you (or what you want out of it) you need to understand the underlying infrastructure. This is mainly because you need to get accustomed to its API, which is tightly coupled to its underlying building blocks. This is the source of most of the confusion: the API (in our case, the CLI) is something that some say is a leaky abstraction on top of its underlying model.

Enough with the ranting, let's analyze the top voted question for git.

The question is: how do I undo the most recent commit in git?

Or in a picture:

The user says they have committed but haven't pushed yet.

Why do you think this is perplexing in git?

  1. You might expect an API with revert/undo/checkout into a different commit.
  2. It's committed: do you undo with a new commit? Can you go back in time?
  3. Git says everything is stored forever, so how do you really undo?
  4. Can you just fix the most recent commit (there is a command for that)?
  5. What is the command to go back in history?
  6. There is always this warning about changing and undoing stuff that was already pushed. Are we at risk?

Even if some of the above APIs exist, there are too many options. Which one should you choose? Which is your best shot?

Let's review the first answer

The first answer suggests using the git reset command. The best way to think about git reset is that it asks the HEAD pointer to move across the git history graph.

That was too many terms: HEAD, reset, pointer, graph.

In git we commit.
Each commit adds to the history; it's a commit in time, right?
History can be described as a graph, meaning each commit points to another commit (its parent commit, in the case of a non-merge commit), and thus we have a directed acyclic graph, yet another term.

So the answer asked the user to go back in time. Isn't that what they wanted? The answer told them to do this:

git reset HEAD~

We have moved to the past!

HEAD points to the current branch location, and ~ means minus one: the parent of the commit that HEAD currently points to.

So by using the reset command we are telling git:

Git, whatever you are pointing to right now (HEAD), point one place back, in other words to the previous commit.

And git happily does this for us.

Note that as we didn't specify any flags (like --hard) to the reset command, it will not change anything in our working directory. The default mode (--mixed) moves the branch that HEAD points to and resets the index (the staging area), so the undone change shows up as an unstaged modification.

As we wanted to undo something in our current working directory, and we just moved HEAD back to before that change, our working directory is now cluttered with the change we wanted to undo.

So just manually undo that change locally in your working directory and make another commit.
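To see this for yourself, here is a minimal sketch in a throwaway repository. The file name notes.txt, the commit messages and the Demo identity are made up for illustration; only the git commands themselves come from the discussion above.

```shell
set -e
repo=$(mktemp -d)                         # throwaway directory, safe to delete
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master   # pin the branch name regardless of local defaults
git config user.name "Demo"               # hypothetical identity, local to this repo
git config user.email "demo@example.com"

echo "good line" > notes.txt
git add notes.txt
git commit -q -m "first commit"
first=$(git rev-parse HEAD)

echo "bad line" >> notes.txt
git commit -q -am "commit we regret"

git reset -q HEAD~                        # no flag given, so the mode is --mixed

after=$(git rev-parse HEAD)               # where the branch tip is now
status=$(git status --porcelain)          # short status of index and working tree
last_line=$(tail -n 1 notes.txt)          # what is still on disk
echo "HEAD: $after"
echo "status: $status"
echo "last line: $last_line"
```

The branch tip is back at the first commit, while notes.txt still contains the unwanted line and is reported as modified but not staged. That is the "clutter" you then clean up by hand before committing again.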

A few words about the HEAD

We said that HEAD is pointing to the last commit in the currently checked out branch. As gitglossary tells us:

HEAD
The current branch. In more detail: Your working tree is normally derived from the state of the tree referred to by HEAD. HEAD is a reference to one of the heads in your repository, except when using a detached HEAD, in which case it directly references an arbitrary commit.

I see, I see. So HEAD normally points to the current branch, except when it doesn't! And when it doesn't, it points directly to a commit.
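The two cases from the glossary can be told apart from the command line. A small sketch, again in a throwaway repository with a made-up identity and a single empty commit:

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master   # pin the branch name regardless of local defaults
git config user.name "Demo"               # hypothetical identity
git config user.email "demo@example.com"
git commit -q --allow-empty -m "only commit"

# Normal case: HEAD is a symbolic reference to a branch.
attached=$(git symbolic-ref -q HEAD)

# Detached case: HEAD references a commit directly.
git checkout -q --detach HEAD
if git symbolic-ref -q HEAD >/dev/null; then detached=no; else detached=yes; fi
head_file=$(cat .git/HEAD)                # now a raw commit id, no "ref:" prefix
commit=$(git rev-parse HEAD)

echo "attached HEAD was: $attached"
echo "detached now: $detached"
echo ".git/HEAD contains: $head_file"
```

git symbolic-ref -q exits with a non-zero status when HEAD is not a symbolic ref, which makes it a handy test for a detached HEAD in scripts.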

But what would be the parent of that commit?

As we have moved our HEAD one commit into the past, any commit that was in the future of that point is no longer reachable from our branch, assuming we make new commits on top of that HEAD~ parent commit. This means we are not only appending to the git history, we are changing it: we take a parent (in our case HEAD~) and, with a new commit, give it a different child than it had. For example, if we move one commit into the past with HEAD~ and start committing from there, the commit at the original HEAD (which was the child of HEAD~) no longer appears in the standard history log. So if anyone else had that original HEAD commit and used it to create new commits (new children for it, HEAD + 1 you could call it), it would create problems (of course, assuming we share our history rewrite with them).

So after we do git reset and go back one commit in our local repo, we usually make local fixes in our working directory and then commit. This is a new commit, but it has the same parent as the commit we are fixing, so that parent's children may differ between us and others who already have a snapshot of the repository from before our change.

So if other users already have this old commit of ours, which we have just "detached" from the branch, and we force our rewritten history into the remote repo, they will end up having to deal with "Recovering from upstream rebase" and other nasty stuff.

We can push that to the main repository, but be careful: this would mess things up for other people. What would happen when they try to pull and find that their parent is different from yours?
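The "same parent, different child" situation can be reproduced locally. This is only a sketch with empty commits and invented messages; it doesn't involve a remote, it just shows why the old and new histories disagree.

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master   # pin the branch name regardless of local defaults
git config user.name "Demo"               # hypothetical identity
git config user.email "demo@example.com"

git commit -q --allow-empty -m "base"
git commit -q --allow-empty -m "original child"
old_child=$(git rev-parse HEAD)

git reset -q --soft HEAD~                 # go back one commit
git commit -q --allow-empty -m "replacement child"
new_child=$(git rev-parse HEAD)

old_parent=$(git rev-parse "$old_child^") # parent of the commit we abandoned
new_parent=$(git rev-parse "$new_child^") # parent of the commit we just made

# Is the old child still part of master's history?
if git merge-base --is-ancestor "$old_child" HEAD; then reachable=yes; else reachable=no; fi

echo "old child: $old_child (parent $old_parent)"
echo "new child: $new_child (parent $new_parent)"
echo "old child reachable from master: $reachable"
```

Both children share one parent, but only the new one is on master. Anyone who built on the old child now has history that your branch no longer contains, which is exactly the conflict that appears after a force push.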

And the answer finishes with "you could have just done git commit --amend". Amen to that, but that is yet another answer that we will investigate in another post.

Not all is lost

One last note: if you want to undo the git reset, you can use the reflog, which keeps a log of every position HEAD has been at. Just as you used git reset to go back in time, you can use the reflog to undo your going back in time. This can be achieved with:

git reset 'HEAD@{1}'

Which tells git: hey git, remember I asked for a reset? Well, now I want you to reset to the point in time just before I did that reset.
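A quick round trip shows the safety net in action. As before, this is a sketch in a disposable repository with empty commits and a made-up identity:

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master   # pin the branch name regardless of local defaults
git config user.name "Demo"               # hypothetical identity
git config user.email "demo@example.com"

git commit -q --allow-empty -m "first"
git commit -q --allow-empty -m "second"
tip=$(git rev-parse HEAD)

git reset -q HEAD~                        # go back in time
went_back=$(git rev-parse HEAD)

git reset -q 'HEAD@{1}'                   # undo the going back, via the reflog
restored=$(git rev-parse HEAD)

echo "original tip: $tip"
echo "after reset:  $went_back"
echo "restored:     $restored"
```

The quotes around HEAD@{1} keep the shell from interpreting the braces. Running git reflog at any point lists the recorded positions, so you can pick an older entry than @{1} if you have done more than one operation since.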

Summary

We have learned to go back in time with git reset. Apparently, this is the second most voted question on StackOverflow.

Appendix A - Some Practice

Would you like to "prove" some of the above statements? After all, we said many things about HEAD, branches, and commits. Let's see. We have a local git repository, so let's print HEAD; it's just sitting there waiting for you to print it.

# Step 1: Let's see what HEAD is, let's print the HEAD file

$ cat .git/HEAD # => HEAD is a file in .git directory - yeah on the base dir, let's print it.
ref: refs/heads/master # => So head is simply this line of text, this looks like a branch, let's print it.

# Step 2: HEAD --> master => OK so let's see what is master file

$ cat .git/refs/heads/master  # => Now we are printing what head points to, it should be the commit of the branch..
a15d580cc90d47a88f7f971914d45ff5a0e30eef # => So this is the commit which master points to.  But how do we know this number is a commit?

# Step 3 : Print the commit content, after all it was pointed indirectly by HEAD
$ git cat-file -t a15d580cc90d47a88f7f971914d45ff5a0e30eef
commit # => Yes git is saying this SHA-1 is a commit.  Was not persuaded yet? How about this:
$ git cat-file -p a15d580cc90d47a88f7f971914d45ff5a0e30eef
tree 7d80e5c527e9a1ec7f79f68386ce9710f1e048ce # It makes shadow like a commit.
parent ddf47ffcb19e2aee4839cae40e79fd7579fc637f # It has parents like a commit.
author Tom <tomer@email.com> 1537007909 +0300 # It has an author like a commit
committer Tom <tomer@email.com> 1537007909 +0300 # It has a committer.. like a commit

my commit message # => It talks like a commit.

# So its a commit! :)

# Step 4 : Did HEAD point to the tip of branch?
$ git log --oneline # => is the commit a15d580... really the head?
* a15d580 - (HEAD -> master) test (2 days ago) # => Yes a15d580 is indeed our latest commit where head points to!
* ddf47ff - (develop) added file to folder (9 days ago)
* 9d0e101 - hi (10 days ago)

Basically the picture is like this:

HEAD graph

What we see in the above picture is the result of the bash commands above:

  1. HEAD points to the current branch tip.
  2. Our current branch ref points to a commit in our branch; in our case it's the latest.
  3. The commit points to a tree.
  4. A tree points to a list of blobs and trees (and those trees in turn point to lists of trees and blobs; a tree is a directory).
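The chain in the list above (commit, then tree, then blobs and sub-trees) can be walked with the same cat-file command plus git ls-tree. A sketch, with hypothetical file names top.txt and dir/nested.txt:

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master   # pin the branch name regardless of local defaults
git config user.name "Demo"               # hypothetical identity
git config user.email "demo@example.com"

mkdir dir
echo a > top.txt
echo b > dir/nested.txt
git add .
git commit -q -m "snapshot"

commit_type=$(git cat-file -t HEAD)               # the object HEAD resolves to
tree_type=$(git cat-file -t 'HEAD^{tree}')        # the object that commit points to
entries=$(git ls-tree HEAD | awk '{print $2, $4}')  # type and name of each top-level entry
blob=$(git cat-file -p HEAD:dir/nested.txt)       # contents of a blob inside the sub-tree

echo "$commit_type -> $tree_type"
echo "$entries"
echo "nested blob: $blob"
```

The directory shows up as a tree entry and the file as a blob entry, which matches point 4: a tree is git's representation of a directory.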

And thus, when we ask git to move to the previous commit with git reset --soft HEAD~, we are asking that our current branch, which HEAD points to, should point to the previous commit. That's all.

Let's do the reset to one previous commit

# Step 5 : Do the reset and see its effect
$ git reset --soft HEAD~ # Git, please move HEAD to point to one previous commit.

# Step 6: Now what is the reset effect on HEAD
$ cat .git/HEAD
ref: refs/heads/master # => Didn't move! It still points to the same place, master.
$ git log --oneline # => The ref file itself would hold only a hash, so let's ask git log where master is now.
ddf47ff (HEAD -> master, develop) added file to folder # => Aha, so the master branch pointer did move, and HEAD simply points to our branch as the diagram shows.
9d0e101 hi
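If you'd rather check steps 5 and 6 in a script than by eye, here is a sketch under the same assumptions as before (throwaway repository, empty commits, made-up identity). It uses git rev-parse to read the branch instead of cat on the ref file, since git may store refs packed in .git/packed-refs rather than as individual files.

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master   # pin the branch name regardless of local defaults
git config user.name "Demo"               # hypothetical identity
git config user.email "demo@example.com"

git commit -q --allow-empty -m "older"
git commit -q --allow-empty -m "newer"

head_before=$(cat .git/HEAD)              # the HEAD file before the reset
parent=$(git rev-parse HEAD~)             # where we expect master to land

git reset -q --soft HEAD~

head_after=$(cat .git/HEAD)               # the HEAD file after the reset
master_now=$(git rev-parse refs/heads/master)

echo "HEAD file before: $head_before"
echo "HEAD file after:  $head_after"
echo "master now at:    $master_now"
```

The HEAD file is byte-for-byte the same before and after; only the commit that master resolves to has changed.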

Summary

With Appendix A we have seen that we can dig into git and see what HEAD is (and not only by definition), what a branch is, and the effect of git reset. It's one thing to read the docs, and it's a real blessing that we can actually see it in the .git directory; otherwise we would just need to trust the docs, and that would be, so to say, not that helpful.


Discussion


Simplest answer is to use a git gui like sourcetree and look at its console. No guesswork. No fear.

 

No pain - no gain (no fun, no profit)

 

If Sourcetree can do the work easily, then why would we want to take the pain of using the git terminal? Anything special about the terminal?

It might be faster, more precise, and you might already be in the terminal.

 

Another great article about git on dev dot to.

Every statement was easy for me to understand except "this past commit would be the parent of our current commit". I don't get what the current commit is here. If we have moved to the last but one commit, isn't it the current commit? It's like saying the current commit is the parent of itself.

I would really appreciate it if someone could shed light on this.

 

Thanks, and sorry for that section, it was indeed not clear! Allow me to try again:

As we have moved our HEAD one commit into the past, any commit that was in the future of that point is no longer reachable from our branch, assuming we make new commits on top of that HEAD~ parent commit. This means we are not only appending to the git history, we are changing it: we take a parent (in our case HEAD~) and, with a new commit, give it a different child than it had. For example, if we move one commit into the past with HEAD~ and start committing from there, the commit at the original HEAD (which was the child of HEAD~) no longer appears in the standard history log. So if anyone else had that original HEAD commit and used it to create new commits (new children for it, HEAD + 1 you could call it), it would create problems (of course, assuming we share our history rewrite with them).

So after we do git reset and go back one commit in our local repo, we usually make local fixes in our working directory and then commit. This is a new commit, but it has the same parent as the commit we are fixing, so that parent's children may differ between us and others who already have a snapshot of the repository from before our change.

 

So I tried to understand what the tilde ~ symbol means in git and found a good article that explains it.

git caret and tilde.

In short, HEAD points to the current commit, and HEAD~1 points to the commit before it (its parent). HEAD~ is shorthand for HEAD~1.

what the current commit is here?

Always HEAD points to the current commit.

If we have moved to the last but one commit isn't it the current commit?

Yes. Now HEAD points to the commit (which was last but one earlier)

It's like saying the current commit is the parent of itself.

No. Always HEAD~ points to the first parent commit.

The article explains the usage of ~ symbol a bit more.

 

Thank you Tomer. It now makes sense for me.

 

git reset --hard HEAD, I cannot remember how many hundreds of thousands times I used this snippet and it saved me from disaster.

Great post!

 

Newcomers, think twice before using git reset --hard

 

Good idea mentioning reflog in this article—I never felt free and safe enough to play with the darker corners of git on production code until I found out that:

  1. git reflog will remember previously committed states no matter what you do to rewrite history (by default for 90 days, or 30 days for entries no longer reachable from the current tip).
  2. If nothing else you can do a checkout to any of those hashes in reflog's output and hard reset / force push it as your new branch tip.

I know a force push sounds like bad advice to mention in any newbie git article, but the above operations were exactly the safety net that I (a decades-long user of CVS, TLA, and Hg) needed to really start getting creative and map out a mental model for git.

 

Two (better) options IMO:

'git commit --amend' to fix the message

'git revert' to undo a commit change set

Why do I think these are better? No rewriting of the history! No need to understand HEAD, graph theory, or the other abstractions. And they maintain the forward only design paradigm of git.

 

Git "gives off a smell" when you try to do something that should be easy. All I wanted to do was roll back a changeset that is sandwiched between other commits. My favorite solution to date is to branch off of Master and recreate the release branch without the offending commit you wanted to rollback.

 

Use sourcetree and cherrypick it

 

Wouldn't amend also distort commit history and mess up collaborators' repository?

 

Yeah, amend replaces the most recent commit with a new one, so we get a new SHA-1 for that commit. That is a problem if the recent commit was already pushed to the remote repo and other users checked it out and built on it. My instinct was that yes, amend is "reusing" git reset, but for safety I checked the git documentation, and here is what it has to say:

git commit --amend is a rough equivalent for:

$ git reset --soft HEAD^
$ ... do something else to come up with the right tree ...
$ git commit -c ORIG_HEAD

but can be used to amend a merge commit.

You should understand the implications of rewriting history if you amend a commit that has already been published. (See the "RECOVERING FROM UPSTREAM REBASE" section in git-rebase[1].)

 

My solution, been using it for years

alias undo_commit='git reset --soft HEAD~ && git reset HEAD'