
Git: Tracking Down a Committed File, with a Real-Time Ah-Ha Moment!

Noel Worden ・4 min read

I picked up another git nugget this week. The TL;DR is that you can see what files were included in a commit with:

git log --name-only

If you'd like the whole story, I'll need to set the scene a bit before I get into the good stuff. When I need to reset my database on this project, it's not repopulated with a seed file; instead, we use a copy of the production database. This gives us the most accurate data to query. Because of this, and how I set up my commands through Docker, I'll temporarily have a copy of the production database in my local repo, usually called prod_dump.

This file shows up as an untracked file in git, and every once in a while my muscle memory gets the best of me: I run a git add . by accident and that prod_dump file gets included in the commit. When it has happened before, I'd reset, remove it, re-commit, and move on with my day. But on this occurrence it wasn't that simple. I was working through some PR comments when it happened and had run a fixup on that particular commit, squashing it into another commit. By the time I realized what I had done, I didn't know which commit contained the addition of prod_dump.

It's a big file, ~500 MB. What clued me in to what had happened was an error I got when I tried to push up the reworked commits.

I had run fixup on four or five files in one go, and I couldn't think of an obvious way to trace back which commit contained the huge file. I also couldn't use GitHub, because the changes were now too big to push up. My first attempt was to simply delete the now-added prod_dump file and commit that. I hoped that git would treat adding and then removing the file as if it had never been added in the first place. I was wrong, but not surprised. Git still tracks that the file was added and then deleted, meaning that the push to GitHub would still include the file and be rejected.
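To see why deleting the file in a follow-up commit isn't enough, here is a minimal sketch in a throwaway repo. The file names, commit messages, and the tiny stand-in for prod_dump are made up for illustration; they are not from the actual project.

```shell
# Scratch repo so nothing here touches a real project.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example"               # local identity for this scratch repo only
git config user.email "example@example.com"

echo "code" > app.txt
echo "pretend this is ~500 MB" > prod_dump   # tiny stand-in for the real dump
git add .                                    # the muscle-memory mistake
git commit -q -m "Work on PR comments"

git rm -q prod_dump
git commit -q -m "Remove prod_dump"

# The file is gone from the current snapshot...
git ls-files

# ...but two commits still touch it (one adds it, one deletes it),
# and its contents are still an object reachable from the branch.
git log --oneline -- prod_dump
git rev-list --objects --all | grep prod_dump
```

Because the blob is still reachable from the branch, a push has to send it along with everything else.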

I ran git log, but it only showed the commit, author, date, and message. For some reason I went to VSCode instead of a search engine and started looking around. I don't usually use the VSCode interface for git, but I thought it might have something useful. I right-clicked on the prod_dump file and saw an option labeled Open Timeline.

Clicking on that did pretty much exactly what I expected, showing the git history of that file. Knowing which commit contained the file, I could take the commit I had made deleting the file and squash it into the commit where the file was added. As far as git was concerned, the adding of prod_dump never happened, and I could push the changes to GitHub.
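The squash described above could look something like the following. This is a sketch under assumptions, not a record of the exact commands I ran: the repo, file names, and commit messages are invented, and it uses git commit --fixup with git rebase --autosquash as one way of squashing the deletion into the offending commit.

```shell
# Scratch repo with the mistake buried in the middle of the history.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example"               # local identity for this scratch repo only
git config user.email "example@example.com"

echo "base" > README
git add README
git commit -q -m "Initial commit"

echo "code" > app.txt
echo "dump" > prod_dump                      # stand-in for the real dump
git add .
git commit -q -m "Address PR comments"       # prod_dump sneaks in here

echo "more" > other.txt
git add other.txt
git commit -q -m "Later work"

# Find the commit that added the file.
bad=$(git log --diff-filter=A --format=%H -- prod_dump)

# Delete the file and record the deletion as a fixup of that commit.
git rm -q prod_dump
git commit -q --fixup="$bad"

# Fold the fixup into its target. GIT_SEQUENCE_EDITOR=true accepts the
# generated todo list unchanged, so no editor opens.
GIT_SEQUENCE_EDITOR=true git rebase -q -i --autosquash "$bad^"

# Empty output: no commit on the branch touches prod_dump any more.
git log --oneline -- prod_dump
```

If the offending commit were the very first commit in the repo, "$bad^" would not exist and the rebase would need --root instead.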

Having solved the problem, but not liking the idea that I relied on a GUI to do it, I turned to the search engine. I was able to find the command in short order:

git log --name-only

This command presents a log, like before, but now includes the names of the files in each commit. It's covered in the git log docs.
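For anyone who wants to try the command without risking a real project, here is a small sketch. The scratch repo, file names, and commit messages are assumptions for illustration only.

```shell
# Scratch repo so nothing here touches a real project.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example"               # local identity for this scratch repo only
git config user.email "example@example.com"

# Three commits, one file each; prod_dump is a tiny stand-in.
for f in a.txt prod_dump b.txt; do
  echo "data" > "$f"
  git add "$f"
  git commit -q -m "Add $f"
done

# Each log entry is followed by the files that commit touched.
git log --name-only

# Compact variant: one "hash subject" line per commit, then its files.
git log --name-only --oneline
```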

The exact commits and PR this happened with have long since been merged in, but I set up a quick example so I could walk through it again with some visuals.

In the above example, git log --name-only shows four commits, each committing one file, with prod_dump in the second commit. If I were to replicate the fixup mistake I described earlier, I would get something that looks like this:

Now, git log --name-only shows that prod_dump is added in the first commit.

As I said earlier, this can also be achieved in VSCode. If you go to Explorer in the side bar and right-click the file, there are several options:

VSCode File Tree with Menu

Click Open Timeline and at the bottom of the side bar a Timeline will open, in this case showing that prod_dump was added with the first commit.

VSCode Timeline View
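A rough command-line counterpart to the Timeline view is to limit git log to a single path. The sketch below assumes a made-up scratch repo where prod_dump rode along with the first commit; --diff-filter=A narrows the history further to the commit that added the file.

```shell
# Scratch repo so nothing here touches a real project.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example"               # local identity for this scratch repo only
git config user.email "example@example.com"

echo "dump" > prod_dump                      # stand-in for the real dump
echo "one" > first.txt
git add .
git commit -q -m "First commit"              # prod_dump rides along here

echo "two" > second.txt
git add second.txt
git commit -q -m "Second commit"

# History of a single path, similar in spirit to the Timeline view.
git log --oneline -- prod_dump

# Only the commit that added the path, printed by its subject line.
added=$(git log --diff-filter=A --format=%s -- prod_dump)
echo "$added"
```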

It's funny how it has taken me this long to get to a point where I needed to track which commit contained a particular file. I guess this was the first time I didn't have GitHub to help with the process. There's always one more thing to learn when it comes to git.

The Ah-Ha Moment
About two minutes into starting this post I realized that I could save myself from this kind of hassle in the future if I just have git ignore the prod_dump file. I can do this either through the .gitignore file or, better yet, referring to my post from last week, by adding it to my personal git exclude file. Should have done it months ago.
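Here is a minimal sketch of the exclude approach, assuming the personal exclude file in question is the per-repo .git/info/exclude (a global excludes file would work similarly). The scratch repo is hypothetical.

```shell
# Scratch repo so nothing here touches a real project.
repo=$(mktemp -d)
cd "$repo"
git init -q
echo "dump" > prod_dump                      # stand-in for the real dump

# Shared alternative, committed with the project:
#   echo "prod_dump" >> .gitignore

# Personal rule: lives inside .git, so it is never committed or shared.
mkdir -p .git/info                           # normally created by git init already
echo "prod_dump" >> .git/info/exclude

# Shows which rule is responsible for ignoring the file.
git check-ignore -v prod_dump

# A reflexive "git add ." no longer picks the file up.
git add .
git status --porcelain                       # prints nothing
```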


This post is part of an ongoing This Week I Learned series. I welcome any critique, feedback, or suggestions in the comments.

