Git extract single file with history

Kamal Mustafa
Python/Django Developer at GetOTP (otp.dev), fast and simple to integrate OTP API with ready made UI

We have one Python module that we want to split into its own repo as we start using it in more than one project.

The first result when I searched for "git extract file with history" was a Stack Overflow question. But it didn't really work when I tried it: all the branches and tags were still retained. And since the approach is "filter out everything you don't want and keep what you want", it's quite slow on a large repo with a very long history.

Someone also created an extension called [git-splits][2], inspired by another SO question. But that extension seems to be more about extracting a directory than a single file.

Side note: one thing I like about git extensions is that you just create an executable called git-something and add it to your $PATH, and it will then be available as git something. That's an example of a super simple extension API. I've created a command called git chlog that shows a pretty changelog for a given tag.
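
To make the extension mechanism concrete, here is a minimal sketch. The command name git-hello and the temporary directory are made up for illustration; this is not the git chlog command mentioned above.

```shell
# Create a throwaway directory to hold the custom command.
bindir="$(mktemp -d)"

# Any executable named git-<name> found on $PATH can be run as "git <name>".
# git-hello is a hypothetical name used only for this example.
cat > "$bindir/git-hello" <<'EOF'
#!/bin/sh
echo "hello from a custom git command"
EOF
chmod +x "$bindir/git-hello"

# Put the directory on $PATH and call the script through git.
PATH="$bindir:$PATH"
export PATH
output="$(git hello)"
echo "$output"
```

In real use you would drop the script into a directory that is already on your $PATH, such as ~/bin, instead of a temporary one.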

OK, back to extracting a single file. Finally I found this blog post. It uses the git am approach: generate a patch from the commits related to the file, then replay it in order to reconstruct the file from its initial state to the current state.

git log --pretty=email --patch-with-stat --reverse --full-index --binary -- src/project/pgq.py > /tmp/patch

We can then apply this patch to get a new pgq.py with its history from the original repo:

git init
touch setup.py
git add setup.py
git commit -m'Initial'
git am < /tmp/patch

OK, this is the not-so-good part. I had to commit a new file first because applying the patch failed halfway with this error:

/home/kamal/git/pgq/.git/rebase-apply/patch:38: trailing whitespace.

/home/kamal/git/pgq/.git/rebase-apply/patch:42: trailing whitespace.

warning: 2 lines add whitespace errors.
Applying: XT-596 cli function improved and tested
/home/kamal/git/pgq/.git/rebase-apply/patch:32: trailing whitespace.

/home/kamal/git/pgq/.git/rebase-apply/patch:47: trailing whitespace.

/home/kamal/git/pgq/.git/rebase-apply/patch:53: trailing whitespace.

warning: 3 lines add whitespace errors.
Applying: XT-596 fix ask_confirmation in pgq purge
Applying: fixed TypeError: not all arguments converted during string formatting
Patch is empty.  Was it split wrong?
If you would prefer to skip this patch, instead run "git am --skip".
To restore the original branch and stop patching run "git am --abort".

If I run git am --skip as suggested without an initial commit in the new repo, I get this error:

cat: /home/kamal/git/pgq/.git/ORIG_HEAD: No such file or directory

By adding an initial commit, I can get past this and run git am --skip repeatedly until all the patches have been applied. Of course the resulting history will be a little bit weird, as the first commit in this repo is dated much later than the first commit of the original history.
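
If you want to try the whole round trip before touching a real repo, here is a rough sketch on a throwaway repository. Everything in it is made up for illustration: the directory layout, the file contents, the commit messages and the dev@example.com identity. It uses the same git log and git am commands as above, including the initial setup.py commit.

```shell
set -e
work="$(mktemp -d)"

# Build a small source repo where only some commits touch pgq.py.
git init -q "$work/src"
cd "$work/src"
git config user.email "dev@example.com"   # placeholder identity
git config user.name "Dev"
echo "v1" > pgq.py
echo "other" > other.py
git add pgq.py other.py
git commit -q -m "add pgq and other"
echo "unrelated" >> other.py
git commit -q -am "touch other only"
echo "v2" >> pgq.py
git commit -q -am "update pgq"

# Export only the commits that touch pgq.py, oldest first, as a mailbox.
git log --pretty=email --patch-with-stat --reverse --full-index --binary \
    -- pgq.py > "$work/patch"

# Replay the mailbox in a fresh repo that already has one commit.
git init -q "$work/dst"
cd "$work/dst"
git config user.email "dev@example.com"
git config user.name "Dev"
touch setup.py
git add setup.py
git commit -q -m "Initial"
git am -q < "$work/patch"

# One initial commit plus the two commits that touched pgq.py.
commit_count="$(git rev-list --count HEAD)"
git log --oneline
```

The commit that only changed other.py is not replayed, and other.py itself never shows up in the new repo, because the path filter limits both the commits and the diffs to pgq.py.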

But other than that, it seems to work and I managed to get the file with all its history intact. If you have a better suggestion, feel free to share it in the comments.
