DEV Community

Git-Fu: merge multiple repos with linear history

Dmitry Yakimenko on February 15, 2019

The other day I invented myself a new headache: I wanted to merge a few libraries I've built over the years into one repo and refactor them toget...
 
Juan Pablo Almonacid

Hi Dmitry! I've enjoyed the post very much, particularly the use of the git filter-branch command. I recall having used it once a while ago to fix a typo in my email address (GIT_COMMITTER_EMAIL) on several commits from a couple of personal repos.
Beyond that, what I'd really like to know is why you needed to merge all the repos into one. Are those libraries related in some way? Have you considered using git submodule to add the repos as submodules of a parent project?
Thanks!

Dmitry Yakimenko

Thanks! I've used it a few times before myself, and every time I just pasted a code snippet from the docs or Stack Overflow and was done with it. This time I decided to dig deeper.

I would like to merge these libraries (not just the repos) because they share a bunch of code that, for historical reasons, got copy-pasted and modified many times. I would also like to harmonize their APIs and make them share even more code. Another approach would be to extract the shared part into its own library, but I find that too tedious, and that library would not be useful to anyone on its own. The submodules approach is not going to work in this case, because I'm going to move files around and do global refactoring, which wouldn't make sense within any single repo.

Eryk

Hey Dmitry,

Thank you for the great post and script.

I have a question about usage: We are trying to build a single master 'IT' repo that pulls all the tool/script/fix/setup/etc. repos together. For usability, sub-modules and sub-trees would be difficult to implement with our user base. We would also like to monitor the sub-repos so that when they are committed to, we can pull them and update the master 'IT' repo. I can build a daemon or a cron job to meet this need but the question I have would be regarding the state of the repo after merge.

Would I be able to use your script to keep an existing merged repo up to date so I could push it to our primary revision control server?

Dmitry Yakimenko

I have not tried it, but in theory the script should recreate the merged repo exactly the same way every time. So if you rerun it with updated repos, you should get an updated merged repo with the same checksums. In practice, things may not go so smoothly. It would also redo a lot of work on every run, so the merge process would be unnecessarily slow. It's better to modify the script to track the state and only apply new commits to the merged repo.
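
The state tracking could look roughly like the sketch below. It is only an illustration, not part of the original script: the JSON state file, the function names, and the idea of keying the state by source repo name are all assumptions. Each source repo's history is assumed to be available as a list of commit hashes, oldest first (for example from `git rev-list --reverse <branch>`).

```python
import json
from pathlib import Path


def load_state(path):
    """Return {repo_name: last_merged_sha}, or {} if nothing was merged yet.

    The state file format is hypothetical, chosen for this sketch.
    """
    p = Path(path)
    return json.loads(p.read_text()) if p.exists() else {}


def save_state(path, state):
    """Persist the last merged commit of every source repo."""
    Path(path).write_text(json.dumps(state, indent=2, sort_keys=True))


def new_commits(history, last_merged):
    """Return the commits in `history` that come after `last_merged`.

    `history` is a list of commit hashes, oldest first.
    `last_merged` is None when the repo has never been merged.
    """
    if last_merged is None:
        return list(history)
    if last_merged not in history:
        # The source history was rewritten; an incremental update is not safe.
        raise ValueError(f"commit {last_merged} not found in source history")
    return history[history.index(last_merged) + 1:]
```

On each run, only the commits returned by `new_commits` would be applied to the merged repo, and the state file would be updated afterwards. If a source repo's history has been rewritten, the sketch raises an error instead of guessing, and a full rebuild would be the fallback.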

Sorry about the late reply. Didn't see the email.

Eryk

Thank you for replying.

Harvey Thompson

In the future you might want to check out Reposurgeon - gitlab.com/esr/reposurgeon

However, I did almost exactly the same thing using git and scripts, incrementally improving at each step until I was done. Well worth it in the end.

Dmitry Yakimenko

I tried it out and couldn't figure out how to use it in a reasonable amount of time. I spent more time with it than with this git-only solution and didn't get even halfway to solving my problem. I posted my response here: dev.to/detunized/git-fu-reposurgeo...
Thanks for the tip anyway =)

Dmitry Yakimenko

Thanks for the tip, I've never heard of it. I'll check it out and see if I could do the same with that tool.

Schoaf

I got it running, but now I get this error:
error: commit xxxx is a merge but no -m option was given.
fatal: cherry-pick failed

I have no idea what the problem is here. Any ideas?
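
For context, git prints this error when `git cherry-pick` is given a merge commit: a merge has more than one parent, so git cannot tell which parent to diff against unless `-m <parent-number>` is passed. The sketch below shows one possible way to handle it, assuming the script cherry-picks commits one by one. The function names are hypothetical, and picking parent 1 as the mainline is an assumption that may not match every history. The number of parents can be read from the output of `git rev-list --parents -n 1 <sha>`, which prints the commit hash followed by its parent hashes.

```python
def count_parents(rev_list_line):
    """Count parents from one line of `git rev-list --parents -n 1 <sha>`.

    The line has the form "<sha> <parent1> <parent2> ...".
    """
    return len(rev_list_line.split()) - 1


def cherry_pick_args(sha, parent_count, mainline=1):
    """Build the argument list for cherry-picking `sha`.

    Merge commits (more than one parent) need `-m` to select the
    mainline parent; ordinary commits must not be given `-m`.
    """
    args = ["git", "cherry-pick"]
    if parent_count > 1:
        args += ["-m", str(mainline)]
    args.append(sha)
    return args
```

The resulting list can be passed to `subprocess.run`. Another option, if a linear history is the goal anyway, is to leave merge commits out of the list of commits to apply, for example with `git rev-list --no-merges`.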