Hi there! If you're new to Git, or just don't feel that confident about using it in general, then I hope this post is of use to you. As you can already see, there's a big wall of text here without much in the way of illustrations yet. I'm big on over-providing context; it's kinda My Thing.
If you feel good enough at Git, just not at rebasing, you can safely skip down to Understanding this One Weird Trick about Branches.
Before we get going, I want to give a quick shout out to the book that helped me grasp Git concepts and fundamentals when I first read it 9 years ago:
If you buy that book, you can forget about this post entirely!
Also, a shout-out to front-end development extraordinaire Dan Cortes for reviewing this article and helping me polish the finer details. Thanks, Dan!
You may be someone who has had a particularly bad experience with Git rebase before, with those awful merge conflicts wasting your time and mental energy. Don't worry, I've got tips for making your rebase life easier at the bottom of the post.
If you're thinking, who cares, I'll just use merge it's easier, well ... fine. You're right. Just use merge and forget about this post. You'll be fine, your team and project will be fine, nothing terrible will happen if you don't learn how to rebase. However, I hope to encourage you to learn a little bit more about the fundamental tool of your work that is Git!
I've elaborated on this a bit more below, but one issue with always using merge is that it may result in a harder to follow commit history. Having a simpler timeline of changes can go a long way toward alleviating your efforts in the future. For example, if you wanted to isolate a set of changes it is a lot easier when the commits aren't mixed in with unrelated ones, which is what merging instead of rebasing can end up doing to you.
This post assumes you've had some experience with Git and most likely have tried rebase before and hated it and yourself afterward. But even so, I've used some terms in this post that I think are worth defining to avoid leaving beginners in the dust.
Parent branch
The branch you were on when you typed
git branch my-branch-name
# or, if you want to switch to
# your new branch at the same time:
git checkout -b my-branch-name
This is the branch you "rebase onto" and "merge into" as it pertains to your child branch.
Child branch
The branch you are working out of that you want to keep up to date with new changes from some parent branch (almost always the branch you originally stemmed off of in the first place).
Pull request
A common UI feature of source control web apps, used by teams to allow fellow devs to review your work before merging it into the parent branch.
Code review
Your team reviews the code that you're intending to merge via your pull request.
SHA, commit SHA, or SHA-1
The 40 character unique hash string that is made for each commit (ex: f4f78b319c308600eab015a5d6529add21660dc1). SHA-1 is actually the name of the algorithm that creates the hash, but devs commonly refer to the string itself as the SHA. Most services shorten these, when displaying them, to the first seven characters (ex: f4f78b3) because that is almost always unique enough within a single project. If you need to reference a commit SHA with some Git command, you can use the shortened version and Git won't bat an eye, as long as it is unambiguous in your repository.
head
The latest SHA / commit point of a given branch, tag, or other such Git reference object.
HEAD
The currently active head. So a Git repository has multiple heads, but one HEAD.
Merge conflict
When you and a teammate make different changes to the same lines of code, Git can't resolve the changes automatically. Rebase will stop and ask you to resolve the conflicts manually before continuing.
Deployment
A team-specific process for getting a desired version of the code onto a remote server environment such as staging (testing by internal staff) or production (seen by actual customers).
Please request any other terms used in the post that you'd like me to define, and I will add them!
I'd love to see more reasons in the comments, but in my experience, rebasing is all about keeping the commits in your branch together and on the top of the commit history for pull request / code review time.
If you merge when a rebase would be more appropriate, it creates these false merge points in the commit history. When you look back in the list of commits you may not easily be able to pick out only the merge points from approved pull requests, for example.
Rebasing keeps commits in a logical order and doesn't mix them from one teammate's work all up with yours when your two branches are merged in to the main parent. This makes it a lot easier to follow the history of project features and code changes.
Basically, whenever there are new changes on the parent branch and your own child branch hasn't been merged back in yet. Depending on the size of your team and how active they are, you may have to do it every day or multiple times a day. The more often you rebase, the less frequently you will have trouble with merge conflicts, and the less likely you will be to work on code that became outdated since you created your branch.
Ultimately, this is something you can trust yourself to feel out over time and develop your own intuition for.
It helps me to think of merge and rebase as commanding different directions for the changes to go. Merge goes from child to parent: the child merges into the parent. Rebase goes from parent to child: the child rebases onto new commits found on the parent (or on an entirely different branch!).
The first thing that I believe will help you understand rebasing better is: in Git, branches are just text files that say which commit SHA is the newest one for that branch. While Git is rebasing, it uses this trick to figure out how to, as you may have seen before, "rewind" your branch and "replay" its commits from a new point.
Git rebase makes it as if you had branched from the newest commit on the parent, instead of that original commit you were at when you first created your branch. You are changing your branch's base commit, or, re-basing your branch.
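If you'd like to see the "branches are just text files" idea for yourself, here's a small sketch you can run in a throwaway directory. The repo location, file name, and user details are made up for illustration, and it assumes Git's default on-disk ref storage (in an older, busier repository the ref may have been packed into .git/packed-refs instead, so the file might not be there).

```shell
set -e

# Scratch repository in a temporary directory (all names here are hypothetical)
repo="$(mktemp -d)"
cd "$repo"
git init -q
git checkout -q -b main-branch
git config user.name "Example Dev"
git config user.email "dev@example.com"

# One commit so the branch has something to point at
echo "hello" > notes.txt
git add notes.txt
git commit -q -m "A"

# The branch is a small file holding a single commit SHA...
from_file="$(cat .git/refs/heads/main-branch)"
# ...and it matches what Git itself reports as the tip of the branch.
from_git="$(git rev-parse main-branch)"
echo "file: $from_file"
echo "git:  $from_git"

# HEAD is a file too; it names the branch you currently have checked out.
cat .git/HEAD
```

Both SHAs printed should be identical, and .git/HEAD should contain a line pointing at refs/heads/main-branch. Moving a branch, which is what rebase ends up doing, boils down to writing a different SHA into that little file.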
Okay. So you're at the latest point of your main branch (main-branch)
Here we see 3 commits at the top of the history, labeled A, B, C, and "main-branch" is pointing at commit C
Then you create your branch with
git checkout -b new-branch
and you see that Git points new-branch also at commit C
Over time, you add two more commits to new-branch, D & E
And while you were doing that, perhaps you switched back to main-branch, pulled, and found two new commits.
> git checkout main-branch
> git pull
# ... new commits arrive!
This isn't the only method to detect changes on the main-branch, but I don't want to distract you with that right now. (psst. hey. kid. c'mere)
So now it's time for the big moment. You check out new-branch again and run the rebase command.
> git checkout new-branch
> git rebase main-branch
...and Git does the following behind the scenes (more or less, this is over-simplified on purpose)
Git looks backward from each branch's head through each commit until it finds the first shared point between both branches (hence the arrows showing the relationship of the commits and the extra red arrows showing the search)
Git creates a hidden, temporary branch and points it at C, the part where you may have seen "rewinding" in the terminal
Now that Git knows all the missing commits from your new-branch, it points the temporary branch to main-branch's head, at G
Next, Git "replays" the commits from the new-branch from commit G, adding commits D2 and E2
Finally, because we magically had no merge conflicts
"cmon man, please help me learn how to do that part!" ... don't worry, that's coming soon (-:
Git discards the temporary branch and points new-branch to E2
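The whole walkthrough can be reproduced in a scratch repository. This is only a sketch under a few assumptions: the commit helper and file names are hypothetical, and since there's no remote to pull from, commits F and G are created locally on main-branch to stand in for the teammates' work that arrived with git pull.

```shell
set -e

# Scratch repository (hypothetical names; nothing here touches your real projects)
repo="$(mktemp -d)"
cd "$repo"
git init -q
git checkout -q -b main-branch
git config user.name "Example Dev"
git config user.email "dev@example.com"

# Hypothetical helper: one new file per commit, so nothing can conflict
commit() { echo "$1" > "$1.txt"; git add "$1.txt"; git commit -q -m "$1"; }

commit A; commit B; commit C      # main-branch points at C
git checkout -q -b new-branch     # new-branch also points at C
commit D; commit E                # your work

git checkout -q main-branch
commit F; commit G                # stand-in for commits that arrive via "git pull"

git checkout -q new-branch
# The first shared commit between the two branches
shared="$(git log -1 --format=%s "$(git merge-base main-branch new-branch)")"
echo "shared commit: $shared"     # prints: shared commit: C

old_tip="$(git rev-parse new-branch)"
git rebase main-branch
new_tip="$(git rev-parse new-branch)"

# Newest first: E D G F C B A
git log --format=%s new-branch
```

After the rebase, old_tip and new_tip are different SHAs even though the commit messages are the same. That's the D2 and E2 from the steps above: the replayed commits are brand new commits with new parents, and new-branch now points at the new E.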
In this post, we took a simplified look at how Git rebase works for a typical development scenario. In truth, rebase has a lot more use and power outside of just the one I've illustrated here, but I hope this provides a fundamental framework for approaching it with little to no experience. Together, we covered:
The general why and when to use rebase versus merge. My personal rule of thumb is: merge goes from child to parent, rebase brings changes from parent to the bottom of the child's history
That branches are just hidden text files that Git uses to know where a branch's head is
At a glance, what Git does in the background during a rebase, how it "rewinds" to the first shared commit between the two branches, and "replays" the child branch's commits from the new head of the parent branch
I know some of you may want to learn more about handling merge conflicts, and I do plan to make a screencast for a follow-up post. Until then, I have a few tips from my personal workflow that may help you avoid merge conflict hell more often than not:
If you use VSCode you can set it as your merge conflict editor. Other nice GUI editors support an easier merge conflict UI as well, so I'd love to hear from other commenters how to configure other editors:
Assuming that you do use VSCode and have installed the "code" commandline tool, you can add these settings to your "~/.gitconfig" file:
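Here's one way that setup could look, as a sketch rather than the definitive config. Instead of hand-editing the file, it uses git config --global, which writes the same entries into "~/.gitconfig". The tool name "vscode" is just a label I've assumed; the "code --wait" command and Git's $MERGED placeholder are the parts doing the work.

```shell
# Tell Git which merge tool to launch for "git mergetool"
git config --global merge.tool vscode

# How to launch it: "code --wait" blocks until you close the file,
# and Git substitutes $MERGED with the path of the conflicted file.
# Single quotes keep $MERGED literal so your shell doesn't expand it now.
git config --global mergetool.vscode.cmd 'code --wait $MERGED'

# Don't leave *.orig backup files behind after resolving a conflict
git config --global mergetool.keepBackup false
```

With that in place, running git mergetool in the middle of a stopped rebase should open each conflicted file in VSCode.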
The last one for keeping backups is my own personal preference that others may not agree with. I prefer it because in all my years of dev work, I've never needed them and I got sick of cleaning them up all the time.
An alternative here would be to add ".orig" to your project's ".gitignore" file:
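A quick sketch of that alternative, tried out in a scratch repository. I'm assuming the glob pattern *.orig is what you'd want here, and app.js.orig is a made-up leftover backup file.

```shell
set -e

# Scratch repository (hypothetical; in real life you'd edit your project's .gitignore)
repo="$(mktemp -d)"
cd "$repo"
git init -q

# Ignore any backup file ending in .orig
echo '*.orig' >> .gitignore

# A pretend leftover from a merge conflict
touch app.js.orig

# Shows which .gitignore rule matches the file
git check-ignore -v app.js.orig
```

The backups still get created this way; Git just stops offering them up as untracked files.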
Consider setting all branches to "pull" with rebase by default:
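One way to do that, as a minimal sketch, is the pull.rebase setting:

```shell
# Make "git pull" rebase your local commits onto the fetched ones
# instead of creating a merge commit
git config --global pull.rebase true

# Confirm the setting took
git config --global --get pull.rebase
```

You can still opt out for a single pull with git pull --no-rebase. Some older setups use branch.autosetuprebase instead, which only affects branches created after you set it.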
Are you sick of repeating the SAME merge conflict over and over again when you're in rebase hell?
Use Git's Reuse Recorded Resolution setting:
[rerere] will probably be the number one life saver for most people who have tried and hated rebasing in the past.
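Turning it on is a single setting. A minimal sketch:

```shell
# Record how you resolve each conflict, and reapply that resolution
# automatically when the same conflict shows up again
git config --global rerere.enabled true

# Confirm the setting took
git config --global --get rerere.enabled
```

Git only starts recording from this point on, so it won't help with conflicts you resolved before enabling it.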
Rebase often - the longer you wait to rebase your branch on new changes, the higher your chances of having tons of merge conflicts.
Consider squashing your commits with interactive rebase - when you run
git rebase -i main-branch
...the "-i" option means interactive. It will show you all of your branch's commits in an ordered list (earliest on the top).
The screen will provide instructions for different options for each commit, with the default being to "pick" the commit, meaning to keep it for the "replay" part.
Another option is to "squash" it with the one above it. If your branch is super old and has tons of commits, you may find it helpful to squash it down to one so that your merge conflicts can only happen once (because only one giant commit is being replayed). This is something you'd want to discuss with your team first, though.
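If you'd like to watch a squash happen without typing into an editor, here's a scripted sketch in a scratch repository. Normally you would change "pick" to "squash" by hand; here the GIT_SEQUENCE_EDITOR environment variable points Git at a tiny shell function that does the same edit, and GIT_EDITOR=true accepts the combined commit message as-is. The repo, the commit helper, and the file names are all hypothetical.

```shell
set -e

# Scratch repository (hypothetical names)
repo="$(mktemp -d)"
cd "$repo"
git init -q
git checkout -q -b main-branch
git config user.name "Example Dev"
git config user.email "dev@example.com"

# Hypothetical helper: one new file per commit
commit() { echo "$1" > "$1.txt"; git add "$1.txt"; git commit -q -m "$1"; }

commit A
git checkout -q -b new-branch
commit D; commit E                # two commits on your branch
git checkout -q main-branch
commit F                          # new work on the parent
git checkout -q new-branch

# Stand-in for editing the list by hand: keep the first "pick",
# turn every later "pick" into "squash"
GIT_SEQUENCE_EDITOR='f() { sed "2,\$s/^pick/squash/" "$1" > "$1.tmp" && mv "$1.tmp" "$1"; }; f' \
GIT_EDITOR=true \
git rebase -i main-branch

# How many commits does new-branch now have on top of main-branch?
git rev-list --count main-branch..new-branch   # prints: 1
```

Both D.txt and E.txt are still there afterward. The work is the same; it just lives in one commit that gets replayed once.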
Well shoot. That's about all I can muster. What do you think? Has this been helpful? Can I clarify or elaborate anything else? Fellow code veterans, am I way off base, or could I tweak some of my information to be more accurate? All comments, questions, concerns, and feedback are very much welcome, and I thank you for your time today. Happy coding!