Keeping your forked repo in sync with the upstream one is tedious, and it usually means going to the command line and running a few git commands.
But today I have 3 ways for you to make that simpler and much less time-consuming, and even to synchronize the repos automatically!
Video
As usual, if you are a visual learner, or simply prefer to watch and listen instead of reading, here you have the video with the whole explanation and demo, which to be fair is much more complete than this post.
Link to the video: https://youtu.be/VOakLctEC2Q
If you'd rather read, well... let's just continue :)
1. Sync from the UI
Right, so the first way you can easily synchronize your forked repo is the feature GitHub has recently made available directly in the UI.
Just go to the main page of your repo, in the Code section, and next to the indicator that says whether your branch is ahead of or behind the source repo, you now have a "Fetch Upstream" button.
Clicking it lets you compare the changes made in the source repo with the ones made in your forked repo, and also automatically fetch and merge them into your repo.
If the changes from the upstream repository cause conflicts, GitHub will prompt you to create a pull request to resolve the conflicts.
2. The new API
The next method I have for you to synchronize your forked repo with the upstream one requires a little more setup, but it then lets you keep the repos in sync automatically. I'm talking about the new GitHub merge-upstream API. This way is much more flexible than the previous one.
With the API, in fact, you can start the synchronization from many different places: your CLI, an application you develop to apply governance to your repos, and so on. That also lets you automate the whole process, for example with a cron job or a scheduled operation.
For this example I'm gonna use curl to invoke the API.
First thing to notice is that this will be a POST operation:
curl \
-X POST
Then we need to specify the GitHub API version we are targeting; in this case, let's use v3. You pass that in a header:
-H "Accept: application/vnd.github.v3+json"
Next, authorization. The merge-upstream API requires authentication, of course; otherwise anyone would be able to merge somebody else's repos :)
-H "Authorization: token YOUR_GITHUB_PAT"
Since GitHub is deprecating the use of username and password for API authentication, I'm using a Personal Access Token instead. And this needs to be passed as a header as well.
To learn more about how you can authenticate to the GitHub APIs, check this link.
And check this out to learn how to create a PAT in GitHub.
Then we need to pass the URL of the API:
https://api.github.com/repos/USER_OR_ORG/REPO_NAME/merge-upstream
It is pretty self-explanatory: you just need the name of your forked repo and the username or organization name that owns it.
Last step, we need to tell GitHub what branch we want to synchronize with the upstream repo:
-d '{"branch":"main"}'
In this example I'm telling the API I want to sync the main branch, but you can specify any branch that is present in both the upstream and the forked repos.
This is how the complete API call looks when invoked using curl, using my user account n3wt0n and the repo openhack-devops-team, which I forked a while back from Microsoft:
curl \
-X POST \
-H "Accept: application/vnd.github.v3+json" \
-H "Authorization: token PAT_REMOVED_FOR_SECURITY_REASONS" \
https://api.github.com/repos/n3wt0n/openhack-devops-team/merge-upstream \
-d '{"branch":"main"}'
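If you call the API often, it may be handy to wrap the same request in a small shell script so you don't retype the owner, repo, and branch every time. This is just a sketch, not the official way to do it: the function names `merge_upstream_url`, `merge_upstream_body`, and `sync_fork`, and the `GITHUB_PAT` environment variable, are names I made up for illustration. Only the endpoint, headers, and payload come from the call above.

```shell
#!/usr/bin/env bash

# Build the merge-upstream endpoint URL from the owner and name of the FORK.
merge_upstream_url() {
  local owner="$1" repo="$2"
  printf 'https://api.github.com/repos/%s/%s/merge-upstream\n' "$owner" "$repo"
}

# Build the JSON payload naming the branch to sync.
# Assumes a simple branch name with no quotes or backslashes in it.
merge_upstream_body() {
  local branch="$1"
  printf '{"branch":"%s"}\n' "$branch"
}

# Send the request. GITHUB_PAT is an assumed environment variable holding
# your Personal Access Token; the script stops if it is not set.
sync_fork() {
  local owner="$1" repo="$2" branch="${3:-main}"
  curl \
    -X POST \
    -H "Accept: application/vnd.github.v3+json" \
    -H "Authorization: token ${GITHUB_PAT:?set GITHUB_PAT first}" \
    "$(merge_upstream_url "$owner" "$repo")" \
    -d "$(merge_upstream_body "$branch")"
}
```

After sourcing the script, `sync_fork n3wt0n openhack-devops-team main` should send the same request as the curl call above, reading the token from the environment instead of the command line.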
If everything goes well and the sync is successful, we will see Status: 200 OK with a response that gives you all the details of the operation:
{
  "message": "Successfully fetched and fast-forwarded from upstream defunkt:main",
  "merge_type": "fast-forward",
  "base_branch": "defunkt:main"
}
If instead there are conflicts, the API will return Status: 409 Conflict and you will need to resolve the conflicts manually before merging.
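If you run the call from a script, you probably want to react to those two outcomes. Here is a minimal sketch of how that could look: `describe_sync_status` and `fetch_sync_status` are hypothetical helper names, and `FORK_URL` and `GITHUB_PAT` are assumed environment variables holding the merge-upstream URL of your fork and your token. Any status other than 200 or 409 is simply reported as unexpected.

```shell
#!/usr/bin/env bash

# Translate the HTTP status of a merge-upstream call into a message
# and an exit code (0 = synced, 1 = conflict, 2 = anything else).
describe_sync_status() {
  case "$1" in
    200) echo "synced"; return 0 ;;
    409) echo "conflict: resolve manually"; return 1 ;;
    *)   echo "unexpected status: $1"; return 2 ;;
  esac
}

# Perform the call, discard the response body, and print only the
# HTTP status code. FORK_URL and GITHUB_PAT are assumed to be set.
fetch_sync_status() {
  curl -s -o /dev/null -w '%{http_code}' \
    -X POST \
    -H "Accept: application/vnd.github.v3+json" \
    -H "Authorization: token ${GITHUB_PAT}" \
    "${FORK_URL}" \
    -d '{"branch":"main"}'
}
```

A scheduled job could then run `describe_sync_status "$(fetch_sync_status)"` and use the exit code to decide whether somebody needs to step in.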
3. Using GitHub Actions
The final method I have for you still uses the new API we have just seen behind the scenes, but it abstracts it away from the user, making it much easier to use and to automate. So much so that I can say this is my favorite one, also because it uses GitHub Actions.
There are just a few actions that allow you to sync your forked repos, but this one from dabreadman is my favorite because it allows you to use GITHUB_TOKEN rather than your PAT.
The action is fully configurable but the most important parts are the following ones:
- name: Sync and merge upstream repository with your current repository
  uses: dabreadman/sync-upstream-repo@v1.0.0.b
  with:
    # URL of gitHub public upstream repo
    upstream_repo: "https://github.com/actions/starter-workflows.git"
    # Branch to merge from upstream (defaults to downstream branch)
    upstream_branch: main
    # Branch to merge into downstream
    downstream_branch: master
    # GitHub Bot token
    token: ${{ secrets.GITHUB_TOKEN }}
The action's fields are self-explanatory. The minimum information you need to pass to the action is the URL of the original (upstream) repo you want to sync from, the branch in your forked repo you want to sync to, and the token.
In my case I like to have this run on a schedule (here, every Monday at 13:00 UTC), so my repo is always kept in sync with the upstream one (unless there are conflicts):
on:
  workflow_dispatch:
  schedule:
    - cron: "0 13 * * 1"
I think it should now be clearer why this is my favorite way to sync a forked repo, and also why it's usually my recommendation.
Conclusions
Of course, if you have to sync just once in a while, using the UI is more than enough. And if you have complex requirements for busy repos or custom apps, the API is the way to go.
But this last one, using GitHub Actions, is for me the sweet spot.
Let me know in the comment section below how you synchronize your forked repos with their upstreams, and if you are going to change now that we have these other options.
Also, you may want to check out this video here, where I talk about using GitHub Actions to automate everything.
Top comments (8)
Thanks for this helpful article. I took approach #2 one step further and created a Postman collection with all the API calls necessary to refresh the repos I work with daily.
Then I created a Postman monitor to run the collection regularly every couple of hours.
And this is the result:
And it works perfectly! So thanks for the inspiration :)
PS: there is a fourth approach and that is using this GitHub application: github.com/wei/pull
Wow, a lot of work on those APIs :)
And thanks for the app, I wasn't aware that existed!
There is a simple way to sync your forked repo with the upstream GitHub repo.
github.com/wei/pull
reference link
It is very useful as it creates a pull request rather than auto-merging, but from this, this and this.
It seems like a user maintained service, so the more people use it the worse it gets for everyone, just something to consider before migrating your workflow to wei/pull.
Good news is that wei/pull can be self-hosted in that case.
That looks pretty cool, thanks for pointing that out
I keep getting this error. Am I missing something? Thanks
hey, well yes.. you are missing the name of the repository :)
that should be something like
https://github.com/your_user/your_repo
Actually I followed the instructions on the video and yet got the error