Davide 'CoderDave' Benvegnù

3 Ways to Sync a Forked Repository on GitHub AUTOMATICALLY

Keeping a forked repo in sync with its upstream is tedious, and it usually means dropping to the command line and running a few git commands.

But today I have 3 ways for you to make that simpler and much less time-consuming, and even to synchronize the repos automatically!

Video

As usual, if you are a visual learner, or simply prefer to watch and listen instead of reading, here you have the video with the whole explanation and demo, which to be fair is much more complete than this post.

Link to the video: https://youtu.be/VOakLctEC2Q

If you'd rather read, well... let's just continue :)

1. Sync from the UI

Right, so the first way to easily synchronize your forked repo is the feature GitHub recently made available directly in the UI.

Go to the main page of your repo, in the Code section, and next to the indicator that says whether your branch is ahead of or behind the source repo you now have a "Fetch Upstream" button.

Automatic Fork Sync UI

Clicking it lets you compare the changes made in the source repo with the ones made in your forked repo, and also fetch and merge them into your repo automatically.

If the changes from the upstream repository cause conflicts, GitHub will prompt you to create a pull request to resolve the conflicts.

Watch the whole demo here

2. The new API

The next method requires a little more setup, but it then lets you keep the repos in sync automatically. I'm talking about the new GitHub merge-upstream API, which is much more flexible than the UI button.

Using the API, in fact, you can start the synchronization from many different places: your CLI, an application you develop to apply governance to your repos, and so on. And as such it also lets you automate the whole process, for example with a cron job or a scheduled operation.

For this example I'm gonna use curl to invoke the API.

First thing to notice is that this will be a POST operation:

curl \
  -X POST 

Then we need to specify the GitHub API version we are targeting; in this case let's use v3. You pass that in a header:

  -H "Accept: application/vnd.github.v3+json"

Next, authorization. The merge-upstream API requires authentication, of course; otherwise anyone would be able to merge somebody else's repos :)

  -H "Authorization: token YOUR_GITHUB_PAT"

Since GitHub is deprecating the use of username and password for API authentication, I'm using a Personal Access Token instead. And this needs to be passed as a header as well.

To learn more about how to authenticate to the GitHub APIs, check this link.

And check this out to learn how to create a PAT in GitHub.

Then we need to pass the URL of the API:

https://api.github.com/repos/USER_OR_ORG/REPO_NAME/merge-upstream

It is pretty self-explanatory, you just need the name of your forked repo, and the username or organization name that owns it.

Last step, we need to tell GitHub what branch we want to synchronize with the upstream repo:

  -d '{"branch":"main"}'

In this example I'm telling the API I want to sync the main branch but you can specify any branch which is present in both the upstream and the forked repos.
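As a rough sketch of how the URL and the request body fit together, the two small shell helpers below assemble them from their parts. The function names and the `develop` branch are made up for illustration; only the endpoint shape and the `branch` field come from the steps above.

```shell
#!/bin/sh
# Hypothetical helpers, not part of any GitHub tooling.

# Build the merge-upstream endpoint URL for a fork: $1 = owner, $2 = repo.
merge_upstream_url() {
  printf 'https://api.github.com/repos/%s/%s/merge-upstream\n' "$1" "$2"
}

# Build the JSON request body for one branch: $1 = branch name.
# Assumes the branch name has no quotes or backslashes (no escaping is done).
merge_upstream_payload() {
  printf '{"branch":"%s"}\n' "$1"
}

# Print the request that would be sent for each branch of interest.
# "develop" is an assumed second branch, used only as an example.
for branch in main develop; do
  echo "POST $(merge_upstream_url n3wt0n openhack-devops-team) $(merge_upstream_payload "$branch")"
done
```

Nothing here touches the network; it only prints what would be sent, which makes it easy to check the values before wiring them into curl.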

This is how the complete API call looks when invoked using curl, using my user account n3wt0n and the repo openhack-devops-team which I've forked a while back from Microsoft:

curl \
  -X POST \
  -H "Accept: application/vnd.github.v3+json" \
  -H "Authorization: token PAT_REMOVED_FOR_SECURITY_REASONS" \
  https://api.github.com/repos/n3wt0n/openhack-devops-team/merge-upstream \
  -d '{"branch":"main"}'

If everything goes well and the sync is successful, the API returns Status: 200 OK with a response body that gives you all the details of the operation:

{
  "message": "Successfully fetched and fast-forwarded from upstream defunkt:main",
  "merge_type": "fast-forward",
  "base_branch": "defunkt:main"
}

If instead there are conflicts, the API will return Status: 409 Conflict and you will need to resolve the conflicts manually before merging.
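If you want to script this, one possible approach is to capture the HTTP status code with curl's `-w '%{http_code}'` option and branch on it. This is only a sketch: the function names are invented, and it assumes the PAT is stored in an environment variable called `GITHUB_PAT`, which is my naming and not something GitHub requires.

```shell
#!/bin/sh
# Hypothetical wrapper around the merge-upstream call shown above.

# Map an HTTP status code to a short outcome message.
explain_status() {
  case "$1" in
    200) echo "synced" ;;
    409) echo "conflict: resolve manually" ;;
    *)   echo "unexpected status: $1" ;;
  esac
}

# $1 = owner, $2 = repo, $3 = branch (defaults to main).
# Returns 0 only when the API answered 200.
sync_fork() {
  owner=$1
  repo=$2
  branch=${3:-main}
  # GITHUB_PAT is an assumed variable name holding your token.
  if [ -z "${GITHUB_PAT:-}" ]; then
    echo "GITHUB_PAT is not set" >&2
    return 1
  fi
  body_file=$(mktemp)
  # -s silences progress, -o saves the body, -w prints only the status code.
  status=$(curl -s -o "$body_file" -w '%{http_code}' \
    -X POST \
    -H "Accept: application/vnd.github.v3+json" \
    -H "Authorization: token $GITHUB_PAT" \
    "https://api.github.com/repos/$owner/$repo/merge-upstream" \
    -d "{\"branch\":\"$branch\"}")
  explain_status "$status"
  cat "$body_file"
  rm -f "$body_file"
  [ "$status" = "200" ]
}
```

With the token exported, `sync_fork n3wt0n openhack-devops-team main` would run the same call as above, and the non-zero exit status on anything other than 200 makes it easy to alert from a cron job.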

Watch the whole demo here

3. Using GitHub Actions

The final method I have for you still uses the new API we have just seen behind the scenes, but it abstracts it away from the user, making it much easier to use and to automate. So much so that I can say this is my favorite one, also because it uses GitHub Actions.

There are just a few actions that allow you to sync your forked repos, but this one from dabreadman is my favorite because it allows you to use GITHUB_TOKEN rather than your PAT.

The action is fully configurable but the most important parts are the following ones:

- name: Sync and merge upstream repository with your current repository
  uses: dabreadman/sync-upstream-repo@v1.0.0.b
  with:
    # URL of GitHub public upstream repo
    upstream_repo: "https://github.com/actions/starter-workflows.git"
    # Branch to merge from upstream (defaults to downstream branch)
    upstream_branch: main
    # Branch to merge into downstream
    downstream_branch: master
    # GitHub Bot token
    token: ${{ secrets.GITHUB_TOKEN }}

The action's fields are self-explanatory. The minimum information you need to pass to the action is the original (upstream) repo URL you want to sync from, the branch in your forked repo you want to sync to, and the token.

In my case I like to have this run on a schedule, so my repo should always be in sync with the upstream one (unless there are conflicts):

on:
  workflow_dispatch:
  schedule: 
  - cron: "0 13 * * 1"
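The cron expression is the least self-explanatory part of that trigger. The small helper below (a made-up function, just for illustration) splits an expression into its five named fields. For `0 13 * * 1` that reads as minute 0, hour 13, any day of the month, any month, day of the week 1, in other words every Monday at 13:00. GitHub Actions evaluates schedules in UTC.

```shell
#!/bin/sh
# Hypothetical helper: label the five fields of a cron expression.
describe_cron() {
  set -f        # turn off globbing so "*" is not expanded to file names
  set -- $1     # split the expression on whitespace into $1..$5
  set +f
  if [ "$#" -ne 5 ]; then
    echo "expected 5 fields, got $#" >&2
    return 1
  fi
  printf 'minute=%s hour=%s day-of-month=%s month=%s day-of-week=%s\n' \
    "$1" "$2" "$3" "$4" "$5"
}

describe_cron "0 13 * * 1"
```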

I think it should be now clearer why this is my favorite way to sync a forked repo, and also why it's usually my recommendation.

Watch the whole demo here

Conclusions

Of course, if you have to sync just once in a while, using the UI is more than enough. And if you have complex requirements for busy repos or custom apps, the API is the way to go.

But this last one using GitHub Actions is for me the sweet spot.

Let me know in the comment section below how you synchronize your forked repos with their upstreams, and whether you are going to change now that we have these other options.

Also, you may want to check out this video here, where I talk about using GitHub Actions to automate everything.



Discussion (2)

Petr Švihlík

Thanks for this helpful article. I took approach #2 one step further and created a Postman collection with all the API calls necessary to refresh the repos I work with daily.
Postman collection

Then I created a Postman monitor to run the collection regularly every couple of hours.
Postman monitor

And this is the result:

running monitor

And it works perfectly! So thanks for the inspiration :)

PS: there is a fourth approach and that is using this GitHub application: github.com/wei/pull

Davide 'CoderDave' Benvegnù Author

Wow, a lot of work on those APIs :)

And thanks for the app, I wasn't aware that existed!