Shubham Jain

Posted on Jul 20, 2019

The Case Against Pull Requests (And How to Fix Them)

#agile #devops #git

Pull requests are an industry-standard but what if the alternative is vastly better?

Ever held an opinion so strongly that you ignored anything that proved it wrong?

I have been guilty of this far more often than I would like to admit. I didn't believe that Utility CSS was anything but an absurd idea until I tried it out. Soon, I realized I had been wasting hours writing needless CSS again and again.

If we questioned these assumptions—which I often call traps—we would realize how much productivity is being lost in doing mundane, repetitive work.

And that brings me to a great example of this: pull requests. The rise of Github made them almost an industry standard of how development should be done. Developers create branches for their changes and file requests to get them merged.

Pretty simple, isn’t it? For a stable open source project that accepts contributions, the answer is yes. It is an ideal workflow for that kind of project. But where pull requests fail developers are projects needing agility. In a fast-paced environment, they impose a heavy tax on developer's efficiency.

Consider an example: you're part of a small team and you've to fix a link in your app, what would a pull request workflow look like?

Create a new branch from master.
Checkout the newly created branch.
Create and push your fix commit.
If everything is alright, file a pull request.
Get it reviewed and merged.
Switch back to master.

All the while your branch could fall back behind master, needing you to ensure that you're still in sync. For a tiny fix, the developer had to go through multiple repetitive steps. Imagine doing several fixes like that—to correct spelling, to fix a CSS bug—and creating, checking out, syncing branches is itself a chunk of your work.

You could batch your changes, but it’s always hard to anticipate them (no wonder you have branches like bug-fixes). Additional problem is that there's an added overhead of creating release branches and merging them back into master.

Of course, you should have someone reviewing your code, but the entire overhead of creating branches for small changes is terrible—and as we would see, avoidable.

The Google Chromium team's workflow is radically different. Their developers don’t create branches at all; everyone commits on the same branch—the master. In a post, Aaron Boodman, a developer on Chrome’s team writes:

On many projects, it’s common to branch the source code to work on major new features.

…

This wouldn’t work in Chrome because we release every day. We can’t tolerate huge chunks of new code suddenly showing up in trunk because it would have a high chance of taking down the canary or dev channels for an extended period. Also, the trunk of Chrome moves so fast that it isn’t practical for developers to be isolated on a branch for very long. By the time they merged, trunk would look so different that integration would be difficult and error-prone.

(btw, this is also called trunk-based development).

The idea might sound appalling but it makes sense. Google Chrome moves at a rapid pace and isolated branches would make it difficult to keep multiple developers in sync. Keeping everything in master makes sure that “all developers are always working with the latest and greatest source.” The releases are picked from the commits in the master. Here's how it might look like:

— Source

But what about unfinished features? They are protected by flags/switches. The code is shipped but it isn’t visible to the end-users until the flag is turned on. This practice is not just particular to Chrome’s team, it is practiced at scale at both Facebook and Google company-wide.

Facebook has a tool they call “Gatekeeper” which allows them to be in control of who can see what code live on the service at any given time. As Rossi points out, right now on Facebook.com there is already the code for every major thing Facebook is going to launch in the next six months and beyond! It’s the Gatekeeper which stops us from seeing it.

— The Next 6 Months Worth Of Features Are In Facebook's Code Right Now (But We Can't See)

There’s another aspect of it: code reviews. Instead of being branch-based, they are commit-based. When a developer pushes a commit, they are required to get it reviewed by other people who have the same focus area. When it’s approved, it can get merged.

In summary: everyone commits on master, unfinished features are hidden by flags, and code reviews are commit-based.

In contrast to pull requests, this model is superior in multiple ways.

It avoids the overhead (and cognitive load) of creating branches and merging them back.
It dramatically reduces the chances of messy conflicts.
Release workflow is much simpler since you can just pick a commit from master and mark it for release.
Making code reviews commit based ensures that it’s easier for the reviewer to review changes (as opposed to a massive pull request).

The main distinction is that you see commit as the unit of change, not the branch and that makes all the difference.

The next question is can you adopt the same workflow as Chromium’s team if you’re using Github/Gitlab?

Unfortunately, I couldn’t see how you could have the same style of code reviews as Google’s. Gerrit is an open-source tool but it's almost impossible to use it with conjugation with Gitlab/Github. It has to be installed on a server and the UI isn't pleasant to use.

But it’s possible to build an alternative. There could be a tool to which developer push their changes first where other team members can review them. You can't push your commit to master until your changes are approved.

If there are any suggestions, the developer would need to modify the commit (using amend or interactive rebasing) and push it again. This gives a clear view of what was changed and the reviewer understands if their suggestions were incorporated. When the reviewer approves the commit, it can be pushed to master.

You get benefits of code review without sacrificing your whole setup. Sounds interesting? I am prototyping something like that. You can fill this form if you would like to try it out when it's ready

Top comments (14)

simonhaisz • Jul 20 '19

Alternatively, just create small PRs that always target master?

Creating and switching branches takes seconds so I'm struggling to understand how that can be considered a burden or a big overhead unless you were doing it thousands of times a day. I think the most PRs I've submitted in a single day was 10 and I can imagine a workflow where someone might need to do 20+. That's maybe 5 minutes of their day switching between branches? Seems like there are likely better opportunities to improve productivity to me.

Shubham Jain • Jul 20 '19 • Edited

Consider this scenario: you've filed a PR and moved on to the next one. Then, you realize there could be some improvement in the last PR. You would have to checkout the branch again and create a new commit and checkout the last branch again. If you're working on the same branch you would just have to amend the last commit. Another case could be where your PR is not merged and then, you're working on another which might have conflicting changes.

I agree that the cost of creating branches is not that high, but it's the context switching that can be a time sink. I think Google's workflow is about worrying mostly about your change and little else.

Yawar Amin • Aug 19 '19

it's the context switching that can be a time sink

But in the commit review world you'll be constantly context-switching as your teammates push new commits and you have to drop everything and review them to unblock them from pushing their next commits. Especially for a smaller team and for teams with more juniors or newer teammates (who tend not to review other people's PRs), this will bog down everyone's commits in a big queue waiting to be reviewed.

simonhaisz • Jul 20 '19

Your first example has you making no changes locally after creating the PR. You would only be switching to a new branch if you had other work you needed to do. If we assume a model of "I'm going to stay on this work item until its in master", which it looks like to me you are, you would just stay on this branch until your PR is approved. If there is feedback that requires changes you make them and push to the remote PR branch. If your teams doesn't like the fact that there are multiple commits, can amend/squash locally and force push to your PR branch. I have some co-workers who feel strongly about re-writing history so that their merges are clean (single commit), which fits this case.

And if you are working under this model on work items in serial (one after the other, never moving on to the next item before the previous is finished) then you can accomplish that without local branches, only remote PR branches. You can do all of your work on your local master branch but when you push, instead of pushing to remote/origin/master you can go to remote/origin/shubham/JIRA-1337.

As to your second example, you can avoid conflicting with yourself by using PR chaining. I do this sometimes when I have a series of changes that depend upon each other. After the first change I create PR-A that points to master. Then the second is PR-B pointing to PR-A, and so one. This allows my reviewers to review my changes in bite-size chunks and unblocks me from having to wait for them to review each change before starting on the next. If I need to respond feedback on a particular PR I can easily switch to that branch and make changes. The downstream PRs can then get updated appropriately.

Now, obviously the above can easily get complicated but I consider that complexity 'worth it' because that's how I stay productive. My reviewers are not going to drop everything they are doing and review my work immediately - that wouldn't be good for their productivity with context switches outside of their control. Sometimes PRs will get reviewed within a few minutes of creation and that's great! But other times is could take a few hours before they have a chance.

So if you live in a world where you are making dozens of commits a day and getting immediate, useful feedback on each change I would get your position. I've just never lived in a world anything like that before :)

Daniel Martín • Aug 19 '19

Hello, thanks for the post, I really enjoyed it and I think there are good points there. In general, I’m a fan of reducing ceremonies and boilerplate as much as possible (without losing control), so this approach fits well that way of thinking.

There’s something I didn’t understand, do you mind help me fill the gap?

The post says:
“You can't push your commit to master until your changes are approved.

If we only use master, but we can’t push to master, where do we push to at the beginning? Where does the reviewer do its job? I didn’t understand either that we can’t push to master but then we need to amend the commit and push it

I got lost at some point but I don’t know exactly where.

Thanks again for the post!

Shubham Jain • Aug 27 '19

The implementation may vary but the idea is there is temporary space where your commit is stored, and it's cherry-picked on the target branch when it's approved. This is very similar to how gerrit works. Commits can only be merged in the target branch once it has been reviewed.

The tool I am talking about can probably send a diff to a third party service or can create a temporary branch for your commit. . That'll be an abstraction.

Nathan Johnson • Oct 4 '19

We use Gerrit where I work, and I generally like it. The UI is tolerable with the newer beta design :)

The issue I have with it is that it only allows single commits to be reviewed. I find this to be limiting, as it requires one to amend a commit to address code review feedback. You also lose all the history of the changes in git. Gerrit does store the history of a review as patchsets whenever you push the amended commit with the same Change-Id and provides diffs which is nice, but this history is not stored in git itself (afaik).

Peter Ellis • Aug 25 '19

This is what tools like Gerrit are for.

But the basic idea is that you don't actually push to master, you push to a review queue instead, where you can still amend your commit, and when it is finally approved, it is then merged with master.

And yes, this could easily mean that you've got merge conflicts again, so you're back to square one. Which is the Achilles heel of the whole idea, you now have to do the very thing you were supposed to avoid.

There is a difference though.

All this extra work is only needed when a commit fails the review. So if 99% of your commits pass review, all you're doing is pushing to one remote and pulling from another, which takes no extra effort in practice.

But if you're working with a "novice" team who are likely to fail reviews often, this workflow is a nightmare. And if you've only got two or three devs actively working on a codebase, it's probably an overkill.

So the important thing as always, is to think about what works for your specific circumstances, try your idea, then continuously evaluate and refine it.

Pablo de Oliveira Castro • Aug 19 '19

Excellent article. It gives everyone the notion that exist many ways to do same thing. Solving a problem isn't a silver bullet thing. I see outstanding value on this approach as I see for PR/MR flow. it depends on the kind of problem you're trying solve and the project you're working on. Thanks to bring this to the light for me.

Thomas H Jones II • Jul 20 '19

Overall, it depends on the code and how your team works on it.

In my organization, we have lots of individual projects. And, while each of us can contribute to a given project, there's usually a primary maintainer (or small set of primary maintainers). Further each project tends to be composed of discrete modules. Modifications – and associated PRs – tend to be restricted to a given module.

Typically, we use a fork-and-branch model for our activities. Because of the modular style of our coding and committing, merging tends to be at low risk of having conflicts to resolve. Basically, if you have even three developers simultaneously modifying a given project during a given time-period, it usually works out to: developer A was probably modifying module X; developer B was modifying module Y and developer C was modifying module Z. Which is to say, even though their forks' branches will be out of sync with each other, the merge is usually conflict-free (and while there may be module dependencies, so long as people focus on not altering their module's inputs and outputs in a way that breaks dependencies, those dependencies usually aren't endangered). Add in pre-commit linting and post-commit testing-tools and you can maintain a fairly high "branch → modify → commit → PR → merge" tempo (usually, there's teammates quickly available to do code-reviews because they're waiting for post-commit testing-tools to run ...and, because the commits tend to be small, the reviews tend to be fairly quick and short on pre-merge modification-request back-and-forth).

Obviously, much like not every project can be run like Chromium, not every organization's projects can be run like we (tend to) run ours. The parenthetical in my prior statement is there because, there are a non-trivial number of times where we have to deviate from preferred practices. However, those deviations are still infrequent enough as to not require us to globally change practices.

Miodrag Milić • Jul 20 '19 • Edited

You are totally right - PR/MR is overkill in certain situations that are more common then the opposite.

I am looking forward for the tool you write since Gitlab code review is totally useless in that context. You can't even review entire file outside the commited lines.

Andrea Laforgia • Aug 22 '19

Great article. I recently wrote an article on LinkedIn about how code reviews are greatly overrated, in my opinion. Your article summarizes the way I'd like my team to work. It takes time to switch a team's culture from a GitFlow-like branching strategy (horrible, in my opinion) to a trunk-based one, but it can be done.