Over the past year, while building Pullpo, I've been highly focused on one specific area: code reviews. I've analyzed and interviewed dozens of teams of varying sizes, industries, and countries to truly understand the best practices surrounding code reviews.
Why?
Code reviews are important.
Many people think that code reviews are only for improving the code, guaranteeing that the requirements for different stakeholders are met, it may be security, maintainability, performance or only functionality, depending on the environment.
But code reviews is not only about improving the code, they are many times the best opportunity of improvements and new learnings for developers inside the team.
Also, recent research (DORA 2023 report) has confirmed that improving code reviews can speed up software delivery performance up to 50%. Take a look at my summary of the report here.
Code reviews should be FAST.
This is crucial. Typically, unless you use something like SHIP, SHOW & ASK (which I recommend), all code reviews are blocking by default.
This is, until the changes meet all the requirements and get the necessary approvals, the code is not going to production.
This is something super important: We should optimize for the speed at which a team can ship product together, not for the speed at which a developer can write code. In fact, this is the first phrase of Google’s engineering practices document.
This translates to replying to code review messages as fast as possible.
Code reviews shouldn’t break your FLOW.
Being in the flow state is that time when you are super focused on a task: it can be designing, programming, whatever you are doing. Some DevEx frameworks (such as the one presented in the paper: Devex: What Actually Drives Productivity) place the flow state as a top priority subject for developer productivity.
The devex triangle from the paper DevEx: What actually drives productivity?
Flow state is very important for the efficiency of the group, so if you are a developer and you are in flow state, try to keep yourself in that state as long as you can, avoid switching tasks and as long as you can, avoid interruptions.
Being in flow state is the only reason to pospone replying fast to code review messages.
I love this image I found on reddit about the dilema between flow state and interruptions.
Credits to nkukard.
Communication is IMPORTANT.
Normally code reviews are made in a written format. Humans communicate much worse in written format. Your corporal movements are not there. Your voice tone is not there. So misunderstandings are much more common.
This is why some new communication frameworks are raising such as conventional comments. Similar to conventional commits, but applied to code review comments. Where you explicitly specify the intention behind your comments. For example:
question (non-blocking): At this point, does it matter which thread has won?
Maybe to prevent a race condition we should keep looping until they've all won?
Where non-blocking means that this comment is not blocking the PR from being accepted.
Some other tips in this area are:
- Ask questions instead of affirmations.
- Be kind.
- If you have more seniority, spend time on mentoring, this pays off exponentially.
- If you not agree on a specific topic with a teammate:
- Don’t take it personally.
- Listen carefully to the other explanation.
- Discuss it in person or in a call, where effective communication is easier.
PRs should be SMALL.
PRs/MRs should be as small as possible. Only one change at a time. Only one purpose at a time.
We’ve all seen the meme and know it’s true:
credits to twitter.com/iamdevloper
Large changes make async code reviews super slow if we want to keep quality, or super low quality if we want to keep speed.
So what can we do when tasks are huge and we need to implement many changes to accomplish it and we can’t divide the tasks into functional small subtasks?
-
Stacked pull requests. This is a great alternative. Basically it means dividing the task into reviewable (<400 locs) PRs. The downside of this approach is that since these PRs depend on each other, a change on the first branch requires updating the depending branches too.
This fact can make you lose some time synching the depending branches manually with ‘merge’ or ‘rebase’, this is why some tools where born such as git-town, ghstack or graphite.
Credits to Dr. McKayla
- Pair programming. Some companies use pair programming as the default way of developing. This also means that the code is already being reviewed by 2 people while being developed, so you don’t need to do a conventional async code review afterwards.
Notes on pairing.
In the past, some recognized authors such as Dave Farley, author of Continuous Delivery, claimed a controversial idea (read comments on video or here) that code reviews through PRs are just a bad idea.
In this video, Dave proposes pair programming as the best code review solution for every case, for every type of company, every type of developer and every type of task.
I follow Dave, and I agree on most of the things he says but most of the time things are not just black or white, but somewhere in the middle depending on different factors. Pairing is one of those things.
As a developer, you need to know yourself in order to understand in which scenarios you feel more productive pairing, with which type of people and on which type of problems.
In my case, typically, I only feel productive while pairing in the following scenarios:
- Mentoring a new person on the team.
- Brainstorming creative solutions for a complex problem.
However, it’s very hard for me to achieve flow state while pairing, and as we discuss earlier in this post, flow state is a key factor on development productivity.
Final thoughts.
- Code reviews need to happen fast, this is replying fast to code review messages.
- Being in flow state must be the only reason to pospone code review messages.
- Communication is important. Be nice. Listen. Discuss in person/call if necessary.
- Only one purpose per PR. Small PRs. Stacked PRs if necessary.
- Try pair programming with different people and tasks. Discover when pairing is useful for you (and your team).
Top comments (18)
This was a great read, very informative. I'll bring this to my team, though at present we aren't too slow, there are things we could learn.
Thanks a lot Mike! I also tried to put this knowledge together into this code review solution: pullpo.to/channels
Let me know if it could be useful for your team too!
I wish everyone would read this one
Definitely gonna promote conventional comments from this point on
+1
Thnx! Conventional comments are super useful to make explicit the intention behind the comments.
Also, I made this code review tool that integrates the code review into 2-way temporal Slack channels: pullpo.to/channels
Misunderstandings are less common on Slack since you can have real time conversations, huddles, images, voice messages and even gifs haha. The cool thing is that the tool always syncs everything with github or your git provider, so conversation is in both places.
Let me know what you think if you try it!
Frankly, we're doing most of the discussion of a review in Github itself. If we need to go into detail, there's usually no way around a brief 4-eyes session.
Great tool you built though, I can imagine it's useful for teams that are yet to streamline their collaboration
[praise] This is gold. Thanks a lot! And +10 ❤ for conventional review comments!
Thanks Andreas!
I don't really understand this "pr should be small" hype. I think sometimes it makes sense and sometimes it's totally fine to have large PR especially if it delivers new functionality in one reviewable chunk. It's easier to make meaningful review when you see it in whole as opposed when you see it broken down in 10 smaller bits. What I see often with small PRs is that obviously really bad design choices constantly endup into master because there is never a point where anyone can say no to it. And it's very rare that in any typical corporate environment anyone would endup addressing it. I personally don't understand what is it so hard to review larger pull requests let's say couple thousand lines of code, is it really too hard to read that? Sure it takes some effort, but reviews should take decent amount of effort otherwise it is not really a review. Putting 0 efforts and just dropping personal 2 cents on how to write a line of code I don't think that is actual review. I much rather prefer reviewing big prs that deliver full implementation of a feature that I can test and review it's design fully. Whilst the meme is how "10 lines of code 10 issues, 500 lines of code, 'fine'" the way I often see it instead is that on small prs often there is hardly ever anything to say apart from lots of bikeshedding on unimportant stylistic choices that really don't matter either way, and on '500 lines of code' if someone just says "fine" well you at least know who didn't actually reviewed the code.
Interesting opinion! My thoughts:
Design decisions should be made before starting the task. In any case, the early in the process you can detect design improvements the better. And dividing a big task into smaller PRs lets you start reviewing the first PRs even when other depending PRs are in WIP. That is better than having a completely wrong huge PR.
It's harder. It takes more cognitive load (you need to fill up more your ram). This makes devs say "I'll do it tomorrow" or "I'll kind of read it now and post a lgtm" rather than "meh, its just 200 loc, i'll take a look now", this makes the difference.
Btw, with stacked PRs you could always wait until the last PR is made and see the whole picture. But if you create a big PR you can't automatically divide it into easy to read single purpose chunks.
It's ok to not have anything to say on PRs. All engineers in the team should know what to look for on PRs, regardless the size of the PR. Let linters do the nitpicking!
To ensure that you don't endup something having to rewrite entire thing you can always reach out to your colleagues to check your draft work-in-progress long before it's in ready stage.
Stacking PRs sounds like treating individual commits as pull-requests. Don't see point in that. It feels like it's just adding red tape.
I don't really agree with idea that just because it's "small" then somehow it saves cognitive load. Big pr does not necessary mean it's in total more cognitive load. It's just in one piece. Dividing it into many small pieces is same load just spread over days and adding additional overhead of context switching. There is also less of a problem that review is late by a day or two in case of bigger pr because in my experience it can easily be a day or two faster to get larger PR ready for review compared to what it takes when shipping same thing in tiny bits: often endlessly switching between branches, fixing code reviews comments, switching again to work on something else, switching back, rebasing, switching, rebasing, etc.
Anyways I think there is place for various approaches and it very much depends on context. For backend systems I think often smaller pull-requests are necessary because small incremental changes are often a necessity to deploy changes with zero downtime. For instance something as simple as adding a new field in database might require multiple pull-requests: 1) adding nullable field 2) adding code changes that fill field in newly arrived data 3) migration to fill in legacy rows 4) adding database constraint. Such nature of work forces to work in very small changesets. I don't think it's efficient way to deliver new functionality in fact it's very slow but it's a way that's forced by environment and not much can be done about it. In contrast frontend changes often don't have such constraints and so it's often fairly easy to deliver larger change that implements new functionality fully and there is hardly any drawbacks and is just faster way to build things.
I'd encourage to stay true to agile manifesto. Focus on delivering user visible stories not code. And always remain pragmatic about the workflow. And since no environment is the same thus no process is going to be universally better than other. If it's faster to deliver user facing functionality in large code chunk then do that. If it's faster to deliver it in small chunks then do that. Nothing wrong about large pull-requests, nothing wrong with small ones, as long as it ships user facing value. Yet I think it's important to recognize that splitting work into many tiny pieces while has certain benefits also comes with costs due to extra red tape and context switching thus it's not a clear cut always better approach.
Thanks:)
Thanks!
Perhaps you could you elaborate on what you mean under fast, small, one thing and other relative terms? I.e. how fast your code reviews are on average, how do you measure small and one thing per PR in your team, etc? It would be interesting to see some concrete numbers or estimations.
In general:
Fast: 1 day. If your team is distributed in different timezones it may be harder, but it should be as fast as possible. This is also what google eng practices recommend:
However this may be impossible if you are writing code in a highly regulated industry.
Small PRs and single purpose PR. These two terms are actually related to each other.
Doing it this way with Stacked PRs allows:
1.- The PR "add DB table" could be reviewed while I'm working on "Add frontend feature A". This allows detecting improvements early in the process. Also, depending on the case, this first PRs could be even merged to production, if it gives some value alone or at least doesn't break things.
2.- Smaller PRs -> Better and faster code reviews. Just because it's less cognitive load. It's easier for the reviewer.
There is this great book called atomic habits, where James Clear says that there are 4 laws of behavior to do things and create habits. Attached screenshot below.
Smaller code reviews are easier and more attractive.
Thanks!
Thanks for figuring out the complexity of "it depends" decisions and how to balance timely communication with not interrupting the flow state.
Great post!
I would like a couple of thoughts on code review productivity:
1) Use ship-show-ask inside one PR. Sometimes a PR is already approved, but you have to make a small change.
If you have after-disapprove, then making a small fix becomes very expensive, which leads to developers avoiding small changes and worse code in the end.
On the other hand, if you do not have auto-disapprove and realize that you have to make a large changes that you would rather had reviewed, you have to ping all the reviewers manually.
So, an ideal code review tool will make PR dis-approve controlled by developer per commit.
If it is not possible, I would vote for NOT using auto-disapprove.
2) A clear procedure for assigning reviewers is important. A reviewer should be assigned fast and they should have a clear understanding that reviewing the code is their responsibility. I have seen cases when multiple reviewers are assigned and all of them wait for another guy to do the review.
Multiple reviewers are fine only as long all of them have a clear understand of the responsibilities.
3) As much as possible should be automated, but not more. I have seen more then once developers making changes just to satisfy sonar rules they do not understand or embrace.
And yes, one more vote for conventional comments!
Some comments may only be visible to logged-in visitors. Sign in to view all comments.