Pull Requests, from now on referred to as PRs within this article. That unsexy but necessary step that sits between your wonderful shiny new commit and your master branch. PRs are unsexy because we actually know that our code is neat and tidy, and we don't want those evil code reviewers to come up with all those useless and nasty comments about our beloved codebase, or do we?
Of course, we do. Code reviews are not evil, in general :) Code reviews are a way to spread knowledge across the team and also help to get several extra pair of eyes into your code. Some people may argue that pairing is an alternative to a code review. And that is right, as long as you are good with only one person of the team knowing about your piece of code. And yet, you might pair with someone for a few hours and then that person can still come up with something totally different to what you worked initially in. So, code reviews are an opportunity to make sure everyone is on the same page and also to get quality feedback for your work.
Pull requests' quality is often overlooked. A badly written pull request can not only cause friction when it comes to code review ( arguments, fights, time wasted and ultimately burn down), but it also decreases productivity and increases the chances to introduce issues in production. Clean and tidy Pull Requests contribute to a faster pipeline, quicker code review cycles and increase the overall quality of our product.
So, here are some tips on how to write effective Pull Requests ( Note that the samples in here are totally fictitious).
This is a big one for me. Probably the most important. If you have to pick one tip, pick this one.
All of us, are busy people. Right? We have tons of stuff to do. Then, have you ever received a Pull Request that is changing code (i.e. not a trivial typo-fixing type of thing) and has no tests? I have. Plenty of times. And these days, my reaction to those is pretty much the same: "Please, can you include some tests that prove that the change you are doing is indeed verified".
Why am I being so harsh? Well, really because, with no tests, I am actually forced into doing a big effort trying not only to understand the logic but also to figure out any minor thing and dependency that could actually break the code. This might happen especially if you are the subject expert in some area. In such case, there is a considerable chance that you are always going to get these type of PRs where the submitter actually just wants you to verify the overall approach and the source code before even bothering to write any tests for it. Needless to say that this is bad. Quite bad. Don't take me wrong, there is nothing bad on asking for help, but there are better ways to do it like asking for a pair programming session or doing an unofficial code review in pair. That way, the reviewer's time will be used more effectively than with an out of the blue totally asynchronous PR.
By writing tests along with your PR you are providing the reviewer actual proof that you have gone through the exercise of testing the code yourself. The tests you write should go beyond the usual happy-path scenarios. Again, by providing negative testing, you are giving proof that your homework is done and that you have been thinking about what might happen when things go wrong. Tests will also help you with your logic and they will surface hidden issues that might go unnoticed.
Most importantly, tests will provide not only proof but also great documentation for your reviewer about what you are trying to prove and why you think you are indeed proving it. And both things, the why and the how provide fundamental information to decide whether the PR is valid or whether it might be incomplete or just missing some part or be missing entirely the point.
This sounds kind of trivial but it's actually common and overlooked a lot of times. Especially when working on a single-person project when you are rushing and don't put that much attention on the details. But again, if you think about the future, chances are people will join later and if you don't put much attention on the titles then you end up with things like this:
Can you tell what I was doing there? Well, you can tell I was refactoring some code, doing some bug fixing, but you have absolutely no clue about what the changes were. Now, that is a fictitious example. But, as a general practice, something like that implies having to click on each and every PR to figure out what is going on. Imagine that you are new to the project. Exhausting.
A title needs to be short, concise but at the same, it needs to communicate the overall purpose of the change. If you cannot summarize your PR in a single title, then it's very likely that your change is doing too many things and at that point, you should reconsider whether you should be splitting your work into multiple PRs.
PR's descriptions should be meaningful. Bear in mind, that after the PR title, the next thing the reviewer will see is the description. A good PR description should tell us:
. What you are trying to do.
. How you are doing it.
. Why you decided to do it that way.
. How you are supporting this PR.
For example, compare these:
With this one below:
So whereas on the first examples there was no description, or some cropped description from the PR title or a totally useless description pointing to some demo happening in the future, here, the engineer gives some good context about the problem, explains why this PR was needed and why it has been done that way. And finally, it also provides some information about why the PR is safe and the tests attempted. Probably not perfect, but certainly more useful and explanatory than the other examples.
Who does not fear the mythical 100+ files PR? I've done some of those and I'm not proud of it. Huge PRs are usually the result of long-lived branches, big refactorings, heavy new functionality or simply could come from extremely creative contributors that have that ability to create complex code and tests very quickly. However, none of those are real reasons justifying huge PRs. Even if we are fortunate enough and have highly skilled engineers that can create code as fast as a kid can devour a candies box, that does not mean our reviewers are going to have that same mental ability to process PRs with hundreds of modifications.
On the other hand, if every code change is going to be a single PR, then that puts a lot of burden in the code and PR review process and might impact development with constant interruptions. Unlike with huge PRs, there are occasions in which there is no other choice than having a tiny PR as maybe we need to change just one file and very little of it. That's totally fine. What we should not do is, for example, to create five different PRs with one file each for some functionality that is cohesive enough and logically related and hence could fall within the same PR.
I will not set an ideal PR size as I don't think there is one, but I believe as soon as we go over 20 files it starts to be complex for a reviewer to actually have a mental map with all the changes and consequences that these changes carry over. Having tests alleviates this issue, but before sending a PR that is starting to feel large then we should ask ourselves if we would be comfortable reviewing that many files or not. In case we feel that it would be too much for ourselves then we should be considerate with our reviewers and divide the PR in smaller related chunks.
Single Responsibility Principle is in some sense also applicable to PRs. Just as you want a module, class or function to have and to encapsulate a single responsibility, you want your PR to encapsulate and own a set of logically interrelated and encapsulated changes. In other words, you don't want to have a PR containing multiple logical changes that are not related.
One typical problem with merging non-related changes in the same PR is that if the code review takes longer than expected then all the functionalities that are enclosed within the PR will get blocked. So, imagine an urgent bug fix committed along with a non-related PR that has several parts that need review. The important bug fix will get blocked while the PR is fixed. Whereas if you submit two separate PRs then the quick and easy bug fix can proceed to the master branch while the longer and more conflictive PR gets reviewed.
Typically this issue might happen, many times involuntarily, when you are always working from the master branch on your fork. That is perfectly fine if you are going to deliver sequential changes. But as soon as you foresee multiple parallel changes then you should consider switching to temporal local branches on your fork which will allow you to submit multiple independent PRs.
Changes like new lines added or removed, spaces, tabs, etc. add which don't change the file at all, are just adding noise to your PR and make it look longer than it is. You should avoid this as such changes contribute to your PR to look messier and careless. And you are not careless, are you? Honestly, when I get a review where I see those patterns my internal self becomes immediately wary about the whole PR. I don't know why. It's probably one of those psychological patterns, like those that explain that the food that looks neat on a plate tastes better. In a similar way, PRs that look bad from their cleanliness look worse overall and raise alerts.
Similar to the above point, at least certainly in the psychological aspect, you should avoid doing code formatting changes in your PRs. If you need to do changes in formatting then open a PR solely for that purpose and discuss it with your peers. Formatting changes like replacing tabs with spaces, or replacing spaces with tabs, or changing brackets positioning, or splitting a line into two or two lines into one, all contribute to diverting the attention from the reviewer into topics that are not related with the PR itself.
Guess what changed above? Yes, you guessed it right. Nothing, or almost nothing, or certainly nothing relevant.
Remember, the reviewer is likely already doing a good mental effort to follow up with your PR and trying to understand what you are trying to do. The less you want to do is to add confusion by introducing irrelevant style changes.
Communication is key for software engineers. I cannot say this often enough. And while I totally agree with clean and self-documented code principles there are many times that naming and structure are not enough and the engineer needs to provide a few lines explaining what the code is trying to do or why it was done that way.
Again, you need to put yourself into the skin of the reviewer and think: "If someone gives me this same code to review, would I burst into tears?", "Would I be able to understand this if I had absolutely no context of what is being done?". We probably don't have to go to extremes like "a toddler should understand it" but hopefully you get the idea.
The easier and the more effective we can communicate what we are doing, the quicker the PR will be reviewed and the quicker mistakes will be detected. So actually spending some time trying to communicate on an effective manner will have a huge return one way or the other. Needless to say, after this step the code ends documented, which is a huge bonus when new hires come in. Everyone loves an elegant and nicely documented piece of code.
Most source code management systems like Github allow you to review your own PRs. You can take advantage of this to provide feedback in advance. That is, information going beyond the PR description and that might be contextual to the task itself, or may be specific to certain files, or might be just general context, external comments, history, etc.
For example, comments like "This is the asynchronous task executor approach we discussed in the Start of Sprint the other day", or "The metrics aggregator thread that Joe suggested is implemented on this file", should not be in the code itself and might be too contextual to be on the description. However, adding these as comments can provide important and useful context for the review and we can put them just as our own code review attached either to the top of the PR itself or to individual files.
Some tools understand and can link pull requests with the existing tracking system. For example, in JIRA if you include the ticket id within the PR title then JIRA will auto include all the PR information within the ticket itself. This is extremely handy to track the different tasks and map them with actual engineering work.
Most of the project management tools have some sort of integration with Git, so go ahead and make use of it!
Bonus: If you integrate your source code management tool ( e.g. Github ), with your collaboration tool (e.g. Slack, Webex Teams, Microsoft Teams...), and your PRs are tidy, then you get a wonderful team space containing all the code changes ready for your team to review, or to just be aware of, or just to revisit back after a few months to check something in the past. All for free, as long as your PRs are treated well.
A PR can be a very important source of documentation when created in a responsible manner. It provides a meaningful description of what you are trying to do. It does contain a set of changes that are meant to achieve your task and it includes a set of tests that verify that indeed the requirements are met.
When done right, existing engineers or new hires can go easily through the list of closed PRs for a nicely historical segmented view of the most notable changes that have been introduced gradually into the project by the team or by an individual person over the last few months or years.
However, for being valuable, Pull Requests really need to be as good and tidy as our code is meant to be.
Hopefully, the tips above will help you with that. Thanks for reading.