I used to think that I didn’t need comments if I wrote self-documenting code. However, I have realized that I do write comments, and that I find them really useful. To see how many comments I write, and what kind they are, I wrote a script to analyze my git commits from the last six years. In total, seven percent of my committed lines contained a comment. This blog post has details on what constitutes good and bad comments, as well as more statistics from my script.
Part of the reason I was skeptical of comments was the prevalence of Javadoc-style comments. This style of commenting exists in other languages as well. Here is an example in Python that I just made up, but which is representative of this style:
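A minimal sketch of the style follows. The function, its parameters, and the dict-like `database` are hypothetical and chosen only for illustration:

```python
def get_user_name(user_id, database):
    """
    Gets the user name.

    :param user_id: The id of the user.
    :param database: The database to use.
    :return: The name of the user.
    """
    # Hypothetical lookup: `database` is assumed to map user ids to records.
    return database[user_id]["name"]
```

Every line of the docstring restates something the signature already says.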
The problem with most of these comments is that they convey very little information. Often it is just a repetition of the method name and parameter names in a few more words. These comments may be useful for APIs exposed externally, but in an application where you have access to all the source code, they are mostly useless. If you wonder what the method does, or what the valid input range for a parameter is, you are better off just reading the code to see what it does. These types of comments take up a lot of space without providing much value.
Instead of writing Javadoc comments, you are better off making the best possible use of method names and variable names. Each name you use can help explain what the computation is about and how it is done. One good reason for writing many small methods instead of one large method is that you have more places to use descriptive names, something I have written about elsewhere.
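A small sketch of the idea, using a made-up invoice domain (all names and data here are hypothetical). Splitting out a tiny helper gives the condition a name, so the calling code reads almost like a sentence:

```python
from datetime import date


def is_overdue(invoice, today):
    # The helper's name carries the meaning of the condition below.
    return not invoice["paid"] and invoice["due_date"] < today


def total_overdue_amount(invoices, today):
    # Sum the amounts of all invoices that are overdue as of `today`.
    return sum(invoice["amount"] for invoice in invoices if is_overdue(invoice, today))


# Illustrative data only.
invoices = [
    {"amount": 100, "due_date": date(2024, 1, 10), "paid": False},
    {"amount": 250, "due_date": date(2024, 3, 1), "paid": False},
    {"amount": 80, "due_date": date(2024, 1, 5), "paid": True},
]
overdue = total_overdue_amount(invoices, today=date(2024, 2, 1))
```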
Writing self-documenting code will get you a long way. However, there are times when having more information is useful, as with the comment on the use of dialling zones in the code below:
Here is another example:
Often the advice given is “Comment the Why, not the What”. While this probably covers most of my comments, it is not how I think about when to comment. Instead, I tend to write a comment when there is something particularly tricky, either in the domain or in how the implementation is done.
The standard advice from the “no comments are needed” crowd (which I used to belong to) is to rewrite the code so you don’t need to comment it. However, this is not always possible. Sometimes the domain is just too complex. Sometimes the effort to rewrite the code would be too much compared to adding a comment.
Another complaint about comments is that they will get out of sync with the code, thus hindering your understanding of the code rather than helping it. While this sometimes happens, it has not been a big problem for me. In almost all cases I analyzed, the comments were still valid. They have also been very useful. Every time I came across one of my comments like these, I was happy I wrote it. It doesn’t take long to forget some of the details and nuances of the problem you are solving, and having the comment there with some extra context and explanation has been great.
Sometimes you get a comment “for free” if you are logging an explanatory message. In the example below, the log statement explains what has happened, so there is no need for a comment.
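A sketch of what this can look like with Python's standard `logging` module. The function, the fallback scenario, and the message text are hypothetical:

```python
import logging

logger = logging.getLogger(__name__)


def parse_port(value, default=8080):
    """Return `value` as an integer port, or `default` if it cannot be parsed."""
    try:
        return int(value)
    except (TypeError, ValueError):
        # The log message doubles as the explanation of this branch.
        logger.warning(
            "Invalid port %r in config, falling back to default %d", value, default
        )
        return default
```

Anyone reading the `except` branch learns from the message itself both what went wrong and what the code does about it.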
When I first thought about checking how many comments all of my commits contained, I thought a one-liner like this would be enough to find the comments in all my Python commits (I only comment using #):
git log --author=Henrik -p | grep '^+[^+]' | grep '#' | wc -l
However, I soon realized that I wanted more details. I wanted to differentiate between end-of-line comments and whole-line comments. I also wanted to find out how many “comment blocks” (consecutive lines of comments) I had. I also decided to exclude test files from the analysis. Finally, I wanted to be sure to exclude any commented-out code that happened to be there (there were unfortunately a few such cases). In the end I wrote a Python script to do the analysis. The input for the script was the output of:
git log --author=Henrik -p .
From the output I saw that 1299 out of 17817 added lines of mine contained comments (about 7.3 percent). There were 161 end-of-line comments and 464 single-line comments. The longest comment block was 11 lines, and there were 96 comment blocks of 3 or more consecutive lines.
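The script itself is not shown here. As a rough sketch of how such an analysis could work (this is an illustration with made-up names, not the actual script), the function below walks the text of a `git log -p` dump, keeps only added lines in non-test `.py` files, and classifies them. It is deliberately naive: a `#` inside a string literal is counted as a comment, commented-out code is not detected, and test files are recognised only by the substring "test" in the path.

```python
def analyze_diff(diff_text):
    """Count comment lines among added lines in a `git log -p` dump (naive sketch)."""
    stats = {"added": 0, "whole_line": 0, "end_of_line": 0, "longest_block": 0}
    in_python_file = False
    block = 0  # length of the current run of consecutive whole-line comments

    for line in diff_text.splitlines():
        # "+++ b/path" names the file that the following hunks belong to.
        if line.startswith("+++ "):
            path = line[4:]
            in_python_file = path.endswith(".py") and "test" not in path
            block = 0
            continue

        # Anything that is not an added line in a tracked file ends a block.
        if not in_python_file or not line.startswith("+"):
            block = 0
            continue

        stats["added"] += 1
        code = line[1:].strip()

        if code.startswith("#"):
            stats["whole_line"] += 1
            block += 1
            stats["longest_block"] = max(stats["longest_block"], block)
        else:
            if "#" in code:
                # Assumption: any other '#' marks an end-of-line comment.
                stats["end_of_line"] += 1
            block = 0

    return stats
```

The output of the git command could be piped into a file and passed to `analyze_diff` as a string. The share of comment lines is then `(whole_line + end_of_line) / added`.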
I used to think that writing well-named functions would mean no comments were needed. However, looking at what I actually did, I noticed that I tended to add comments in tricky or unintuitive parts of the code. Every time I come back to those parts of the program, I am happy I made the effort to add a comment – they have been very helpful!