To Comment Or Not To Comment?

Jason C. McDonald (codemouse92) ・ 12 min read

Commenting. It's one of the more controversial points of code style. There's no lack of opinions on the topic, but there's very little uncertainty about the validity of those opinions.

I've been an advocate for formal commenting practices for a good portion of my career, even authoring the Commenting Showing Intent standard. I've heard countless arguments for and against comments, and no end of disastrous commenting policies from both camps. I've even had a few misadventures of my own, bringing a few hard-won lessons.

Whether you believe in commenting everything or nothing, sparsely or densely, just doc-comments or the whole enchilada, I think you'll find some interesting perspectives herein.

The Purpose of Comments

Before we take another step, I want to get everyone on the same page regarding why we comment: comments should reflect intention.

Contrived examples make it hard to show this, because real software is exponentially more complex than any code snippet we can cook up. (Believe me, I've tried to come up with examples for years.) So, you're going to have to imagine this along with me:

Key Point: Any comment should describe what you intend the goal or outcome of a section of code to be. This is true, regardless of the frequency or density of your comments. Comments should never restate the functionality of the code.

There's a lot of subtlety here. I'd consider "read all the lines from the file" to be a reasonably useful intent-comment, at least in most languages. But how about "write all the lines to the file"? In many languages, such as Python, that's practically a restatement of functionality, yet it is also the intent.

Worse yet, how about "print the results of the calculation"? In any language even as human-readable as C or FORTRAN, that's painfully obvious for a one-liner. However, given a forty-six line block of string formatting and print statements, that might be useful.

I believe this is why Commenting Showing Intent has been as useful as its adopters have reported...

[The goal is to allow] for the complete rewriting of a program in any language, given only the comments. (emphasis mine)

Key Point: I believe a good litmus test is to imagine removing everything but the comments. Do you have enough information to completely rewrite the program? Furthermore, are the comments language-agnostic enough that they'd be equally useful if you switched from, say, Python to Java?

If the comments are too vague to recreate the program, you didn't comment enough. If the comments are too language-specific, you've merely restated the code, not the intent.
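If you want to try the litmus test on real code, here is a minimal sketch that strips a Python source string down to its comments using the standard library's tokenize module. The helper name extract_comments, the sample source, and the file name inside it are assumptions made up for illustration; none of this is part of the Commenting Showing Intent standard.

```python
import io
import tokenize


def extract_comments(source: str) -> list[str]:
    """Return the text of every comment in `source`, in order of appearance."""
    comments = []
    # tokenize distinguishes real comments from '#' characters inside strings.
    for token in tokenize.generate_tokens(io.StringIO(source).readline):
        if token.type == tokenize.COMMENT:
            # Drop the leading '#' and any surrounding whitespace.
            comments.append(token.string.lstrip("#").strip())
    return comments


# Hypothetical sample; "contacts.csv" and the "opt-in" marker are assumptions.
SAMPLE = '''\
# Read all the lines from the contact book export.
with open("contacts.csv") as source:
    lines = source.readlines()

# Keep only the contacts who agreed to receive email.
allowed = [line for line in lines if "opt-in" in line]
'''

for comment in extract_comments(SAMPLE):
    print(comment)
```

Reading only the printed lines, ask the two questions above: could you rebuild the program, and would the text still make sense if the code were written in another language?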

To put that another way, comments should be a living form of your specification! We all know we should have a spec, but very few of us actually bother to write one. You can look at intent commenting as a means of writing your spec while coding.

Advantages to Intent Commenting

In practice, intent commenting has several advantages. These aren't hypothetical by any means; I've personally observed these, and many have been echoed by adopters of Commenting Showing Intent.

  1. The time to read and understand code is greatly reduced. Because you know what is supposed to happen, it provides you a basis for interpreting that section of code. (This can come in handy when you are trying to get back on track after an ill-timed interruption.)

  2. Finding a specific, functional section of code becomes easier. Ever tried to find the code for Feature X in a large, unfamiliar project? Not only do you have to wade through hundreds or thousands of files, you need to spend considerable time reading them (see #1); search tools only help if you can guess the function name(s). Intent comments give you footholds.

  3. You can reconstruct developer thought patterns. We can waste considerable time asking "what the heck did she/he/I mean to do here?" A well-placed intent comment answers that question.

  4. Mismatches between commented intent and actual code outcome highlight potential bugs. We've caught many bugs in code review purely because the intent comment mismatched what the code was really doing.

  5. Intent commenting unfamiliar code speeds up grokking. In practice, if I'm learning, refactoring, or cleaning a code base, I will start by intent-commenting the whole thing. By starting here, I ensure complete understanding of the code, and often catch many inefficiencies. I also learn a lot in the process!

  6. Non-programmers are empowered to be involved. Your supervisor, client, designer, open source user, whoever, will be able to skim through your comments and actually understand more of what's going on. Because intent-comments are language-agnostic, they provide a basis for actual, useful feedback! (Results may vary, depending on the individual.)

  7. The code automatically becomes a training tool. I often direct interns to read specific intent-commented sections of the code. This allows them to study even advanced and highly complicated patterns and algorithms in the wild. I've often been told what an invaluable learning experience this is.
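Point 4 in the list above is the easiest one to show in miniature. The following is a contrived sketch, not code from any real project: the function names and the "0 means unrated" convention are assumptions. The first version is deliberately wrong, and the only reason a reviewer can tell is that the intent comment says something the code doesn't do.

```python
def average_rating_buggy(ratings: list[int]) -> float:
    # Average only the items that were actually rated. A rating of 0 means
    # "not rated yet" and must not drag the average down.
    rated = [r for r in ratings if r >= 0]  # Mismatch: this keeps the zeros.
    return sum(rated) / len(rated) if rated else 0.0


def average_rating(ratings: list[int]) -> float:
    # Average only the items that were actually rated. A rating of 0 means
    # "not rated yet" and must not drag the average down.
    rated = [r for r in ratings if r > 0]
    return sum(rated) / len(rated) if rated else 0.0


print(average_rating_buggy([4, 0, 2]))  # The unrated item is counted.
print(average_rating([4, 0, 2]))        # The unrated item is ignored.
```

Without the comment, `r >= 0` looks like a perfectly reasonable filter. With it, the comparison contradicts the stated goal, and that contradiction is what gets flagged in review.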

I understand there are many concerns surrounding commenting, especially something as seemingly extreme as intent commenting! Let's take a look at a few now.

The Myth of Fully Self-Commenting Code

"Your C++ is showing, Mr. McDonald," you say. "Comments may be well and good in clunky, esoteric languages like that, but [language] is self-commenting! Comments are just line noise."

I've heard almost exactly that among Python developers. Don't forget, I spend considerable time in Python, and I love how clear and obvious the code is.

Yet, no matter how English-like the syntax is, code can never actually be fully self-commenting, because it can never capture your intention.

In reading Python, I can always tell that you're iterating through the lines of a file and extracting a particular substring. What I can't necessarily figure out is why.

"But it's so obvious! I'm extracting the email address from a contact book CSV."

Sure, it's obvious to you, but that makes several errant assumptions:

  • You assume I know you have a need for that information.

  • You assume I'm reading the code in stack-execution order.

  • You assume I'm oriented within your code base, and not just dropped in at a random location; ergo, I'm not looking for something in unfamiliar territory.

  • You assume I possess identical technical knowledge to you.

It's practically impossible to avoid making these assumptions, because they flow from a typical gestalt of human psychology: we assume others share our experience.
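To make the contact book example concrete, here is a small sketch of what the "why" might look like once it's written down. Everything specific in it is an assumption for illustration: the column names, the newsletter opt-in rule, and the function name are invented, since the original scenario never says what the email addresses are for.

```python
import csv
import io

# Hypothetical contact book export; the column names are assumptions.
CONTACT_BOOK_CSV = """name,email,newsletter
Ada,ada@example.com,yes
Grace,grace@example.com,no
"""


def newsletter_recipients(csv_text: str) -> list[str]:
    # Build the mailing list for the monthly newsletter. Only contacts who
    # opted in may be emailed, so everyone else is left out.
    recipients = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        if row["newsletter"] == "yes":
            recipients.append(row["email"])
    return recipients


print(newsletter_recipients(CONTACT_BOOK_CSV))
```

The loop itself tells a reader that email addresses are being collected from rows. Only the comment tells them what the list is for and why some contacts are skipped, which is exactly the knowledge the four assumptions above take for granted.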

Truth is, you may well be the confused one in your own code six months later. Unless you possess a nearly eidetic memory, you won't remember every detail of your own thought process. For that matter, you may not remember in six minutes, after your co-worker interrupts your workflow to ask if he can borrow your stapler.

If you doubt what I'm saying is true, try this:

  1. Go find a random, active, production-quality project on GitHub, in your favorite self-commenting language. Make sure it's unfamiliar to you.

  2. Open a random file of code deep in the structure.

  3. Time yourself on how long it takes you to completely grok what's happening in the first three functions. Could you write a clean-room implementation such that it would pass any tests the original functions could?

You can repeat that exercise as many times as you like. No matter how experienced you are in any given language, it is difficult, if not impossible, to completely recreate intent from syntax alone.

Once again, comments should be declaring intent, not just restating the code itself. This is the critical difference.

I'll be coming back to this topic shortly, because self-commenting code is an important concept which is distinct from intent-commenting!

No Comment?

There are some developers who ardently advocate commenting as little as possible. I've talked to many over the years, and they all make the same points. Trust me when I say, their concerns aren't unfounded in the slightest; rather, I think they've misidentified very real problems.

Concern 1: Outdated Comments

"Comments fall out of sync with the code," say most detractors.

I'd argue that comments only fall out of sync if you let them. The exact same complaint could be made about documentation, but we don't make it; updating documentation is simply expected.

My software company has been enforcing commenting standards for years, and we've never had this problem for one reason only: coders are expected to update comments to reflect code changes. It won't pass code review otherwise.

I don't want to sound rude to anybody, but if comments are falling out of sync with the code, that only indicates developer laziness. It's the same reason we don't write comments to begin with. We figure that, since we understand it right now, that's all that matters.

If your team has this problem, you need to address the quality standards they have for their work. If you're working alone and have this problem, you need to build self-discipline. It's no different from style compliance, responsible UI design, or writing good commit messages: it requires effort, but the discipline pays off.

Concern 2: Line Noise

"Comments add length to the code file," some complain. "I'd rather have 100 lines of Ruby without comments than 200 lines with."

By itself, length should never be a factor in coding practice. Obsession with keeping line counts low is at fault for over-complicated list comprehensions, cerebral ternary expressions, and semicolon abuse.

Yet, as I said, there is a valid concern lurking herein: comments should add value. Every comment should describe intent, not just restate the code. This is a balancing act, and everyone will undoubtedly miss the mark sooner or later, but remember: comments can be refactored just as well as code can!

When first commenting your code, your top priority should be overcoming your "obvious" gestalt. I literally recommend over-commenting on the first pass, because you are in no way qualified to distinguish between "obvious" and "non-obvious" the first time through.

We could actually apply an old industry adage to this situation...

"Premature optimization is the root of all evil." -Donald Knuth

Once the code is written, commented, tested, and functional, let it "age". Move onto other things. Give your brain time to distance itself from your original intent, or else find another developer who hasn't yet read your code, and then...

Clean up! Re-read the code with the intention of refactoring the comments. Lean towards restating comments more than removing them; keep them intent-focused and language-agnostic. If a comment is unsalvageably obvious, delete it.

Remember, your goal is always a set of comments that allow one to completely recreate the code in another language. It's a living specification.
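As a rough illustration of refactoring a comment rather than deleting it, consider this contrived sketch. The data and variable names are made up; the point is only the difference between the first-pass comment and its rewrite.

```python
# Hypothetical data for illustration.
users = [{"id": 7, "name": "Ada"}, {"id": 9, "name": "Grace"}]

# First pass (language-specific, restates the code):
#   Use a dict comprehension to map each id to its user dict.
#
# After refactoring the comment (intent-focused, language-agnostic):
# Index users by ID so later lookups don't have to scan the whole list.
users_by_id = {user["id"]: user for user in users}

print(users_by_id[9]["name"])
```

The first-pass comment would be meaningless in a language without dict comprehensions. The rewritten one would survive a port to Java untouched.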

Concern 3: Time

"Writing all those comments wastes a tremendous amount of time," many argue. "I'd rather focus on adding value to the project."

Once again, this is a common glitch in human psychology: instant gratification. We worry so much about "losing" time, preferring to feel like we're making progress, that we skip important things. Our modern "get it now" culture has only made this worse. It's short-term thinking at its worst.

Productivity experts all agree that taking extra time on positive habits actually saves time in the long run, things like making a prioritized to-do list, sitting down for a real breakfast, exercising, or de-cluttering our workspace. If we're not in the habit, it can feel strange, even counter-productive, to spend some of our valuable, scarce time on anything "not work."

Commenting really isn't any different. By taking the extra time to write down my intentions as I code, I actually see several productivity gains versus when I don't:

  • By offloading my working memory into comments, I lose track of what I'm doing far less often.

  • I catch potential bugs and inefficiencies more often while coding.

  • I can be away from code for several days, weeks, or months, and jump back in where I left off after only a couple minutes of reviewing my comments. Without those comments, it can take me an hour or more; sometimes, I even have to scrap everything and start over!

  • Code reviews take less time away from other people, and they zero in on far more errors and potential improvements.

  • I waste less time explaining my code to others; colleagues, students, and non-coders alike can figure out what's going on without my having to be directly involved every step of the way.

In practice, intent-commenting saves me considerably more time than it takes! It's a true investment in the future: an extra 10-20 minutes now saves me literal hours later.

Information Overload

For this to work, we need to be deliberate about keeping actual line noise down!

Don't Restate Code

First, let me stress yet again: comments are ONLY FOR STATING INTENT. I cannot repeat this enough times! You should NEVER be restating the actual code in comments! I never ever want to see this...

# Print out 'hello, <user>'
print(f"Hello, {user}!")

Depending on the nature of the surrounding code, this may be helpful, however:

# Greet the user.
print(f"Hello, {user}!")

I can actually do something with that information now, even if it's only searching for the greeting. (Once again, contrived examples are terrible for demonstrating this. Roll with it.)

Don't "Comment Out"

Second, never commit "commented-out" code. This is true line noise, and it makes it harder to visually parse out intent-comments. As it is, you should be using a VCS such as Git, and that already gives you a means of seeing earlier versions of your code.

The Commenting Showing Intent standard does define a formal syntax for "commented-out" code, which serves to help distinguish between that and real comments. This is only supposed to be temporary (pre-commit), however. Before you commit, remove all "commented-out" code.

Respect Documentation Comments

Documentation comments should be part of intent commenting, not a separate standard altogether. Most modern languages have formal syntax for documentation comments, and you should employ these as part of your intent-comment practices.

A good documentation comment should state intent anyway! Too many API docs require the reader to actually understand the implementation, which defeats the entire concept.

By applying intent-commenting to my documentation comments, I've even made my own IDE experience that much nicer!
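Here is one way an intent-focused documentation comment might look in Python, using an ordinary docstring. The function, its parameters, and the retry scenario are all hypothetical and chosen only for illustration.

```python
def retry_delay(attempt: int, base_seconds: float = 0.5,
                cap_seconds: float = 30.0) -> float:
    """Return how long to wait before retrying a failed request.

    The wait grows with each failed attempt so that a struggling server
    gets room to recover, but it never exceeds `cap_seconds`, so the
    caller is never left waiting indefinitely.

    `attempt` counts from 0 for the first retry.
    """
    return min(cap_seconds, base_seconds * 2 ** attempt)


print(retry_delay(0))
print(retry_delay(3))
print(retry_delay(10))
```

Notice that the docstring never mentions doubling or `min()`. A caller can decide whether this function suits their needs without reading the body, and the same text is what an IDE will typically show in its hover or autocomplete help.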

As a side note, please remember that API docs do not replace end-user docs. Most API documentation is like teaching someone how to use a toaster by explaining the electrical specifications of the heating element. Write real documentation for your users, please.

Write Readable Code

Your code should still clearly present its own "what". Intent commenting is not an excuse to write esoteric, over-complicated code. In other words, you should still write self-commenting code!

As I said before, self-commenting code and intent-commenting are distinct concepts, and they're both important. Paired with one another, the result is incredible!

Even while you intent-comment your code, the following rules still apply:

  • Variables and functions should have clear names that reflect their purpose. Relevant documentation intent-comments should only expand on this. foo and t are terrible names, no matter how much you comment.

  • Don't confuse complexity for elegance. Just because you can write an awe-inspiring list comprehension or ternary statement doesn't mean you should. No amount of commenting can redeem spaghetti code.

  • Don't let your line lengths run away. Horizontal scrollbars are inexcusable atrocities.

  • Use consistent good style. I really don't care what it is; just pick a coding style, and stick to it.

  • Employ principles like DRY and SOLID in a way that improves, rather than detracts from, your code's readability. There is such a thing as DRY spaghetti.
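The first two rules can be sketched with a contrived pair of functions that do the same thing. The names, the inventory data, and the "negative count means recount" rule are assumptions for illustration only.

```python
# Hard to read: cryptic names and a nested ternary packed into one line.
def lbl(t):
    return [n + (": in stock" if q > 0 else ": sold out" if q == 0 else ": recount") for n, q in t]


# Easier to read: clear names, plain control flow, and one intent comment.
def stock_labels(inventory: list[tuple[str, int]]) -> list[str]:
    # Build the label shown next to each product. A negative count can only
    # come from a bookkeeping error, so flag it for a recount instead of
    # showing the product as sold out.
    labels = []
    for name, quantity in inventory:
        if quantity > 0:
            status = "in stock"
        elif quantity == 0:
            status = "sold out"
        else:
            status = "recount"
        labels.append(f"{name}: {status}")
    return labels


inventory = [("tea", 3), ("mugs", 0), ("spoons", -1)]
print(stock_labels(inventory))
```

Both functions return the same list. The second is longer, but the code now carries its own "what", which leaves the comment free to carry only the "why".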

Final Thoughts

I hope I've made a clear case for intent commenting. The Commenting Showing Intent standard is a great place to start, but it certainly isn't the only valid approach to this.

Let's recap the key concepts:

  • Comments should describe the intended outcome of the code in a language-agnostic fashion. They should not restate the code itself.

  • One should be able to completely re-implement the code in any language, given only the comments. In other words, your intent comments are a living specification!

  • Self-commenting code and intent commenting are distinct concepts which should be employed together. One cannot replace the other.

  • Comments only fall "out of sync" with the code if you let them. Code review policies can be used to guard against this!

  • When first commenting, lean towards "over-commenting", and refactor the comments at a later time. You won't be in a position to discern between "obvious" and "non-obvious" until you've had meaningful time away from the code. ("Premature optimization is the root of all evil.")

  • Don't commit "commented-out" code; it adds line noise, and makes it harder to visually parse out real comments.

  • Employ intent commenting in your documentation comments.

  • Intent commenting is a time investment which improves productivity and code quality, and which reduces the time it takes a developer to acclimate to the code.

As I've said, I've actually employed intent commenting myself and with my staff. I can personally vouch for the payoff, and I've heard from other adopters of the Commenting Showing Intent standard who echo my observations.

Comment your code. You'll thank yourself later.

Discussion

There's the old saw that you should "code as if the guy who ends up maintaining your code will be a violent psychopath who knows where you live". That is, don't make him angry by writing really clever code that he can't understand.

Well, comments are part of code. So don't comment in such a way as to piss him off either. So don't assume he's an idiot, that will just annoy him, and don't assume he's really clever and miss out important information that he will need, and for which he will be willing to kneecap your children.

Alternatively, if you find it hard to get into the mindset of a violent psychopath, comment for an audience of yourself on the day you were first exposed to the project you're working on.

 

I comment for (1) the day when I'll come to work with a foggy brain and have to check my code to debug or reverse-benchmark something, and (2) whoever ends up covering for me.

 

Very good way of thinking about it.

 

One should be able to completely re-implement the code in any language, given only the comments.

Replace comments by functions' names and I'd be on the same page.

Don't let your line lengths run away. Horizontal scrollbars are inexcusable atrocities.

What do you suggest? 80?

 

The exact line length really depends on things like language, editor, etc. Modern screens afford us more room to work with than back when "80" was the standard, even if your editor is only taking up half the window. I find 120 is usually good to start.

Also, as I said in the article, don't confuse the purpose of self-commenting code (function names being part of that) with comments. They work together, but one doesn't replace the other. Attempting to pack that much information just into function names results in those 60+-character atrocities Facebook source code is infamous for, and at that point, readability went out the window.

 

Well, if you really need 60+ characters to describe the intent of your function, you should try to split it up. Yes, sometimes it is not possible, and THIS is one of the rare cases where I would use a comment.
Functions like write(all_lines, output_file) or extract_email(csv_contact_book) have all the advantages of commenting the intent, without the disadvantages. Even in C++.

I don't know that those fully imply their intent. Is your write() just writing out the text to a text-based file? (That might be one you could get away with...but then, you almost definitely won't be writing this function yourself anyway.)

More unclear, are you extracting a single email address from csv_contact_book, or are there criteria? Are you extracting a list of all the emails, or just those that match some specified or arbitrary filter? All of that is unstated and assumed here, and explicit is always better than implicit. You'd need extract_all_emails() just to get around that part, and already we're getting longer function names. Add two or three more criteria to the functionality - as production code almost certainly would - and you get one_of_those_horrendously_long_function_names_just_shoot_me_now().

Self-commenting code like this is good, even vital, but in my experience, it's rarely sufficient to describe the whole intent. It looks obvious to you, since you just wrote the function (or made up the examples), but not necessarily to another reader/coder/future self. And of course, real code is always a lot more snarly than comfy little examples. :)

This is why I recommend writing the intent-comment anyway. If someone else reading your code for the first time can objectively say "this comment is redundant", then and only then is it safe to refactor it out. One cannot reliably tell what's obvious and what isn't in their own fresh code.

As to the self-commenting code, do it. Never send a comment to do a name's job. But conversely, don't send a name to do a comment's job. The name is the WHAT, the comment is the WHY. Neither is qualified to compensate for the lack of the other.

 

And what people seem to forget is that people don't like to write more than necessary. So if someone writes a comment, they need it or think they will; nobody does it to do harm. I like commenting a lot, and before committing I check the comments and try to remove them. That usually shows me that a variable has the wrong name, something is doing something it shouldn't, some language concept isn't clear enough to me yet, etc. After this cleanup I naturally end up with very few comments, if any, and those are the ones I couldn't explain with the code, maybe because I don't understand something yet. But all you can do is the best you can do. Being perfect would be very boring.