This is one of the great discussions among developers: document or not document your code? Is it worth writing documentation in your code?
I thought this topic was completely overcome and it was clear that, except some few occasions (implementing a public API), documentation was not necessary. Until I saw a code review where the reviewer pointed out the lack of documentation as an issue. Really?
I was one of those who used to document my code... or at least I tried. I was so convinced that code had to be documented. As a backup or reminder for my future myself or any other developer luck enough to end up in my code. Although I always realised it was always out of date. And by then, I already wondered: what is the purpose of documenting the code if the documentation is always outdated?
Until several years ago I read the book Clean Code.
I saw it "crystal clear", there is no need to document your code if you code your documentation.
With this I mean to use meaningful variable and method names. If the name of the member already tells you the information that is keeping and the name of the method tells you what the method is doing you can end up reading the code without the need to figure out or document what your code is doing.
Extract as much code as you can to methods. Even if you end up having a method with only 3 or 4 lines. Each method should do one thing and only one thing. And the name must explain what it does.
Each member of a class must have a name that only reading it you know which information you can find there. Same for variables and input parameters.
Following this simple steps you can have a code you can read, having the documentation right in the same code.
Yes, I know, there are those times you have to implement code with a complex algorithm or you are copying a code you found on Internet which might be complex, you might not understand and you might not extract in simple and meaningful methods. Yes, there are always exception.
What do you think? Do you document or write the documentation in your code?
This post was originally published in Medium
Top comments (34)
Well, I think modern coders have confused the reason we should be documenting and commenting. My policy is pretty simple:
You should always be able to tell WHAT the code is and does by how you write it (your point.)
The WHY (the intent of the code) is rarely something one can figure out in the best of circumstances. This is where commenting comes in - but comment WHY, not WHAT.
Documentation, in the literal sense, should be hand-written for your end users. HOW do you use the library/application? I hate it when projects call API docs "documentation," and don't bother with anything else, as it is nigh impossible to learn to use an unfamiliar library from the API docs.
In short, we MUST do all three: code your documentation (WHAT), comment your code (WHY), and document your project (HOW).
Amen.
The pendulum swings. In college, I was taught to comment everything because "that's what industry will expect." Getting into industry was quite a shock, needless to say.
I heartily agree your code should be readable, and "say what it does."
Sometimes, it needs to do something....odd. The WHY for that should be a comment.
External documentation has its place, particularly for how to get someone else set-up on developing the module, and code examples for idioms.
I was about to write exactly the same post :)
The only thing is that it's extremely complex to teach that to youngsters. Code reviewing doc from juniors takes a lot of trips back and forth in the comments.
I guess that clear a clear doc comes with a clear vision of what you're doing.
I always try to use good naming conventions for variables, functions, classes, etc.., but I still believe in docstrings for almost every function/method/class. It takes almost no time, it can help you remain focused, and it will help others quickly see what is happening in the code. I avoid using comments within a function/method, unless it is unavoidably convoluted.
I completely agree with this. It makes it so much easier to add someone new to a project.
I feel strong opinions about this subject, I've seen the easiness and welcoming warmth that documented code gives to new people in the code base (and by new I don't mean people with no experience).
I use both, both prose and long and clear variable names that try to speak the intention. I can pick up the code rather easily but my experience is that people don't want to be reading code for the most part unless they absolutely have to.
So as a Python dev I felt that self documented code was the best... but after working in a totally unknown code base in a new language to me... not having prose documentation is hellish.
Do the right thing, try to make comments and variables as simple as possible but not any simpler and also keeping in mind that simplicity is hard and expensive to attain.
Hell sometimes prose and code don't even matter if people won't read either. At that point you even have to consider making video tutorials, talks, conferences, whatever you have to do just to get your point across.
Documentation is hard, but not because of the medium. Some people digest code, others prose, others talks, other classes and a mentor.
And our job is to consider every single one, that is if you want your code to survive.
I agree strongly, especially to the prose part. It's probably a thing of preference, but I just love it if I can jump through a codebase, gaining a good idea of what it does by only reading the comments and some really trivial lines of code.
I've seen what happens to comments in code: Nothing. And that's bad. The first coder writes a comment, yeah, great. Then someone makes a quick change to the code... But guess what, the comment doesn't get adjusted. And suddenly the code and the comment are telling two different stories. Happens all the time.
This is one of the reasons I also prefer the clean code approach: Let the code tell the story and restrict the code documentation to the "why". What a method does should be obvious, but the question will be, why it does that. What is the purpose? Why -12 and not -123? etc. The less redundant documentation inside your code, the better.
Of course, if we are talking about a public API or something, a documentation for the user (which can be in the form of many examples) is great to get started, no doubt about that. Also it's not a bad idea to document that structure of your project, the "big picture" to find your way around.
In your experience, do devs who discontinue the comments really have the discipline to continue the "clean code" paradigma?
Writing documentation is a pretty thankless job. Nobody will ever read what you write and no matter how much of it you write people will find something you did not document and complain about it. It's always somebody else that ought to document something. And somehow it is frequently implied that would be me.
So,my attitude to documentation is "by exception". Most code should be obvious and not in need of any documentation. But once in a while you do something non trivial and then it matters to document wtf. you were thinking. A simple link to a stackoverflow post to explain the totally non obvious workaround for some issue, the weird set of circumstances that caused you to add some null check or other condition that you chased down, etc. Those are the things that need documenting. All the rest is obvious.
I love little hints that outline the reasoning behind bits of non obvious hackery, algorithm choices, concurrency handling and other bits of code where somebody really went the extra effort to get things done the right way.
A problem with depending on your function names for documentation is that it makes your code rather verbose. (Anyone who has been a Java for dev for any length of time knows exactly what I'm talking about!) Following the logic of Clean Code, I suppose that
map
should have been namedapplyFunctionToAllElements
or something. Fortunately though that isn't the case, and after reading the documentation you can remember whatmap
means and be spared all that verbosity.I'm not against favoring code over comments though. If it's a private method that has a very specific purpose and isn't called from many places then I'm happy to not give it a comment header. As ever, "it depends".
Whenever I found naming hard, I realised had an incomplete idea of the domain and goal.
While I agree with the general gist of the article, I would live to address this part:
"Extract as much code as you can to methods. Even if you end up having a method with only 3 or 4 lines. Each method should do one thing and only one thing. "
I am not a fan of this part of Clean Code. It is too simplistic on it's advice given - extraction is a good tool, but giving blanket recommendations this way doesn't work.
Let me explain.
I understand where you are coming from, and I did the same for years, but I eventually concluded that my behaviour was leading to worse code, not better. Now, instead, where I used to extract a function, I just tidy it up into a commented block.
In theory, aggressive extraction sounds really good, but pushing for it will lead to premature (i.e. incorrect) abstraction. (Also, since we are talking about a class here, it will also tend to lead to harder-to-track state because it's now spread out over many methods.)
The reason why it leads to premature abstraction is that extracting is really hard to do correctly. It's pretty hard to find the border for what your "one" thing is (and naming it is even harder, sometimes an elegant name obscures), so you are going to fail at doing it a large % of the time. Because the more you extract, the more failed extractions you are going to have, it's important to be careful about extracting and don't use it wildly.
John Carmack, the oculus CTO, writes about a similar thing here:
number-none.com/blow/john_carmack_...
There are a few reasons why I consider comments important:
I feel disappointed in this article, as I thought it would be about writing a program that takes an AST of your code as input and then outputs literate "pseudocode" that a human can read. For example, if you were to write some FizzBuzz code and feed it into the machine, you would then get the following info:
It might not be good documentation, but it would be interesting documentation.
As for your broader point, I tend to lean towards "Readme Driven Development" - writing the documentation first and then writing the code to match your documentation. You start off with your clean, awesome documentation (knowing what your code is supposed to do beforehand)...meaning you can then focus your time on writing the clean code that matches what your documentation promises.
It could also be "bug driven development". Your README is a specification, and you never do anything that is NOT in the documentation first.
I took this approach in my project here: github.com/ScalaWilliam/eventsourc...
Also you can add some automation to this:
For example, if you added a change in README, you can say in the description "New Issue: thing X is not implemented, does not match readme", and a git bot that I made will automatically create an issue based on that. Here's a commit and an issue:
github.com/ScalaWilliam/git-work/c... github.com/ScalaWilliam/git-work/i...
Side note:
The fizz buzz example is that of imperative code, it's really better to read the code in that case. Where I'm thinking some sort of visual would be useful is in generating a code architecture view automatically. Like a dependency tree within your code.
No. While I love some well written code, clear in meaning, it is by no means a substitute for documenting your code. I shouldn't need to dig through what is sometimes hundreds of files and thousands of lines of code just to find out how to do work with a project. It doesn't matter how clear code is, it cannot describe your entire code base, or the far reaching side effects.