As I was reading this other post on why documentation is essential, I figured that the way my team writes documentation is getting good, so now is a good time to share it.
When I was still a student I used to be the president of the networking club, which managed the Internet in the student dorm. Successive generations had come up with very innovative solutions on basically no budget, yet nothing was documented and a lot of projects just ended up being redone from scratch. At some point I instigated a "no docs no prod" rule and it worked pretty well! There was a wiki to which we still referred years later, even to find useful tips in our professional lives.
After some time in the docs-void world and now that I have my own team, I've decided to bring that rule back. It takes some time for people to adjust and, most of all, it takes someone to enforce the rule. Still, it can realistically be made to work, and that's a pretty important point.
The threshold of what should go in the documentation is pretty vague, but here's what I believe good documentation should contain.
First of all, let's consider what the documentation should not be. Did any of you ever learn how to use a library from its Javadoc? Exhaustive documentation of every tiny bit is useless on its own: we're not fucking robots that can absorb everything at once. On the other hand, projects hailed for their excellent documentation are things like Vue.js or Requests, which start with an overview and a gentle walk-through.
It's pretty clear that every project should have a written document that gives an overview and points to where to look further to do one specific thing or another.
Now let's not discard the Javadoc entirely. It is useful to have precise documentation that tells you exactly what a function does. But more than that, and especially in private APIs, you might wonder why this function is there. Knowing that lets you take a step back and see the function's place in the whole system.
So to sum up, we want:
- A global, written overview that highlights where to look
- A precise reference on what each function does and why it is here
Storing this documentation is another thing. Let's make it as frictionless as possible!
The reference doc is pretty simple. There is a plethora of docstrings, JSDoc, Javadoc, Doxygen and the like, so let's use those for the reference doc. Personally I don't even bother to generate the output, as the IDE can read these comments directly and show the doc's content with a simple keyboard shortcut.
Source code management tools (GitLab and GitHub at least) now render Markdown files, so let's make use of that for the rest.
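To illustrate a docstring that covers both the what and the why, here is a minimal sketch in Python. The function `normalize_mac`, its behavior and the scenario described in its docstring are all hypothetical, made up for this example.

```python
import re


def normalize_mac(raw: str) -> str:
    """Return a MAC address as lowercase, colon-separated hex pairs.

    What: accepts the common notations ("AA-BB-CC-DD-EE-FF",
    "aabb.ccdd.eeff", "aa:bb:cc:dd:ee:ff") and raises ValueError when
    the input does not hold exactly 12 hex digits.

    Why (hypothetical scenario): registration forms and DHCP logs spell
    addresses differently, so every comparison goes through this one
    function instead of each caller doing its own cleanup.
    """
    # Drop the usual separators, then validate what is left.
    digits = re.sub(r"[-:.]", "", raw.strip()).lower()
    if not re.fullmatch(r"[0-9a-f]{12}", digits):
        raise ValueError(f"not a MAC address: {raw!r}")
    return ":".join(digits[i:i + 2] for i in range(0, 12, 2))
```

The "What" part is what a Javadoc-style reference would give you anyway. The "Why" part is what tells a future reader whether the function can be changed or removed.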
All documentation starts with a README.md file at the root of the repo, containing roughly the following information:
- Description of the project
- Software dependencies, maybe packages to install on servers
- Required environment variables to make it run
- Cron jobs that have to be scheduled
- Links to feature-specific documentation
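If you want a quick way to see whether a README covers the list above, something like the following sketch could do. The heading names in `REQUIRED_SECTIONS` are assumptions chosen for this example, not a convention the tools impose, and the heading detection is deliberately naive (any line starting with `#` counts).

```python
from typing import List

# Hypothetical heading names; rename them to match your own README.
REQUIRED_SECTIONS = [
    "Description",
    "Dependencies",
    "Environment variables",
    "Cron jobs",
    "Feature documentation",
]


def missing_sections(readme_text: str) -> List[str]:
    """Return the required sections that have no Markdown heading."""
    headings = set()
    for line in readme_text.splitlines():
        if line.startswith("#"):
            # "## Cron jobs" -> "cron jobs"
            headings.add(line.lstrip("#").strip().lower())
    return [s for s in REQUIRED_SECTIONS if s.lower() not in headings]
```

Calling `missing_sections(open("README.md").read())` on a fresh project gives you a to-do list of what is still left to write.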
And then, as you guessed, each feature gets its own .md file which explains how it works. In those files you can:
- Explain the various mechanisms of your feature in isolation from the rest of the code base
- Provide the name of the main classes/functions to look at
- List cron jobs and explain what they do
- Explain the data model
- Give example API calls for various tasks
The biggest anti-documentation argument is "it's useless because it will eventually grow out of date". In fact, it's fairly simple to keep it up to date thanks to a wonderful innovation that has now landed in most teams: pull requests.
Since the documentation is tightly coupled with the code (sharing the same commits), there's a pretty simple rule you can set up: no merging if the doc is not on par with the code.
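The rule above is enforced by a human reviewer, but a crude automated hint is possible too. The sketch below is an assumption of mine rather than something my team runs: it takes the list of files changed by a pull request (for instance the output of `git diff --name-only` against the target branch) and flags changes that touch code without touching any Markdown file. The suffix lists are hypothetical and should be adapted to your stack.

```python
from typing import Iterable

# Assumptions for this sketch: docs are Markdown files, code is any
# file with one of these extensions.
DOC_SUFFIXES = (".md",)
CODE_SUFFIXES = (".py", ".js", ".ts", ".css")


def docs_on_par(changed_files: Iterable[str]) -> bool:
    """Return False when code files changed but no doc file did."""
    files = list(changed_files)
    touches_code = any(f.endswith(CODE_SUFFIXES) for f in files)
    touches_docs = any(f.endswith(DOC_SUFFIXES) for f in files)
    # Doc-only or empty changes pass; code changes need a doc change.
    return touches_docs or not touches_code
```

This cannot see docstring updates, which live inside the code files themselves, and it cannot judge whether the doc change is meaningful. Treat a `False` as a nudge for the reviewer, not as a verdict.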
To help with this, I created a pull/merge request template which proposes the following checklist:
- Docstrings on new classes/functions/CSS rules
- Updated docstrings when classes/functions/CSS rules were modified
- Mention of what is being merged in
- Feature-specific documentation in a separate file
- GDPR data collection analysis and documentation in
There are two points I'd like to highlight here:
- Yes, we do document CSS rules. Here more than anywhere, there would be no point saying "colors the text in #f00" in the docstring. Instead the doc explains why those rules are here, what problem they are solving and how they relate to other rules (relative positioning and so on), and it justifies non-obvious choices (aka hacks).
- The gdpr.yml file is our implementation of the GDPR registry. That's a subject for another article; however, it is useful to highlight that all European companies are under the obligation to maintain this registry. Since the data manipulation is done by the code, it's a good idea to keep the registry up to date as you change the code.
The good thing with this methodology is also that you can apply it to a completely undocumented project. All you need to do is write a decent README.md file which gives the basics of how to run the project. Then you can incrementally add features to the documentation as you write them, without necessarily writing all of the documentation at once.
Over time, as you develop new features, you will either replace old features or naturally document them, because you had to explain them as an introduction to some other feature in your doc.
Let's break down the cost of this methodology:
- Tooling - none, you already have the tools and there is no setup to do
- Writing - honestly I don't know, but if I had to give a ballpark estimate, let's say 10% of the coding time is spent writing docs
- Improving code - since you write documentation, you'll find that some code can be factored out or reorganized more logically, and you'll spend maybe 5% of the coding time doing so
Now the benefits, a bit harder to measure:
- Fewer bugs, since taking a step back at each pull request helps you think your code over and reminds you of all the little things you forgot to handle
- Much less time for newcomers to get into the project
- Fewer WTFs when reading other people's code, since you know why things are here. And you're not tempted to remove them because you think they are useless (thus breaking a major feature in production without noticing)
- But also you know when you can delete code since, again, you know why it's here
- Which means of course, less technical debt
- And finally you remember the project now but if you stop working on it for a few months I guarantee you're going to forget most of it
Overall, having up-to-date and meaningful documentation gives you a much less stressful code base. The time you save is hard to estimate, but basically the larger the project, the bigger the gains. The trade-off is that you spend a linear amount of time writing your documentation to avoid an exponential loss of time to technical debt and confused juniors.
So, start early before your project is too big and keep this doc up to date!