Programmers hate writing documentation. Most programmers, anyway. That's just stuff that gets in the way of the Real Work™, right?
The truth is that the Real Work™ of programming mostly consists of things that aren't directly writing code. Determining specs, designing an API, exploring technical limitations, learning best practices, training others, and so on and so on.
Even when a programmer is directly creating code, that code itself must find some middle ground between sometimes-mutually-exclusive goals:
- The code must solve the problem at hand.
- The code must be easy to maintain.
- The code must be efficient enough.
In order to be maintainable, good code must be written for people. Specifically other people. Even if you're a solo developer, your future self will be a different person, at least with respect to the code you're writing now.
One way to deal with this is so-called "self-documenting code", which mostly comes down to combining carefully thought-out variable and function names with clean-code practices and industry standards.
Self-documenting code is also an excellent practice because it reduces how much documentation is required outside the code (e.g. in comments). Having code in addition to comments that describe that code is a "Don't Repeat Yourself" (DRY) principle violation. The purpose of DRY is to prevent errors caused by changing something in one place without also changing that same thing in another place. If there's only one place, such a mistake isn't possible.
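As a rough illustration of that trade-off, here is a minimal before-and-after sketch. The function names and the 8% tax rate are made up for this example; the point is only that the second version needs no comment to be kept in sync with it.

```python
# Before: the names say nothing, so a comment has to carry the meaning
# (and it can silently go stale when the code changes).
def calc(a, b):
    # multiply the price by the quantity and add 8% tax
    return a * b * 1.08


# After: the names carry the meaning, so there is only one place to change.
SALES_TAX_RATE = 0.08  # hypothetical rate, chosen only for this example


def total_price_with_tax(unit_price: float, quantity: int) -> float:
    subtotal = unit_price * quantity
    return subtotal * (1 + SALES_TAX_RATE)
```

If the tax rate changes, the first version needs two edits (the number and the comment) while the second needs one.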
But how do you document bigger-picture stuff? The overarching purpose of a project? Its entry point? Who should be using it? Its dependencies? When it was last updated? What coding standards it's using?
It's a lot harder to keep your project documentation DRY than it is for the lines of code within it. This is where we get into "Documentation as Code" (borrowed from the concept of "Infrastructure as Code"). The goal is to have every piece of documentation somehow coupled to the functionality of the project itself, so that if one changes then both must change.
Wherever possible, anyway.
I admit that this is something I've been historically terrible about, and am trying to find ways to dramatically improve in my projects. I don't have a grand solution, but here are some useful concepts and tools I've been thinking about:
- Global conventions. When I say "global" I mean across projects and teams. Adherence to convention is a type of documentation. For example, if everyone knows that event-triggered functions are always prefixed with "on" (e.g. onDownload()), then you can have simpler function names without also needing comments. Or if everyone agrees that callback functions will always start with an error argument, then everyone's code can take advantage of that without additional documentation.
- Configs everywhere. Tools like "Cosmic Config" make it easy to follow general industry practices while still doing things how you prefer to do them. Putting information into parseable, testable configuration files brings you into infrastructure-as-code and environment-as-code territory, further reducing documentation needs.
- Automate everything. If a robot does something, a person doesn't need to know anything about how that something works. If you need to do something regularly in a project, turn that thing into code.
- Prevent setup errors. All good tools have an "init" command (or similar) to make it easy to start using that tool with minimum error. The best ones interactively guide the user through decisions they need to make by asking human-friendly questions. Ideally the user would never even need to look at the resulting configuration file(s).
- Docs and Code should have the same dependencies. This is a tricky one, but also the one I'm most excited about. It's reminiscent of the Dependency Inversion Principle. The idea is this: we normally treat documentation as being dependent on the code, but what if we had both depend on something else? That way we could make changes to that something-else and consequently both the code and docs would stay in tune.
For that last item, you would definitely need documentation to be built by code for it to work. A simple example is using configuration files -- the code that builds your docs can read values out of the same config files that your code does. In effect, the more you can abstract concepts into modular code or data, the more it can be used in automated docs and code.
To accomplish this you could use tools like Swagger/OpenAPI, joi, Express Validator, and others.
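To make the "both depend on something else" idea concrete, here is a minimal sketch in plain Python. It does not use any of the tools named above; the schema, setting names, and both functions are hypothetical. One dictionary is the single source of truth, and both the validation code and the generated reference text read from it.

```python
# Hypothetical single source of truth: one schema drives both the
# validation logic (code) and the generated reference text (docs).
SETTINGS_SCHEMA = {
    "port": {"type": int, "default": 8080, "help": "TCP port the server listens on."},
    "debug": {"type": bool, "default": False, "help": "Enable verbose logging."},
}


def load_settings(overrides: dict) -> dict:
    """Validate overrides against the schema and fill in defaults."""
    unknown = set(overrides) - set(SETTINGS_SCHEMA)
    if unknown:
        raise KeyError(f"unknown settings: {sorted(unknown)}")
    settings = {}
    for name, spec in SETTINGS_SCHEMA.items():
        value = overrides.get(name, spec["default"])
        # Exact type match, so that True is not accepted as an int.
        if type(value) is not spec["type"]:
            raise TypeError(f"{name} must be of type {spec['type'].__name__}")
        settings[name] = value
    return settings


def render_settings_docs() -> str:
    """Build the settings reference from the same schema the code uses."""
    lines = ["Settings", "--------"]
    for name, spec in SETTINGS_SCHEMA.items():
        lines.append(
            f"- {name} ({spec['type'].__name__}, "
            f"default {spec['default']!r}): {spec['help']}"
        )
    return "\n".join(lines)


if __name__ == "__main__":
    print(render_settings_docs())
```

Adding a setting, or changing a default, means editing the schema once. The validator and the reference text both pick up the change, so neither can drift away from the other.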
I've only just started trying to find ways to do this. What tricks and tools do you use?
This article originally appeared in a DevChat Newsletter.
Top comments (26)
Expressive tests make excellent documentation that by definition can't get out of sync with the actual code, regardless of language.
And Elixir, for instance, lets you document each function with code examples that get run along with your unit tests, so that explicit documentation stays in sync with the function code. (See elixir-lang.org/getting-started/mi...)
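Python's standard-library doctest module works in a similar spirit, so here is a small sketch of the idea in that language. The slugify function is a hypothetical example, not taken from any project mentioned here.

```python
import doctest


def slugify(title: str) -> str:
    """Turn a title into a URL-friendly slug.

    The examples below are documentation and tests at the same time:

    >>> slugify("Documentation as Code")
    'documentation-as-code'
    >>> slugify("  Hello   World  ")
    'hello-world'
    """
    return "-".join(title.lower().split())


if __name__ == "__main__":
    # Runs every example found in the docstrings and reports mismatches.
    print(doctest.testmod())
```

If someone changes the function's behavior without updating the docstring, the doctest run fails, which is what keeps the written examples honest.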
Actually install jsdoc as a dev dependency (which isn't required for the above paragraph, instead just requiring JSDoc comment syntax), and you get a generator for code docs.
JSDoc is great, and gets at a subset of the general problem.
JSDoc doesn't cover all the functionality that TypeScript does for type information, and is quite verbose in comparison. Using both allows TypeScript to handle types and JSDoc to handle non-code metadata (explanations, samples), which can all be pulled together with TypeDoc and similar tools.
True. You have to write fewer types just for the sake of other types.
JSDoc is nice until you want to generate the documentation, at which point you feel like you're using an unmaintained tool. documentation.js is a good alternative, although TS removes the need for the verbosity, on top of the other pros it gives.
In the following I would add "The code must be written within your time and budget constraints", as this applies to documentation too. Anyway, it's the kind of thing that you have to fail at to appreciate, and only if you are willing to work with people who care about quality in every aspect. Sometimes we let ourselves get trapped in environments that don't care about quality in any form.
- The code must solve the problem at hand.
- The code must be easy to maintain.
- The code must be efficient enough.
I personally think that documentation is not the last step. If you really document your reasoning for why you solved the task at hand the way you did, then documenting is the key thing to do first.
If you take this to its limit you indeed end up with "literate programming", where you compile documentation into code and then compile that code into the product. This was mentioned in the comments before, but at least I drop a link:
Agreed! I saw somebody refer to this general idea as "Executable Documentation", and another as "Documentation-Driven Development." In that case I've been practicing "Document-Driven Tools-Driven Test-Driven Development" (docs first, then appropriate external tooling, then tests, then code).
This is a topic I have been thinking about and working on for a long time. I’m a big fan of “living documentation” - documentation that is never outdated because it is generated from the code or other technical artifacts (like the API docs you mentioned).
I’ve created a library for representing use cases as code, including complicated workflows, with the option of generating documentation from it: github.com/bertilmuth/requirements...
I’ve also created a library for generating diagrams from source code: github.com/diagramsascode/diagrams...
You can also run it as a script using JBang. Here’s the gist:
Comments can be repetitive but don't have to be even if they describe code. They could, for example, name an algorithm being used.
Comments on what is being done, rather than how, can also be of use, especially when code is changed for performance reasons.
Yep, definitely! The best comments are those that are least likely to become incorrect when the code changes.
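A small sketch of what such comments might look like. The function is a hypothetical example: the comments name the algorithm and state the intent, and leave the step-by-step "how" to the code itself.

```python
def is_valid_card_number(number: str) -> bool:
    # Luhn checksum.
    # What: catch mistyped numbers locally before doing anything expensive.
    digits = [int(ch) for ch in number if ch.isdigit()]
    if not digits:
        return False
    total = 0
    for position, digit in enumerate(reversed(digits)):
        if position % 2 == 1:
            digit *= 2
            if digit > 9:
                digit -= 9
        total += digit
    return total % 10 == 0
```

Both comments would stay true even if the loop were rewritten for speed, which is what makes them cheap to maintain.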
I don't. For me, it's probably the most enjoyable and exciting part of writing software because it usually represents the last step of a software release, which means something's about to head out the door. Also, while writing documentation, I usually am looking at the source code and typically spot at least one or two last-minute mistakes (bugs) that would have made it into the release.
Basically, if you aren't writing documentation for your software, you are missing out on supplying your customers with good documentation such that they will constantly bother you by opening tickets/issues (i.e. waste your time) AND your software will have more bugs in it. That's right, good documentation = fewer open issues on issue trackers. For me, most issues that get opened are actually documentation bugs, not software bugs (e.g. not enough code samples such that users try to use my software incorrectly and get lost/confused).
I absolutely agree. Documentation is a lot like testing: at first it feels like it's in the way of "productivity", but once you've had a project in use for more than a few weeks you quickly learn how much time you lose later by not front-loading tests and docs.
I also used to write the docs last, but eventually I started writing the docs first, then the tests, and finally the code. That's helped a lot to prevent me from creating features I didn't actually need, and from having to redesign APIs after realizing I hadn't covered all use cases.
So, disclaimer - I work for Swimm, and we're doing a lot of innovating around the coupling of documentation to the code it supports. None of these concepts are new, but tooling around them that ultimately resembles a deliberate system instead of something haphazardly cobbled together is sorely lacking. There's also quite a bit of variance in disciplines based on what you're building.
In many modern systems, documentation can actually be the symphony that weaves a lot of robots together. For instance, if you're using Kubernetes, you can generate ingress controller and service mesh mappings directly from OAPI/Swagger docs that literally "just work". So I think we're going to meet some very interesting use cases coming from the gitops / infrastructure as code family in the coming years which will likely transcend into the way that we handle docs that depend on more human interactions.
What I like about what we're building at Swimm is that we couple docs to snippets within the documentation itself and can verify that all documentation is relevant / up-to-date with the current state of the code at the CI/CD level, while we try our best to fix anything that can be automatically reconciled to reduce the maintenance load. Orgs can either block PRs if there are documentation issues, or just open pull requests and everyone agrees they get settled at some determined point in the future that is a bit more specific than "later."
What this encourages is exactly what you're talking about toward the end of the middle of your post - that narrative that explains the rationale, the "why", the dragons that await foolish travelers who don't heed the warnings, and even the directions something might take.
I feel like we're going to be talking about this a lot more in the years to come as more and more organizations migrate to microservices, where you can essentially "onboard" 200+ times during a two year tenure depending on how many services there are, and there's soooo much more you actually need to know than what works conveniently in comments.
Great write-up! Have a look at literate programming. Also, every business decision leading to code should be documented in such a way that the dependency is clear and easy to find when the business decision changes.
This post talks about automatically generating markdown / uml diagrams from your existing tests. freecodecamp.org/news/how-to-creat...
Not released yet, but grafcet.online is a meta drawing tool meant first for learning or teaching programming, but also for visual code design: there's no separation between code and documentation, the mantra being "learning is modeling and modeling is learning". It is agnostic, as it copes with any programming language (mainstream or not, past or future). It encompasses state machines, as it is based on Grafcet, a synthesis of Petri nets and state machines traditionally used in the automation industry, which I am now expanding, but it comes in a familiar form that resembles Scratch more than a UML state machine :) Above all it is scalable and fractal: anything visual tends not to be scalable, so I do aim for scalability. In fact I'm now eating my own dog food, with 25000 lines of JS code so far.
Great post. Thank you for sharing
Gherkin provides a great "framework" for writing Documents as Code ;)
English is not my native language, so the word "testing" confused me when writing automated tests. I always think of tests as "documents that can check their own correctness".
Outstanding craftwork, I learned a few things. Do share as you explore.
I have a strong preference for RestDoc+Asciidoc rather than Swagger: it's integrated with the tests while documenting the API.
And as mentioned before, good unit tests work very well as live/integrated documentation.
edit: it’s for Java/Spring
Does anyone know of a doc/test tool that can track live code for events in an event-driven architecture?