Communication is difficult. As remote work becomes more common, as teams spread across six or more time zones, and as companies pursue rapid growth, it is paramount that technical knowledge be communicated correctly and effectively.
Over my past few professional experiences, it became apparent that I was spending more of my time communicating ideas and interfaces to others, and less time actually developing them. Obviously, the design phase before building something is critical, but I was more concerned about the amount of time I spent discussing a feature after finishing it. I found that for every hour I spent building a new feature for our product, I would spend at least two or three hours answering Slack messages from users who wanted to use it or who had questions.
I realized that while we had good technical engineering practices, we were lacking the communication channels & skills to allow our clients to effectively use the product. This resulted in a fair amount of precious engineering time lost to our on-call team member basically acting as tech support on Slack, as they'd have to answer very similar questions multiple times throughout the day.
Our product was still young, but it was quickly growing in popularity at the company. It was a cluster configuration and deployment system that managed many of our internal services. I was one of the lead engineers responsible for developing this ecosystem, and we had built out quite a few systems and solutions to solve many different categories of technical problems.
We were able to implement these solutions very well, but we had grown so fast that we started to notice problems with the way we were communicating and sharing our product with the rest of the company.
As more people found the tooling and started using it, our users came to range from newcomers whose reaction was "Oh this is neat" to advanced users and internal team members who knew the full ins and outs of the tools. Somehow, we had to make sure every end user and affected client could effectively use our tooling to solve their deployment problems.
This growth problem was easily manageable when we had just a few teams using the product, as many of those users had been with us since the start of the product and could use it moderately well. We had a page or two of loosely organized Jekyll documents that briefly explained how to get different environments up and running, and the users were expected to sort-of just figure out the rest.
Our documentation at the time was like the Owl-drawing tutorial below: we would walk you through the left-side picture, and then throw you into the fire and hope you figured out all the details on your way to production.
Tribal knowledge is any unwritten information that is not commonly known by others within a company.
We quickly realized that our team and the original user base held quite a bit of tribal knowledge about the system. As our product grew in popularity, new users arrived without this knowledge, which seemed "trivial" to us but in reality was confusing for first-time users. We had no way to effectively communicate the silo of knowledge that we had developed over the years of building and using the product. It's very difficult for people to use something if they don't know how to use it. They are even less likely to use it if they see others describe it as "easy" without seeing the path to attaining that knowledge. When people notice this knowledge gap between a product's users and see no discernible way to close it, the reputation of the product is at stake.
Our users wanted to learn our product, and we wanted them to love the product, but we had a documentation problem.
Our team set off to close this documentation gap between our users and tooling. We had four main areas of tooling and associated documentation that needed to be addressed:
- A web user interface that provided a nice overview and data visualization of our system
- A RESTful API that provided all of the deployment data and configurations for the web UI and any integrated services
- A CLI for command-line-based operations which allowed for local development and cluster configuration management
- Our overall concepts and tutorials so that users could learn the system from a high level across the aforementioned tools
It was important for us to solve our documentation problem for each of the areas above, because we knew that good project documentation enables the following:
- Improved onboarding of new users and new team members
- Reduction in support questions by users
- Ability to quickly link users to documentation sections for answers and concepts
- Increased reputation and usage due to (hopefully) being more user-friendly
Web interfaces are hard, because they should be designed intuitively and cleanly so that users familiar with the interface can interact with it as efficiently as possible. This presents a challenge for newer users who are unfamiliar with the system, as the layouts and functions of components will be totally new to them.
To overcome these documentation issues in our web interface, we had both proactive documentation, in the form of initial pop-ups that would call out certain elements on the first visit to a page type, and passive documentation, in the form of a sidebar containing page-specific documentation that opened when an icon button was pressed. We felt that this approach was a good trade-off: it introduced new users to elements without being too intrusive.
The highlighted element pop-ups used Reactour to display their information. To ensure that we covered future changes to the steps, we would hash the list of generated steps and save the hash to the client's `localStorage` in the form of `[page URL]: <steps hash>`, where the `[page URL]` was of generic form, such as `/users/[user_id]`, so that we only showed the steps once per dynamically-rendered page type (and again only when the steps themselves changed). These pop-ups provided very basic introductory information about the elements that they highlighted, such as what functions they performed.
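The hash-per-page-type bookkeeping described above could look roughly like the following TypeScript sketch. It is an illustration under stated assumptions, not our actual implementation: the names `TourStep`, `KeyValueStore`, `MemoryStore`, `hashSteps`, `shouldShowTour`, and `markTourSeen` are all hypothetical, and the FNV-1a hash is simply a small dependency-free choice (the original hash function is not specified here).

```typescript
// Hypothetical step shape; real Reactour steps carry more fields than this.
interface TourStep {
  selector: string;
  content: string;
}

// Minimal subset of the Web Storage API, so window.localStorage can be passed in as-is.
interface KeyValueStore {
  getItem(key: string): string | null;
  setItem(key: string, value: string): void;
}

// In-memory stand-in for localStorage, useful outside the browser (e.g. in tests).
class MemoryStore implements KeyValueStore {
  private data = new Map<string, string>();
  getItem(key: string): string | null {
    const value = this.data.get(key);
    return value === undefined ? null : value;
  }
  setItem(key: string, value: string): void {
    this.data.set(key, value);
  }
}

// FNV-1a (32-bit) over the serialized steps. Assumed for illustration only.
function hashSteps(steps: TourStep[]): string {
  const text = JSON.stringify(steps);
  let hash = 0x811c9dc5;
  for (let i = 0; i < text.length; i++) {
    hash ^= text.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193) >>> 0;
  }
  return hash.toString(16);
}

// pageType is the generic route, e.g. "/users/[user_id]", not the concrete URL.
function shouldShowTour(store: KeyValueStore, pageType: string, steps: TourStep[]): boolean {
  return store.getItem(pageType) !== hashSteps(steps);
}

// Record that the current version of the steps has been shown for this page type.
function markTourSeen(store: KeyValueStore, pageType: string, steps: TourStep[]): void {
  store.setItem(pageType, hashSteps(steps));
}
```

In the browser, `window.localStorage` could be passed as the store. Keying on the generic route means visiting `/users/1` and then `/users/2` shows the tour only once, while editing the steps changes the hash and shows the tour again.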
For our passive yet detailed documentation, we allowed each page to pass a markdown page source to our core layout component, which would lazy load the markdown and render it into a right-side drawer that could be toggled open and closed. This sidebar presented more details about the overall page usage and the components on the page.
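As a rough sketch of the lazy-loading part, the loader below fetches a page's markdown only when it is first requested (for example, when the drawer is first opened) and reuses the result afterwards. The names `MarkdownFetcher` and `createLazyMarkdownLoader` are hypothetical, and the rendering and drawer component are left out; this is one possible approach, not necessarily how our layout component did it.

```typescript
// Hypothetical fetcher signature: resolves a page's markdown source path to its text.
type MarkdownFetcher = (source: string) => Promise<string>;

// Returns a load function that fetches each source at most once and caches the promise.
function createLazyMarkdownLoader(fetchMarkdown: MarkdownFetcher): (source: string) => Promise<string> {
  const cache = new Map<string, Promise<string>>();
  return function load(source: string): Promise<string> {
    let pending = cache.get(source);
    if (pending === undefined) {
      pending = fetchMarkdown(source);
      cache.set(source, pending);
      // Drop failed fetches from the cache so a later open can retry.
      pending.catch(() => {
        cache.delete(source);
      });
    }
    return pending;
  };
}
```

In a browser, the fetcher could be something like `(source) => fetch(source).then((response) => response.text())`, with `load` called from the drawer's open handler so that pages whose sidebar is never opened never download their markdown.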
Our RESTful API was actually the easiest to document, because we were able to rely on third-party libraries from the start. It was a Python Django and Django REST Framework project that leveraged the `drf-yasg` OpenAPI generator library to create OpenAPI- and Swagger-compatible documentation.
This library allowed us to write the API documentation in place throughout the route definitions, and `drf-yasg` would render the documentation either as an OpenAPI schema or as an interactive Redoc/Swagger UI web page. This meant that we were generating documentation from our code, and there were no extra developer steps to change documentation if a feature or route changed.
Our CLI was built on `cobra`, a Go library for command-line applications. Among its many neat features, we were able to extend the Markdown documentation generation feature, which created rich user documentation for each of the available commands. We would generate this documentation at release time and either deploy the static files as a GitHub Pages site, or integrate them with our generic documentation below. We made sure to always include older release documentation for users who were not on the latest CLI version.
With the individual components laid out, it was finally time to document the entire system so that it made sense to users. We actually jumped through three or four documentation generators over the three years that I worked on the project.
The loose timeline of our documentation services was Jekyll, then Hugo, then Docsify, and finally Docusaurus.
As mentioned earlier, our documentation started in Jekyll, and it worked decently well for us. The two main drawbacks for us were:
- Large documentation slowly became too difficult to manage, because links and page references had to be updated manually as the number of pages grew
- Our team did not use Ruby, so the process of putting docs together put contributors on an unfamiliar path
We also found that some of the documentation templates seemed dated, and we were looking for something that looked fresher and cleaner, but that's more of a subjective reason to switch than a technical one.
We gave Hugo a shot next, and it provided some more validation and speed in terms of development time. Since the associated program is just a binary, it could be very easily installed on our machines and in workflows without much fuss.
Hugo provided more flexibility and validation across our documentation, and we quickly grew our documentation and made it look much more organized using the Hugo Learn theme.
Hugo was great, but it required a build step to be run before pages in a GitHub repository could show up in a GitHub Pages site. We built workflows to handle this on merges to `main`, but it wasn't perfect and would occasionally fail. This generation step had us looking at client-side generators again, and we ran into...
Docsify is a client-side-only documentation generator. When a user requests a documentation page, it fetches the associated `.md` file and renders it entirely client-side. That meant that our documentation deployment process was simple again, as we could just push changes to the `main` branch and BOOM, they would show up within seconds.
The other draw of Docsify was that there were some really slick minimalist themes that we found and really enjoyed. We could pair these with the long list of extensions and provide users with clean, yet extensive documentation.
The drawback with Docsify was that as our documentation continued to grow, rendering started to slow down, because the initial load required fetching a lot of information. On top of this, search became less and less usable: the interface wasn't great, and it had to search all files in a flat format to generate the results. My final gripe with Docsify was one of the reasons that I was initially drawn to it: anything more than generic markdown requires plugins. This introduced a number of documentation dependencies on libraries that didn't seem fully trustworthy or well maintained, and we often found ourselves writing custom plugins to make things render as desired.
After jumping from documentation generator to generator, we finally settled on Docusaurus. It reintroduced a build step, similar to Hugo, but we found that the trade-off was worth it, as Docusaurus brought a number of really nice features with it. Since everything was just React under the hood and could be customized, we naturally gravitated to it, as we already had React experience.
We found that it was much easier to customize and extend Docusaurus compared to our other generators. This meant that we could provide the full, rich experience that we wanted. Our users also responded best to this iteration of our documentation, but that may partially be attributed to us spending more time organizing the content as well as the overall layout.
The following tips and suggestions are just a few of the ways that our team overcame our documentation debt, and how we ensured that our documentation was always in an acceptable state, no matter the level of the user reading it.
As previously mentioned, our API and CLI documentation was generated directly from the underlying code. I highly recommend this strategy of Documentation-from-Code; it makes the process of writing great documentation that much easier. Having one interface for developers to add and document features makes it more likely that they will actually write documentation.
Most languages and application types have some set of libraries to assist in generating documentation from code, and I highly recommend implementing them in your user-facing projects.
It's difficult to climb a mountain in a day, but it becomes easier if you take one small step at a time. Documentation works the same way; writing all of your team's documentation at once will most likely cause stress and annoyance. I strongly encourage writing documentation in small increments, ideally at the time that the associated feature is written.
For older features that need documentation, don't fret; just try to write a few paragraphs, or even a sentence, whenever you have some free time. As long as you are slowly chipping away at the documentation debt, you are improving the situation.
Finally, I found that it was easy to add/improve documentation for a certain feature if a user had just asked a question about the topic. This showed either a gap in our documentation, or a discrepancy that confused users. By tackling these issues one at a time, it was easier to ensure good documentation.
As our product matured, our documentation evolved from a list of individual available features to tutorials and guides based around common user workflows. We had a section of tutorials for our common use cases that included users just getting started with the system, all the way to some of our most advanced use cases.
Being a deployment system, we broke up our documentation into cluster timeline user stories:
- Day 0: Gathering and provisioning infrastructure
- Day 1: Implementing cluster tools such as monitoring and alerting
- Day 2: Deploying applications and configuring DNS, amongst other things
This layout provided users with a fairly linear approach to reading our documentation. They could either get the brief steps through a tutorial, or they could follow our guides in order to learn the system. Our support questions have decreased slightly since we switched to this more linear documentation organization.
It's not unusual to hear about engineering teams having hackathons during the workday to allow contributors to work on neat projects and squash technical debt. I would say teams need to go one step further and have Documentation Hackathons, or at least give individuals the ability to fully devote time to both their client-facing and internal documentation.
This allows some time to "refresh" the content and ensure it's fully up to date.
Good documentation practices are key to ensuring happy users and keeping technical support questions to a minimum. Hopefully the above experiences, technologies, and tips help you and your team present your products as positively and completely as possible!