Stefan 🚀

Posted on Aug 16, 2022 • Originally published at wundergraph.com

How to improve your markdown-based docs with automatic tagging

We've been working hard recently to rebuild our entire landing page and the documentation website (https://docs.wundergraph.com). While writing, I noticed that I kept adding internal links to the glossary whenever I used WunderGraph-specific terms. At some point I realised that there's a better technique for this: it's called tagging!

So far, the results look promising. Bounce rate is down 4%, and time on page almost doubled from 2min 11s to 4min 11s. The number of pages visited per unique visitor is up from 5.04 to 8.12, an improvement of 61%.

Tagging is a simple technique to turn specific words or phrases into links. I'm super lazy, so after manually adding the first couple of links, I thought about doing this in a more automated way.

Initially, I thought I was missing just a few links here and there. But after applying this technique, I found that I was missing links in almost all places.

So, doing this manually is really not an option: it would be very time-consuming, and you'd have to change links in all places if you'd like to change a URL.

Tagging is actually well known in the news industry

I first learned about tagging when I was working in the news industry a few years ago. It's common there to automatically apply links to specific keywords to keep the reader on the website for some more time.

With docs, my goal is not to keep the reader on the website for too long. What I'd like to achieve is being able to write about a specific topic without having to explain every concept from scratch.

This means, if I can apply automatic tagging, I can be sure that the reader will be able to find the relevant information in the docs.

An Example: WunderGraph's namespacing and Virtual Graph feature

Let's have a look at an example:

The docs page on API Composition & Integration.

This page explains how you can use WunderGraph to compose and integrate APIs. One important concept here is namespacing. Simply put, namespacing is a way to compose multiple APIs into a Virtual Graph without naming collisions. Now I have to explain Virtual Graph, and you know where this is going...

But with tagging, you can see that both namespacing and Virtual Graph automatically link to the relevant page.

What I didn't expect when I initially wrote this page was that each of the supported Data Sources, such as Apollo Federation, OpenAPI, PostgreSQL etc., would automatically link to the page explaining how that Data Source works.

How to implement tagging with markdoc.io

Luckily, our docs are part of our monorepo which is already open source, so I can easily share all the code with you.

We've recently made the decision to open source our docs and add them to our monorepo. This means we're able to use the same PR to add a new feature and document it right away.

So, how does it work? We're using markdoc.io to convert Markdoc files into HTML, but this technique might also work with other frameworks.

Step 1: Create a list of tags

const tags = {
    'Getting Started Guide': '/getting-started',
    'Getting Started': '/getting-started',
    'getting started': '/getting-started',
    'WunderGraph SDK': '/docs/components-of-wundergraph/wundergraph-sdk',
    'TypeScript SDK': '/docs/components-of-wundergraph/wundergraph-sdk',
    'WunderGraph CLI': '/docs/components-of-wundergraph/wunderctl',
    'CLI': '/docs/components-of-wundergraph/wunderctl',
    'wunderctl': '/docs/components-of-wundergraph/wunderctl',
};

We're using very simple string matching, so don't expect too much. It's a good enough solution for the moment. We'll discuss later how to improve it.

That said, there's something important to note here. As you can see, the first tag Getting Started Guide is actually enclosing the second one (Getting Started). Because the first matching tag in the list wins, you have to make sure that if patterns overlap, the longer one is higher up in the list. Otherwise the shorter tag is turned into a link first, and the longer tag can no longer match.

We've done the same thing with the CLI tag. The tag WunderGraph CLI needs to be before the tag CLI.
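If keeping the list in the right order by hand feels fragile, one option might be to sort the phrases by length before matching. This is only a sketch: orderTagsByLength is a hypothetical helper, not part of our docs code. It relies on the fact that a phrase enclosing another phrase is always the longer of the two.

```javascript
// Hypothetical helper (an assumption for illustration): returns the tag
// phrases ordered longest-first, so an enclosing phrase such as
// 'Getting Started Guide' is always tried before 'Getting Started'.
function orderTagsByLength(tags) {
  return Object.keys(tags).sort((a, b) => b.length - a.length)
}

// Deliberately written in the "wrong" order to show that sorting fixes it.
const tags = {
  'CLI': '/docs/components-of-wundergraph/wunderctl',
  'Getting Started': '/getting-started',
  'WunderGraph CLI': '/docs/components-of-wundergraph/wunderctl',
  'Getting Started Guide': '/getting-started',
}

const ordered = orderTagsByLength(tags)
console.log(ordered)
```

The matching code would then iterate over the sorted array instead of Object.keys(tags).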

Step 2: Transform the markdown content and add links

Once our tags are defined, we need to transform the markdown content. The code is annotated with comments to help you understand it.

We're using the transform API of Markdoc. Before an AST node is passed to the render function, we're able to use the transform function to modify it.

import { Tag } from '@markdoc/markdoc'

const nodes = {
    paragraph: {
        transform: (node, config) => {
            const attributes = node.transformAttributes(config)
            const children = node.transformChildren(config)
            while (true) {
                let tagMatch = ''
                const i = children.findIndex((child) => {
                    if (typeof child !== 'string') {
                        // some children are not strings (e.g. links we already created), ignore them
                        return false
                    }
                    // find the first tag contained in the string; tags are plain strings,
                    // so we use includes() instead of treating them as a regex via match()
                    const found = Object.keys(tags).find((tag) => child.includes(tag))
                    if (found === undefined) {
                        return false
                    }
                    tagMatch = found
                    return true
                })
                if (i === -1) {
                    // if we didn't find a tag, we're done
                    break
                }
                const original = children[i] // get the original string
                // locate the first occurrence only; split() would also cut at later
                // occurrences, and the text after the second one would be lost
                const start = original.indexOf(tagMatch)
                const transformed = [
                    original.slice(0, start), // the part before the matching tag
                    new Tag( // add a Link tag in the middle
                        'Link', // new Tag is the syntax used in markdoc to add a Tag to the AST
                        {
                            href: tags[tagMatch], // get the link from our tags list and set it as the href
                        },
                        [tagMatch]
                    ),
                    // the part after the matching tag; further occurrences in it
                    // are picked up by the next loop iteration
                    original.slice(start + tagMatch.length),
                ]
                children.splice(i, 1, ...transformed) // replace the original string with the transformed AST nodes
            }
            return new Tag('p', attributes, children) // return the transformed AST node with the new children
        },
    },
    code: {
        // we'd also like to apply the same transformation to inline code,
        // but only on full matches, not partial ones
        transform: (node, config) => {
            const content = node.attributes.content // extract the content of the inline code
            const codeTag = new Tag('code', node.attributes, [content]) // create a new code tag with the content
            const match = Object.keys(tags).find((tag) => tag === content) // find a matching tag
            if (match) {
                // if we found a matching tag, we wrap the code tag in a link
                return new Tag(
                    'Link',
                    {
                        href: tags[content], // get the link from our tags list and set it as the href
                    },
                    [codeTag] // passing the code tag as the only child
                )
            }
            return codeTag // if we didn't find a matching tag, we return the code tag as is
        },
    }
}
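To see the paragraph logic in isolation, here's a small sketch that runs without Markdoc. It makes a few assumptions: linkify and link are hypothetical names, and plain objects stand in for Markdoc's Tag. The idea is the same as above: find the first string that contains a tag, cut it into "before", link and "after", and repeat until nothing matches.

```javascript
// Stand-in for Markdoc's Tag: a plain object, used only for this demo.
function link(href, text) {
  return { name: 'Link', attributes: { href }, children: [text] }
}

// Splits `text` into strings and link objects, one tag occurrence at a time.
// `orderedTags` must list longer phrases before the shorter ones they contain.
function linkify(text, tags, orderedTags = Object.keys(tags)) {
  const children = [text]
  while (true) {
    let match
    const i = children.findIndex((child) => {
      if (typeof child !== 'string') return false // skip links we already created
      match = orderedTags.find((tag) => child.includes(tag))
      return match !== undefined
    })
    if (i === -1) break // no string contains a tag anymore
    const original = children[i]
    const start = original.indexOf(match) // first occurrence only
    children.splice(
      i, 1,
      original.slice(0, start), // text before the tag
      link(tags[match], match), // the tag itself, as a link
      original.slice(start + match.length) // text after the tag, scanned again later
    )
  }
  return children.filter((child) => child !== '') // drop empty leftovers
}

const tags = {
  'WunderGraph CLI': '/docs/components-of-wundergraph/wunderctl',
  'CLI': '/docs/components-of-wundergraph/wunderctl',
}
const result = linkify('Install the WunderGraph CLI, then run the CLI.', tags)
console.log(JSON.stringify(result, null, 2))
```

Both occurrences end up as links, and the text around them is kept as plain strings.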

Step 3: Putting it all together

What's left is to wire up the transform function and pass it to the markdoc config. We're using Next.js with markdoc, which automatically looks into the markdoc directory and picks up your config.

You can find the final solution here.

If you want to try it out, clone the repo and run make and then make docs.

Possible improvements

After implementing this, I've noticed a few things that could be improved.

On some pages, there are simply too many links, and they sometimes repeat.

E.g. the 7th link about the Virtual Graph is not as helpful as the first one. So ideally, we could limit the number of links to a reasonable number.

Next, the overall number of links is sometimes too high. So, in addition to limiting the number of links to the same page, there should also be a way to limit the link frequency to not overwhelm the user.
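Both limits could be sketched roughly like this. Note that this is not what our docs do today: linkifyWithLimits, maxPerTarget and maxTotal are hypothetical names, the link paths are placeholders, and plain objects again stand in for Markdoc's Tag. Unlike the code in Step 2, this version scans the text from left to right, which makes it easy to count links as they're created. In a real Markdoc setup, the counters would have to live outside the per-paragraph transform to count per page.

```javascript
// Hypothetical sketch: link at most `maxPerTarget` times to the same href
// and at most `maxTotal` times overall; everything else stays plain text.
function linkifyWithLimits(text, tags, { maxPerTarget = 1, maxTotal = 5 } = {}) {
  const phrases = Object.keys(tags).sort((a, b) => b.length - a.length) // longest first
  const perTarget = new Map() // href -> number of links created so far
  const out = []
  let total = 0
  let plain = '' // text collected since the last link
  let pos = 0
  while (pos < text.length) {
    const phrase = phrases.find((p) => text.startsWith(p, pos))
    if (phrase === undefined) {
      plain += text[pos] // no tag starts here, keep the character
      pos += 1
      continue
    }
    const href = tags[phrase]
    const used = perTarget.get(href) ?? 0
    if (used < maxPerTarget && total < maxTotal) {
      if (plain !== '') out.push(plain)
      plain = ''
      out.push({ name: 'Link', attributes: { href }, children: [phrase] })
      perTarget.set(href, used + 1)
      total += 1
    } else {
      plain += phrase // over the limit: keep the phrase as plain text
    }
    pos += phrase.length
  }
  if (plain !== '') out.push(plain)
  return out
}

// Placeholder paths, for illustration only.
const tags = {
  'Virtual Graph': '/virtual-graph',
  'namespacing': '/namespacing',
}
const text = 'A Virtual Graph is composed via namespacing. The Virtual Graph hides the sources.'
const limited = linkifyWithLimits(text, tags, { maxPerTarget: 1, maxTotal: 5 })
console.log(JSON.stringify(limited, null, 2))
```

With maxPerTarget set to 1, only the first mention of each term becomes a link and later mentions stay plain text.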

Finally, it doesn't really make sense to link to namespacing, when you're already on the namespacing page.
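One way to avoid such self-links might be to drop all tags that point to the current page before matching. A minimal sketch, assuming a hypothetical helper tagsForPage and assuming the transform somehow knows the current path (how that path reaches the transform is left open here):

```javascript
// Hypothetical helper: returns a copy of `tags` without the entries whose
// href is the page that is currently being rendered.
function tagsForPage(tags, currentPath) {
  // treat '/getting-started' and '/getting-started/' as the same page
  const normalize = (path) => path.replace(/\/+$/, '') || '/'
  const current = normalize(currentPath)
  return Object.fromEntries(
    Object.entries(tags).filter(([, href]) => normalize(href) !== current)
  )
}

const tags = {
  'Getting Started Guide': '/getting-started',
  'Getting Started': '/getting-started',
  'getting started': '/getting-started',
  'WunderGraph CLI': '/docs/components-of-wundergraph/wunderctl',
  'CLI': '/docs/components-of-wundergraph/wunderctl',
}

const pageTags = tagsForPage(tags, '/getting-started/')
console.log(Object.keys(pageTags))
```

The order of the remaining entries is preserved, so the "longer tag first" rule from Step 1 still holds.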

How could this be improved?

Currently, we're simply string matching the tags, but we could actually pass the whole AST to the transform function and do some kind of analysis on it.

Conclusion

There's always the question of what is good enough. I think the current solution is already quite OK.

The initial goal was to save myself time from manually adding links to the docs. This goal was achieved, and I hope it helps some readers to better understand the concepts of WunderGraph.

I hope you've found this useful. Feel free to add this technique to your own docs to improve them. You can also just steal our docs from the monorepo. Maybe consider using some different styles. ;-)

By the way, if you're interested in working with people who care about open source and awesome docs, please join our Discord and leave a note. We're looking for developers experienced with TypeScript, Golang and GraphQL, but our main focus is attitude and cultural fit.
