Multilingual Markdown Documentations and Posts in Seconds

Denis Augsburger
I love to create software for people
Originally published at simpleen.io ・5 min read

Translating Markdown files is a common need in technical documentation and headless content management systems, where you want to reach audiences that speak different languages. I'm going to show you how to translate Markdown quickly and easily without compromising on quality. If you'd like to get a head start and try out the Markdown Translator, just sign up.

Markdown Translation Tool

More and more tools use Markdown to structure their content. Some examples are:

  • Docusaurus, GitBook for documentation
  • Hugo, Jekyll, GatsbyJS as static site generators (SSG)
  • Contentful, Strapi, SquareSpace as content management systems (CMS)

Depending on the project, you need to generate multilingual content and update it regularly. The traditional translation process can be time-consuming, and waiting on (human) translations can block your release cycles. That is why we were looking for a fast and reliable solution.

Common Challenges

We tried several translation tools by pasting Markdown into them, but we were not satisfied with the results.
Common problems we encountered:

  • Broken Markdown Syntax
  • Translation of things that should not be translated, like code snippets and emojis
  • Different styles in translation results
  • Setup/installation necessary

Let's take a look at how a simple Markdown file is translated from English to German if you paste it directly into DeepL or Google Translate, and compare that to the Simpleen Markdown Translator.

The file contains a list, some emojis and headers. Dev.to doesn't support fenced code within its blog posts, so the triple backticks are replaced by HTML code tags.

## Setup

Install the CLI to **translate** files from source to target path.

<code language="shell">
yarn add simpleen
yarn run simpleen init
</code>

You can search for files in `./blog/posts/en/*.md` and translate them to `./blog/posts/$locale/$FILE.md`.

## Additional support :smile:

- PO-Files
- JSON
- YAML


DeepL Markdown Example

With DeepL the result looks like the following.

## Einrichtung

Installieren Sie die CLI zum **Übersetzen** von Dateien vom Quell- in den Zielpfad.

<code language="Shell">
yarn add simpleen
yarn run simpleen init
</code>

Sie können nach Dateien in `./blog/posts/de/*.md` suchen und sie in `./blog/posts/$locale/$FILE.md` übersetzen.

## Zusätzliche Unterstützung :smile:

- PO-Dateien
- JSON
- YAML


As you can see, the code snippet is broken because the fenced code block now starts with two backticks instead of three. The language name Shell is also capitalized now. The list, the emoji and the paths are handled correctly in this simple case, and the bold text is marked correctly as well.

Google Translate Markdown Example

Let's compare this with Google Translate:

## Einrichten

Installieren Sie die CLI, um Dateien von der Quelle in den Zielpfad zu übersetzen.

<brokencode language="shell">
Garn hinzufügen einfach
Garn laufen einfach init
</brokencode>

Sie können nach Dateien in "./blog/posts/en / *. Md" suchen und diese in ". / Blog / posts / $ locale / $ FILE.md" übersetzen.

## Zusätzliche Unterstützung: smile:

- PO-Dateien
- JSON
- YAML


The result with Google Translate is worse than with DeepL. The code snippet is broken because the backticks of the code fence are separated by a space. The content of the snippet is also translated, which is not desirable. The paths are split apart and wrapped in quotes instead of backticks. The emoji is broken as well.
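The kinds of breakage shown above can be caught automatically before a translated file is published. The following is a minimal sketch of such a check, not part of Simpleen or of any translation service: it compares inline code spans, emoji shortcodes and the number of code fences between source and translation. The function names and the regular expressions are assumptions made for illustration, and the emoji pattern is deliberately simple (it would also match something like the middle of a timestamp).

```python
import re

# Pieces of Markdown that a translation should leave untouched.
INLINE_CODE = re.compile(r"`[^`\n]+`")               # `./some/path`
EMOJI_SHORTCODE = re.compile(r":[a-z0-9_+-]+:")       # :smile:
FENCE = re.compile(r"^(`{3,}|~{3,})", re.MULTILINE)   # opening/closing code fences


def protected_tokens(markdown: str) -> dict:
    """Collect the parts of a Markdown text that must survive translation unchanged."""
    return {
        "inline_code": sorted(INLINE_CODE.findall(markdown)),
        "emoji": sorted(EMOJI_SHORTCODE.findall(markdown)),
        "fence_count": len(FENCE.findall(markdown)),
    }


def translation_problems(source: str, translated: str) -> list:
    """Return the names of the checks on which source and translation disagree."""
    src = protected_tokens(source)
    dst = protected_tokens(translated)
    return [name for name in src if src[name] != dst[name]]


source = "Search `./posts/en/*.md` now :smile:"
good = "Suchen Sie jetzt `./posts/en/*.md` :smile:"
bad = 'Suchen Sie jetzt "./posts/en / *. Md" : smile:'

print(translation_problems(source, good))  # no differences
print(translation_problems(source, bad))   # inline code and emoji differ
```

A check like this only flags that something changed; it does not repair the translation.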

Simpleen Markdown Translator

Let's see how Simpleen handles the same Markdown example (in this case using DeepL for the translation itself).

## Einrichtung

Installieren Sie die CLI, um Dateien vom Quell- in den Zielpfad **zu übersetzen**.

<code language="shell">
yarn add simpleen
yarn run simpleen init
</code>

Sie können nach Dateien in `./blog/posts/de/*.md` suchen und sie in `./blog/posts/$locale/$FILE.md` übersetzen.

## Zusätzliche Unterstützung :smile:

- PO-Dateien
- JSON
- YAML


Because we love Markdown, we wanted to deliver better and more consistent results with an online translator that lets you translate Markdown into many languages.

Simpleen provides better results because we handle Markdown differently than other services. Instead of treating Markdown as plain text or converting it to HTML, which most MT services support, we go deeper and analyze the whole document structure of your Markdown files.

Furthermore, Simpleen understands the most common Markdown extensions and flavors and applies the provided styles from your file to the translation result. For example, if you use two spaces at the end of a line to break a line, we also use two spaces in the translated result.
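To make the line-break example concrete, here is a small sketch of how a hard-break style could be detected in the source and re-applied to a translation. It is an illustration only and not Simpleen's implementation; the function names are hypothetical, and the sketch ignores code fences, where trailing spaces or backslashes have no such meaning.

```python
def detect_hard_break(markdown: str):
    """Return the hard-break style used first: two trailing spaces or a trailing backslash."""
    for line in markdown.splitlines():
        if line.endswith("\\"):
            return "\\"
        if line.endswith("  ") and line.strip():
            return "  "
    return None  # the text contains no hard line break


def apply_hard_break(markdown: str, style: str) -> str:
    """Rewrite every hard line break in the text to the given style."""
    result = []
    for line in markdown.splitlines():
        if line.endswith("\\"):
            line = line[:-1] + style
        elif line.endswith("  ") and line.strip():
            line = line.rstrip(" ") + style
        result.append(line)
    return "\n".join(result)


source = "Line one  \nLine two"
translated = "Zeile eins\\\nZeile zwei"  # the translation came back with a backslash break

style = detect_hard_break(source)
print(repr(apply_hard_break(translated, style)))
```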

Supported Flavors & Extensions

Markdown comes in different flavors, each supporting a slightly different syntax for writing documentation, blog posts and more.
The most common flavors that are used and supported for translations by Simpleen are:

with the following extensions:

  • Emojis (😄 or 👍)
  • Footnotes (partial)
  • Frontmatter
  • Math

CommonMark is a Markdown flavor that many frameworks and libraries support or build upon, for example GatsbyJS with its remark transformer. Many headless content management systems support CommonMark as well.

Better Style Support

There are different valid ways to mark your headers, bold text, lists and more. Simpleen detects your style and reproduces it consistently in the translated Markdown file. For example, if you use a dash for your lists:

My shopping list:

- Dictionary
- Paper
- Pencil

then this Markdown example is translated to German like this:

Meine Einkaufsliste:

- Wörterbuch
- Papier
- Bleistift

If you use a star for your list instead, it is translated to:

Meine Einkaufsliste:

* Wörterbuch
* Papier
* Bleistift

Both results are valid in most Markdown flavors, but we want to consistently apply the styles from the provided Markdown file.
As a result, you can use the translated Markdown file directly in your Markdown documentation tool. Furthermore, an editor or human translator is not confused by differing styles during post-editing.
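The list-marker behavior described above can be approximated in a few lines. This is a hedged sketch rather than Simpleen's actual logic: `detect_bullet` and `apply_bullet` are hypothetical helpers, and the regular expression does not account for code fences or thematic breaks such as `- - -`.

```python
import re

# An unordered-list marker at the start of a line: indentation, marker, spacing.
BULLET = re.compile(r"^([ \t]*)([-*+])([ \t]+)", re.MULTILINE)


def detect_bullet(markdown: str, default: str = "-") -> str:
    """Return the first unordered-list marker used in the text, or a default."""
    match = BULLET.search(markdown)
    return match.group(2) if match else default


def apply_bullet(markdown: str, marker: str) -> str:
    """Rewrite every unordered-list marker to the given one, keeping indentation."""
    return BULLET.sub(lambda m: f"{m.group(1)}{marker}{m.group(3)}", markdown)


source = "My shopping list:\n\n* Dictionary\n* Paper\n* Pencil\n"
translated = "Meine Einkaufsliste:\n\n- Wörterbuch\n- Papier\n- Bleistift\n"

print(apply_bullet(translated, detect_bullet(source)))
```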

Translate .md/.mdx Files

A Markdown file contains multiple parts that need to be localized. Other parts, like code segments and frontmatter fragments (metadata), need to be excluded from translation.

Not translated:

  • Code Fences
  • Emojis
  • Frontmatter
  • Math Expressions
  • MDX (not yet supported, drop us a line if you'd like to use it)

Translated:

  • Headers (atx, setext)
  • Paragraphs with bold, italic styles, links, images
  • List Items
  • Table Headers
  • Table Entries
  • ToDo List Entries
  • Footnotes (partial, #fn-1 instead of ^1)
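The split between translated and untranslated parts can be sketched as a simple line-based segmenter. This is only an illustration under simplifying assumptions, not how Simpleen works internally: it protects frontmatter and fenced code blocks, but leaves inline code, emojis and math untouched by any special handling. `split_segments` and `translate_markdown` are hypothetical names, and `str.upper` stands in for a real translation call.

```python
def split_segments(markdown: str) -> list:
    """Split Markdown into (should_translate, text) pairs."""
    segments = []
    lines = markdown.splitlines(keepends=True)
    start = 0

    def push(translate: bool, text: str) -> None:
        # Merge with the previous segment when it has the same flag.
        if segments and segments[-1][0] == translate:
            segments[-1] = (translate, segments[-1][1] + text)
        else:
            segments.append((translate, text))

    # Frontmatter: a leading block delimited by '---' lines is kept as-is.
    if lines and lines[0].strip() == "---":
        for j in range(1, len(lines)):
            if lines[j].strip() == "---":
                push(False, "".join(lines[: j + 1]))
                start = j + 1
                break

    fence = None  # marker that opened the current code block, if any
    for line in lines[start:]:
        stripped = line.strip()
        if fence is None and stripped.startswith(("```", "~~~")):
            fence = stripped[:3]
            push(False, line)
        elif fence is not None:
            push(False, line)
            if stripped.startswith(fence):
                fence = None  # closing fence reached
        else:
            push(True, line)
    return segments


def translate_markdown(markdown: str, translate) -> str:
    """Apply `translate` to translatable segments only and reassemble the file."""
    return "".join(
        translate(text) if flag else text for flag, text in split_segments(markdown)
    )


doc = "---\ntitle: Setup\n---\n## Setup\n\n```shell\nyarn add simpleen\n```\n\nDone.\n"
print(translate_markdown(doc, str.upper))
```

A production tool would work on a parsed document tree rather than on raw lines, which is what makes it possible to protect inline elements as well.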

We have plans to improve the Markdown translation tool even more. Quick Roadmap:

  1. Handle internal links correctly (adapt them to the translated result)
  2. Handle footnote links
  3. Adapt the Simpleen CLI to support Markdown files in your local project
