DEV Community

Arsenii Kozlov

We added Spanish language support and here's what we had to change in our architecture

At spotsmap.com, we have supported three languages from the beginning, and the architecture of our multilingual support seemed quite good. However, we kept postponing support for more languages simply because even thinking about it caused us pain.

A few days ago, we finally bit the bullet and reorganized our architecture. This affected every part of our system: the database, backend, frontend, and email communication. Here, I will describe what we changed to make adding new languages as easy as possible. As a result, we were able to add support for Spanish without any difficulties.

I will also share some practices for improving localization, along with mistakes you should avoid.


We moved dictionaries from the database to the code repository

Initially, we used the following database schema for all dictionaries that required localization: in our case, countries, sports, student levels, and so on.

Locale (PK id, code, name) <- SportLocale (FK localeId, FK sportId, name) -> Sport (PK id, masterName)

Here, the table Locale contains supported languages. The table Sport contains all types of sports that are presented on our website, such as kitesurfing, windsurfing, etc. In this table, the field masterName is the default name for each type of sport in English. We use this name when we don't have a translation for a specific language. Lastly, the table SportLocale contains localized names of types of sports for all supported languages.
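Spelled out as DDL, that schema might look like the sketch below (PostgreSQL syntax; the table and column names follow the diagram, while the column types and constraints are my assumptions):

```sql
CREATE TABLE "Locale" (
  "id"   SERIAL PRIMARY KEY,
  "code" TEXT NOT NULL,  -- e.g. 'en', 'de', 'ru'
  "name" TEXT NOT NULL
);

CREATE TABLE "Sport" (
  "id"         SERIAL PRIMARY KEY,
  "masterName" TEXT NOT NULL  -- default English name, used as a fallback
);

CREATE TABLE "SportLocale" (
  "localeId" INTEGER NOT NULL REFERENCES "Locale" ("id"),
  "sportId"  INTEGER NOT NULL REFERENCES "Sport" ("id"),
  "name"     TEXT NOT NULL,
  PRIMARY KEY ("localeId", "sportId")
);
```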

Here is how the update process looked. To add a new value or change an existing one, a developer had to write a database migration, which then had to be applied to every database across the development stages. Such a migration also required careful review to ensure it would not break the production environment.

The migration could look like this:

-- Upgrade migration
INSERT INTO "Sport" ("id", "masterName")
VALUES (4, 'Sailing');

INSERT INTO "SportLocale" ("localeId", "sportId", "name")
VALUES
(1, 4, 'Sailing'),
(2, 4, 'Парусный спорт'),
(3, 4, 'Segeln'),
(4, 4, 'Navegación');

-- Downgrade migration
DELETE FROM "SportLocale" WHERE "sportId" = 4;
DELETE FROM "Sport" WHERE "id" = 4;

The more languages we supported, the more lines of code each migration needed, and, unfortunately, the more mistakes we made. Moreover, developers had to prepare translations themselves or wait for them from their teammates, which significantly increased development time.

What we did: We removed all localization tables and dictionary tables from the database, such as Sport and SportLocale. Instead, we added a dictionary JSON file dictionaries/en/sport.json to our backend repository. In the data tables, we replaced the ID of the sport type with its corresponding key.

{
  "kitesurfing": "Kitesurfing",
  "windsurfing": "Windsurfing",
  "sup": "SUP"
}

Now, to add a new sport, a developer only has to add one line to the JSON file, and only for English. Just one line, really. A bit later, I will explain how we translate dictionaries into all supported languages and how this helps us easily support new ones.
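With dictionaries keyed by string instead of ID, looking up a localized name with an English fallback becomes trivial. Here is a minimal sketch; the `getSportName` helper and the inlined dictionaries (normally loaded from `dictionaries/<locale>/sport.json`) are my assumptions for illustration, including the Spanish sample values:

```typescript
type Dictionary = Record<string, string>;

// In production these would be read from dictionaries/<locale>/sport.json;
// inlined here so the example is self-contained. "sup" is deliberately
// left untranslated in Spanish to show the fallback.
const dictionaries: Record<string, Dictionary> = {
  en: { kitesurfing: "Kitesurfing", windsurfing: "Windsurfing", sup: "SUP" },
  es: { kitesurfing: "Kitesurf", windsurfing: "Windsurf" },
};

// Resolve a sport's display name by key, falling back to English
// (the old masterName role) when the locale has no translation yet.
function getSportName(key: string, locale: string): string {
  return dictionaries[locale]?.[key] ?? dictionaries.en[key] ?? key;
}

console.log(getSportName("kitesurfing", "es")); // localized value
console.log(getSportName("sup", "es")); // falls back to English "SUP"
```

The fallback chain mirrors the old `masterName` behavior: a missing translation degrades gracefully to English instead of breaking the page.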

We added the generation of a description based on structured data instead of using pre-rendered texts

We display a description of a sports school on its page. These pages are generated statically to make them SEO-friendly. Initially, we generated texts when we added a new school to the database. There was a separate description for every supported language, stored in the SchoolLocale table.

Locale (PK id, code, name) <- SchoolLocale (FK localeId, FK schoolId, description) -> School (PK id)

It was fine when we added a school: we simply generated three descriptions at once and stored them with the school. However, adding a new language would require generating more than 1,000 texts, one for every school. Moreover, we used external services, such as the Google Places API, to collect additional details about each school, and these details can change over time. So adding a new language meant regenerating all descriptions: more than 4,000 for just 4 languages. What about the 40 supported languages we are aiming for?

This solution was neither maintainable nor easy to scale.

How we changed the approach.

We removed the SchoolLocale table and saved structured JSON data in the School table, containing all the information needed to generate a school description, such as nearby hotels, airports, and more. Now, we generate all the pages on the frontend side at build time (or during incremental static regeneration).

To achieve this, we added a new localization file public/locales/en/description.json, which looks like:

{
  "sections": {
    "hotels": {
      "description": "{SCHOOL_HOTELS} {POINT_TYPE} {HOTEL_STARS} is {HOTEL_DISTANCE} from the school."
    }
  }
}
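Filling such a pattern at build time only requires substituting the `{PLACEHOLDER}` tokens with values from the structured school data. A minimal sketch, assuming a simple token format (the `render` helper and the sample values are mine; in practice an i18n library would handle this):

```typescript
// Replace {TOKEN} placeholders with values; unknown tokens are left intact
// so a missing value is visible rather than silently dropped.
function render(template: string, values: Record<string, string>): string {
  return template.replace(/\{(\w+)\}/g, (match, key) => values[key] ?? match);
}

const template =
  "{SCHOOL_HOTELS} {POINT_TYPE} {HOTEL_STARS} is {HOTEL_DISTANCE} from the school.";

// Hypothetical structured data for one school.
const text = render(template, {
  SCHOOL_HOTELS: "Hotel Playa",
  POINT_TYPE: "hotel",
  HOTEL_STARS: "4 stars",
  HOTEL_DISTANCE: "300 m",
});
console.log(text);
// "Hotel Playa hotel 4 stars is 300 m from the school."
```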

Now, to add a new language, we just add a localized description file, e.g. public/locales/es/description.json. This takes no developer time at all: developers no longer have to do anything here to support more languages.

We rearranged our multilingual email templates

Before, we used conditional statements in our email templates. It looked like this:

...
<h2>
{{#en}}Arrival date:{{/en}}
{{#de}}Ankunftsdatum:{{/de}}
{{#ru}}Дата прибытия:{{/ru}}
</h2>
<p>{{ArrivalDate}}</p>
...

However, we had over 40 localized strings in the template. Even with just three supported languages, that meant more than 120 lines of code solely for template localization. Now consider the scenario with 10 or 40 languages.

For 40 supported languages, the template would contain a staggering 1,600 lines of localization code alone. This is hardly maintainable, yet our objective is to support 40 or more languages and make spotsmap.com accessible to people worldwide.

Moreover, adding support for a new language meant hard, manual work for a developer. So we decided to explore an alternative. To tackle these challenges, we moved the localization strings from the template into our codebase and adopted the same standardized translation process we use for website UI localization.

The revised template structure:

...
<h2>{{ArrivalDateTitle}}</h2>
<p>{{ArrivalDate}}</p>
...

We created a JSON file templates/en/bookingConfirmation.json, which houses localization strings for English:

{
  "ArrivalDateTitle": "Arrival date"
}

And again, there is no work for developers when adding support for a new language. All we need is a translated JSON file for the corresponding language.
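At send time, the localized strings and the per-booking data are merged and fed to the template engine. A minimal sketch of that merge, where the tiny `render` helper stands in for a real Mustache engine and the Spanish strings and variable names are my assumptions:

```typescript
// Minimal {{token}} substitution, standing in for a Mustache renderer.
function render(template: string, vars: Record<string, string>): string {
  return template.replace(/\{\{(\w+)\}\}/g, (m, key) => vars[key] ?? m);
}

// Strings as they might appear in templates/es/bookingConfirmation.json
// (translation assumed for illustration).
const strings = { ArrivalDateTitle: "Fecha de llegada" };

// Per-booking data supplied by the backend at send time.
const data = { ArrivalDate: "2024-06-01" };

// Localized strings and dynamic data share one variable namespace,
// so the template never needs to know which is which.
const html = render("<h2>{{ArrivalDateTitle}}</h2><p>{{ArrivalDate}}</p>", {
  ...strings,
  ...data,
});
console.log(html);
// "<h2>Fecha de llegada</h2><p>2024-06-01</p>"
```

Because titles and data flow through the same mechanism, the template itself stays language-agnostic.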

We moved to Gitloc for translating JSON files

From the beginning, preparing all translations before a task went to development was the product manager's responsibility. Did it work? Actually, no. It took too much time just to start development, which we couldn't afford, so developers ended up preparing a lot of translations manually. Nowadays, you can use ChatGPT to translate a full JSON file. But what about maintenance, when you have to add a few keys across different localization files? Every day. It's a monotonous task that every developer hates. Moreover, it's low-qualification work, yet you pay for it at highly qualified developer rates.

Now we use Gitloc, a "developer first" localization platform. We connected our repository to Gitloc and added a configuration file, gitloc.yaml. For example, here is the gitloc.yaml used in our backend repository; we use almost the same file in our frontend repository.

config:
  defaultLocale: en
  locales:
    - en
    - de
    - ru
  directories:
    - dictionaries

Now, to add a new language, developers just have to add one line in the gitloc.yaml file and push it to the remote Git repository. Gitloc detects that a new language has been added and translates all localization files to the corresponding language. It then commits the translations to the repository. Nothing more.

The same process applies for application maintenance. Developers add new keys in localization files, and Gitloc detects these changes and translates them.
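Conceptually, detecting what needs translation on each commit is a diff between the default locale and each target locale. A hypothetical sketch of that kind of check (the function and sample dictionaries are mine, not Gitloc's actual implementation):

```typescript
type Dictionary = Record<string, string>;

// Find keys present in the default locale but absent from a translation:
// only these need to be translated after a commit.
function missingKeys(defaultDict: Dictionary, localized: Dictionary): string[] {
  return Object.keys(defaultDict).filter((key) => !(key in localized));
}

const en = { kitesurfing: "Kitesurfing", windsurfing: "Windsurfing", sup: "SUP" };
const de = { kitesurfing: "Kitesurfen", windsurfing: "Windsurfen" };

const toTranslate = missingKeys(en, de);
console.log(toTranslate); // ["sup"]
```

Adding a whole new locale is just the degenerate case where the localized dictionary starts empty and every key is reported as missing.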

Just a single line to support a new language. It's unbelievable.

More localization practices that we learned

Use patterns for localization strings

For example, use the pattern School works from {FROM} until {UNTIL} instead of concatenating strings. Patterns are easier to use and give translators more context.

Store dictionaries alongside your other localization files

Instead of storing dictionaries that require translation, such as months, in a configuration file, we have moved them all to the public/locales folder. This allows us to use the same localization methods for UI localization.

Store only localization strings in localization files

Do not store business logic that depends on language, such as date and currency formats, in localization files. It is better to use configuration files or internationalization libraries that can format dates and currencies according to the locale.

Avoid using arrays in localization files

Instead of using arrays, it is better to use key-value pairs. Many i18n libraries recommend this approach. Additionally, using arrays can result in more translations if something changes. For example, adding a new item in the middle of an array would require translating half of the array. However, adding a new key to an existing object only requires translating one new value.
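To make the array-vs-object advice concrete, here is a small illustration using months (the variable names are mine):

```typescript
// With an array, the index carries the meaning: inserting an item in the
// middle shifts every later position, invalidating translations keyed by
// index. With an object, each key stays stable.
const monthsObject: Record<string, string> = {
  january: "January",
  february: "February",
  march: "March",
};

// Adding one month adds exactly one stable key, so only one new value
// needs translating in each locale file.
monthsObject["april"] = "April";
console.log(Object.keys(monthsObject).length); // 4
```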

Avoid using leading and trailing spaces in localization strings

Instead of using a localization string like "and": " and ", it is better to use "and": "and" and add necessary spaces in your code. You can even use "description": "{SOMETHING1} and {SOMETHING2}" if possible. This helps avoid translation mistakes, whether you use automated or manual translations.


What was the reason for adding support for Spanish?

We realized that most of our users are located in countries that speak the languages we support. Previously, we supported three languages: German, English, and Russian. The majority of our users were from Germany, Switzerland, Austria, and Russia. However, we also had a small percentage of users from Spain. We are curious to see how adding Spanish will affect the geographic distribution of our users.

I believe that in a couple of months, we will have collected enough statistics to see whether there is any improvement. I will definitely share the results with you. Just follow me so you don't miss the outcome of this experiment.

I hope you found something helpful!
