The Complete Guide to Localizing your App with JavaScript's Internationalization API

OpenReplay Tech Blog
Tech blog for Asayer.io. Quality content by developers for developers interested in JavaScript and related front-end technologies.
Originally published at blog.openreplay.com ・17 min read

by author Craig Buckler

Like most native-English web developers, I consider internationalization (i18n) too late. English seems ubiquitous, yet it's spoken by fewer than one in seven people worldwide. It's the first language of 379 million people but:

  • 917 million speak Mandarin Chinese
  • 460 million speak Spanish, and
  • 341 million speak Hindi.

Non-English speakers often live in emerging markets enjoying exponential internet growth driven by inexpensive mobile devices and connectivity. The JavaScript Internationalization API (i18n) can help localize your site or app and increase its exposure to massive global populations (try the localization tool).

Internationalization seems simple...

...until you try it.

Tools such as Google Translate and Microsoft Translator are increasingly viable for language translation. Older systems translated words separately without any context which could mangle sentences. Modern translators use machine learning techniques to analyse key phrases and the results are noticeably better. Unless you have unusual terminology, it may be possible to auto-translate web articles then pay an editor to double-check the language.

Web apps often use far less text than articles so the process should be easier. In some cases, you can avoid text altogether. If someone is rating your app, most will understand the meaning of emojis such as ❤️, 😊, 👍, or 💩.

Your app can also use pictorial icons to convey meaning or functionality. For example, everyone recognises this icon:

Not the hamburger icon!

You click it when using a word processor to justify text so it aligns with both the left and right margins.

Be honest: you assumed the icon represented a hamburger menu!

Similarly, what would you expect this icon to do?...

Save icon

It's commonly used to save data, yet few people under the age of 30 would have ever seen a floppy disk. The PC revolution was predominantly a western phenomenon which bypassed much of the world where mobile phones reign supreme.

The point is that words will always be necessary to avoid confusion caused by our pictographic complacency! One simple development option is tokenization in your app templates, e.g.

<label for="email">{{ EMAIL }}</label>
<input type="email" id="email" name="email" placeholder="{{ EMAIL }}" />

The word "email" replaces the {{ EMAIL }} token when it's presented to an English-speaking user. Turkish speakers get "e-posta", Samoan speakers get "imeli" and so on.
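A minimal sketch of that token replacement is shown below. The `tokens` dictionary and the `localizeTemplate()` function are hypothetical names used for illustration; a real app would probably load its strings from language files rather than hard-code them.

```javascript
// Hypothetical token dictionaries; a real app would load these from JSON files.
const tokens = {
  en: { EMAIL: 'email' },
  tr: { EMAIL: 'e-posta' },
  sm: { EMAIL: 'imeli' },
};

// Replace every {{ TOKEN }} in a template with the string for the given language.
// Unknown languages fall back to English; unknown tokens fall back to the token name.
function localizeTemplate(template, lang) {
  const dict = tokens[lang] || tokens.en;
  return template.replace(/\{\{\s*(\w+)\s*\}\}/g, (match, name) =>
    dict[name] ?? tokens.en[name] ?? name
  );
}

const turkishLabel = localizeTemplate('<label for="email">{{ EMAIL }}</label>', 'tr');
// '<label for="email">e-posta</label>'
```

The sketch assumes the translated strings are trusted; strings from an untrusted source would need HTML escaping before they reach the page.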

A user setting can determine the language. Alternatively, server-side code can read the browser's HTTP Accept-Language request header and pull the correct language from a JSON file or similar.
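Header detection could look something like the following sketch. The `supported` list and both function names are assumptions made for the example, and the parser only handles the common `tag;q=value` form of the header.

```javascript
// Hypothetical list of languages the app has translations for.
const supported = ['en', 'es', 'tr'];

// Parse a header such as "tr-TR,tr;q=0.9,en;q=0.8" into tags ordered by preference.
function parseAcceptLanguage(header = '') {
  return header
    .split(',')
    .map(part => {
      const [tag, ...params] = part.trim().split(';');
      const q = params.map(p => p.trim()).find(p => p.startsWith('q='));
      return { tag: tag.trim(), q: q ? parseFloat(q.slice(2)) : 1 };
    })
    .filter(item => item.tag && item.tag !== '*' && !Number.isNaN(item.q))
    .sort((a, b) => b.q - a.q)
    .map(item => item.tag);
}

// Pick the first preferred language the app supports, matching on the
// primary language subtag only ("tr-TR" matches "tr").
function pickLanguage(header, fallback = 'en') {
  for (const tag of parseAcceptLanguage(header)) {
    const primary = tag.split('-')[0].toLowerCase();
    if (supported.includes(primary)) return primary;
  }
  return fallback;
}

const chosen = pickLanguage('fr-CH, fr;q=0.9, tr;q=0.8, en;q=0.7'); // "tr"
```

A stored user preference should still override whatever the header suggests.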

Please don't set a user's language based on their physical position determined from an IP address or Geolocation API look-up. The current country does not always determine the language -- consider Canada, Switzerland, and India. Russia alone has 24 official languages! The user could also be travelling or using a proxy server.

The gettext library takes tokenization to another level by allowing you to identify strings in your code. It creates language files that a translator can edit and put back into the application.

Language tokenization helps if you consider it from the start although it can be difficult to retrofit an existing app.

Unfortunately, this is the start of your problems. Latin-based languages can be superficially similar. Consider a sign-up form requesting your name, email, credit card, and a date:

  • In Spanish: nombre, email, tarjeta de crédito, fecha
  • In French: nom, e-mail, carte de crédit, date
  • In German: Name, E-Mail, Kreditkarte, Datum

It looks simple, but...

There can be variations of the same language

The Spanish spoken in Spain is not identical to that spoken in South America. The Portuguese spoken in Brazil has diverged from that spoken in Portugal. Chinese has at least seven different dialects.

Further token files can overcome this problem but the scale of the challenge may be greater than you expect.

Plurals are difficult

You can pluralise English nouns by adding an 's', for example: 1 cat, 2 cats, 3 cats. Polish, Czech, Latvian, and other Eastern European languages can have up to seven different noun types depending on the quantity.

You may be able to avoid pluralisation issues with evasive techniques. For example, rather than displaying:

"You have ordered 3 tins of cat food"

you opt for:

"tins of cat food ordered: 3"

Language string lengths

Word string lengths can be considerably shorter or longer than English. The word "email" translates to "tölvupóstur" in Icelandic and "электронное письмо" in Russian. You must ensure these strings fit into your English-based UI.

Alternative text orientations

Text is not always oriented from left to right:

  • Arabic, Hebrew, Kurdish, and Yiddish use right to left
  • Chinese, Korean, Japanese, and Taiwanese can use top to bottom

Newer CSS properties such as direction, writing-mode, text-orientation, text-combine-upright, unicode-bidi, and logical block/inline dimensions will help. Designing a flexible UI that works in all directions is another matter.

Data formatting

Finally, problems still arise when your app targets English-speaking locales and predominantly shows data rather than text. Consider this string:

Your next billing date: 12/03/24

The slash date delimiter isn't used everywhere but we'll presume most people understand it.

Users could read it as:

  • "3 December 2024" in US locales that use M-D-Y format
  • "12 March 2024" in British, European, South American, and Asian locales that use D-M-Y format, and
  • "24 March 2012" in Canadian, Chinese, Japanese, and Hungarian locales that use the considerably more practical Y-M-D format (the most significant digits come first so it's easier to sort).

Or consider your next billing cycle:

Next bill: 1,000 crypto-units (or whatever currency you're using)

Users could read it as:

  • "one thousand" in the US, UK, Canada, China, and Japan, but
  • those in Spain, France, Germany, and Russia get the bargain price of "1" because a comma denotes the number's decimal fraction.

Even if you avoid all numeric punctuation and create an exercise app for English speakers...

Today's target: 10000 meters

This translates to 6.2 miles in the USA. Those in the UK, Canada, and Australia would expect 10 thousand measuring instruments - a "meter" rather than a "metre"!

Admittedly, few people would be confused because the context is clear and non-US English speakers commonly see US terms. That won't always be the case!

It's easy to gloss over these problems, cause confusion, and increase your user support costs. Fortunately, the JavaScript Intl API can resolve some common data formatting issues...

The JavaScript Intl API

The ECMAScript Internationalization API has good support in:

  1. All modern browsers. Even IE11 supports 93% of the standard with Safari on iOS reaching 96%.
  2. Node.js 14 and above, but partial support was available from the start.
  3. Deno 1.8 and above.

The Intl API has been somewhat unstable and can be a little confusing but it's usable today. If necessary, you can check for browser support with:

if (window.Intl) {
  // Intl API is available
}

An Intl polyfill is also available but you're unlikely to need it.
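Individual Intl objects arrived at different times, so it can be safer to test for the specific constructors you use. This is a small sketch; `hasIntl()` is a hypothetical helper name, and it checks the global `Intl` rather than `window.Intl` so it also runs in Node.js and Deno.

```javascript
// Returns true only if the Intl API and every named constructor are available.
function hasIntl(...features) {
  return typeof Intl === 'object' &&
    features.every(name => typeof Intl[name] === 'function');
}

if (hasIntl('DateTimeFormat', 'NumberFormat')) {
  // safe to format dates and numbers with the Intl API
}
```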

The API provides automatic localization of:

  1. dates and times
  2. relative periods such as yesterday and tomorrow
  3. numbers including currencies, percentages, and units
  4. names of languages, regions, scripts, and currencies
  5. lists with conjunctions (and) and disjunctions (or)
  6. plural helpers
  7. string comparisons, such as equating a with á

The following example returns a string containing the current date and time in your local format:

new Intl.DateTimeFormat(
  [],
  { "dateStyle": "short", "timeStyle": "short" }
).format();

That could be:

  • "8/31/21, 1:23 PM" in the US
  • "31/08/2021, 13:23" in the UK, or
  • "2021/08/31 13:23" in Japan.

Set a locale

All Intl objects accept a locales argument which identifies:

  • a language subtag
  • (optionally) a script subtag
  • (optionally) a region or country subtag
  • (optionally) one or more unique variant subtags
  • (optionally) one or more BCP 47 extension sequences, and
  • (optionally) a private-use extension sequence

The API will almost certainly support every language you require -- even Klingon! In most cases, a language ("en") or a language and region ("en-US", "en-GB", "en-CA") is adequate:

const us = new Intl.Locale('en-US');

A second options object can set further parameters, e.g.

const us = new Intl.Locale(
  'en',
  { region: 'US', hourCycle: 'h12', calendar: 'gregory' }
);

All Intl objects accept a Locale object as the first parameter:

const
  usa = new Intl.Locale('en-US'),
  now = new Intl.DateTimeFormat(usa, { "timeStyle": "short" }).format();

although you can define a locale as a string alone (as used in most of the examples below):

const now = new Intl.DateTimeFormat('en-US', { "timeStyle": "short" }).format();

An empty array ([]) denotes the user's current locale:

const now = new Intl.DateTimeFormat([], { "timeStyle": "short" }).format();
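It can help to inspect what a locale actually resolves to. The sketch below uses standard `Intl.Locale` properties along with `Intl.getCanonicalLocales()` and `supportedLocalesOf()`; the variable names are arbitrary.

```javascript
const usLocale = new Intl.Locale('en', {
  region: 'US', hourCycle: 'h12', calendar: 'gregory'
});

usLocale.language;   // "en"
usLocale.region;     // "US"
usLocale.baseName;   // "en-US"
usLocale.toString(); // "en-US-u-ca-gregory-hc-h12"

// Normalize a tag that arrives in non-standard letter case
const canonical = Intl.getCanonicalLocales('EN-us'); // ["en-US"]

// Which of the requested locales can this runtime format dates for?
const available = Intl.DateTimeFormat.supportedLocalesOf(['en-US', 'fr-FR']);

// The locale the runtime resolves when you pass none
const current = new Intl.DateTimeFormat().resolvedOptions().locale;
```

The last two results depend on the runtime and the user's settings, so no output is shown for them.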

Date and time localization

You can format date and times with the Intl.DateTimeFormat() object initialized with:

  1. A locale object, string, or empty array for the user's current locale, and
  2. An options object.

Use the datetime builder tool...

The tool generates code for example dates and times with the most useful settings. The full list of options properties:

property | description
timeZone | time zone: "UTC", "America/New_York", "Europe/Paris", etc.
calendar | options include: "chinese" "gregory" "hebrew" "indian" "islamic" etc.
numberingSystem | numbering system: "arab" "beng" "fullwide" "latn" etc.
localeMatcher | locale matching algorithm: "lookup" "best fit"
formatMatcher | format matching algorithm: "basic" "best fit"
hour12 | set true to use 12-hour time notation
hourCycle | hour cycle: "h11" "h12" "h23" "h24"
dateStyle | the date style: "full" "long" "medium" "short"
weekday | weekday format: "long" "short" "narrow"
day | day format: "numeric" "2-digit"
month | month format: "numeric" "2-digit" "long" "short" "narrow"
year | year format: "numeric" "2-digit"
era | era format (AD, BC): "long" "short" "narrow"
timeStyle | the time style: "full" "long" "medium" "short"
hour | hour format: "numeric" "2-digit"
minute | minute format: "numeric" "2-digit"
second | second format: "numeric" "2-digit"
dayPeriod | period expressions (morning, night, noon): "narrow" "short" "long"
timeZoneName | time zone name (such as UTC or PST): "long" "short"

Like most Intl objects, DateTimeFormat objects can generate any number of localized strings using a format() method which accepts a standard JavaScript Date(). For example:

// US short dates
// date-only strings are parsed as UTC, so format in UTC to avoid an off-by-one day
const dt = new Intl.DateTimeFormat(
  "en-US", { dateStyle: "short", timeZone: "UTC" }
);

console.log(dt.format( new Date('2022-05-03') )); // "5/3/22"
console.log(dt.format( new Date('2022-05-04') )); // "5/4/22"

Note: the new TC39 Temporal object will provide an alternative to the old and clunky Date() object in future editions of JavaScript. Whether it's supported by Intl methods at the same time is another matter.

More .format() method examples:

// US short date and time: "5/4/22, 1:00 PM"
new Intl.DateTimeFormat(
    "en-US", { dateStyle: "short", timeStyle: "short" }
  ).format( new Date("2022-05-04T13:00") );

// UK long date, short time: "4 May 2022 at 13:00"
new Intl.DateTimeFormat(
    "en-GB", { dateStyle: "long", timeStyle: "short" }
  ).format( new Date("2022-05-04T13:00") );

// Japanese short date, no time: "2022/05/04"
new Intl.DateTimeFormat(
    "ja-JP", { dateStyle: "short" }
  ).format( new Date("2022-05-04T13:00") );

// Spanish full date and time
// "miércoles, 4 de mayo de 2022, 13:00:00 (hora de verano británica)"
// this will change depending on your local timezone
new Intl.DateTimeFormat(
    "es-ES", { dateStyle: "full", timeStyle: "full" }
  ).format( new Date("2022-05-04T13:00") );

// French custom date and time: "mercredi 04 mai 2022 à 13 h"
new Intl.DateTimeFormat(
    "fr-FR",
    {
      "weekday": "long",
      "day": "2-digit",
      "month": "long",
      "year": "numeric",
      "hour": "2-digit"
    }
  ).format( new Date("2022-05-04T13:00") );
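The timeZone option deserves its own example because the dates above are formatted in whatever zone the device uses. The following sketch shows one fixed instant in three zones; `meeting` and `timeIn()` are names invented for the illustration.

```javascript
// One instant in time: 13:00 UTC on 4 May 2022
const meeting = new Date('2022-05-04T13:00:00Z');

// Format the same instant as a 24-hour time in a named time zone
function timeIn(timeZone) {
  return new Intl.DateTimeFormat('en-GB', {
    hour: '2-digit', minute: '2-digit', timeZone
  }).format(meeting);
}

timeIn('UTC');              // "13:00"
timeIn('America/New_York'); // "09:00"
timeIn('Asia/Tokyo');       // "22:00"
```

Storing timestamps in UTC and choosing the time zone only at display time keeps this kind of conversion in one place.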

Other Intl.DateTimeFormat methods include:

  1. .formatToParts()

Returns an array of objects containing the formatted date in name/value pairs such as { type: "weekday", value: "Monday" }.

  2. .formatRange(startDate, endDate)

Returns a concise date range, such as "01/10/2022, 10:00 AM - 12:00 PM".

  3. .formatRangeToParts()

Returns an array of objects containing the formatted date range in name/value pairs.

  4. .resolvedOptions()

Returns an object with properties reflecting the computed formatting options for the locale, date, and time.
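As a rough illustration of these methods, the sketch below pulls named parts out of a formatted date and formats a range. The variable names are arbitrary, and the formatter is pinned to UTC so the result does not depend on the device's time zone.

```javascript
const partsFormat = new Intl.DateTimeFormat('en-US', {
  year: 'numeric', month: 'long', day: 'numeric', timeZone: 'UTC'
});

// An array such as [{ type: 'month', value: 'May' }, { type: 'literal', value: ' ' }, ...]
const parts = partsFormat.formatToParts(new Date('2022-05-04T13:00:00Z'));

// Collect the non-literal parts into a plain object
const named = Object.fromEntries(
  parts.filter(p => p.type !== 'literal').map(p => [p.type, p.value])
);
// { month: 'May', day: '4', year: '2022' }

// One string covering both dates; the exact punctuation varies by runtime
const range = partsFormat.formatRange(
  new Date('2022-05-04T13:00:00Z'),
  new Date('2022-05-06T13:00:00Z')
);
```

formatToParts() is handy when each part needs its own markup, such as a bold month name.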

Relative period localization

The Intl.RelativeTimeFormat() object generates human-friendly period strings such as "yesterday", "tomorrow", "next week", "last month", "2 years". Initialize it with:

  1. A locale object, string, or empty array for the user's current locale, and
  2. An options object.

Use the relative period builder tool...

The tool generates code for example periods using options properties:

property | description
numeric | either "always" for "1 day ago" or "auto" for "yesterday"
style | format: "long" "short" "narrow"

More .format() method examples:

// en-US: "tomorrow"
new Intl.RelativeTimeFormat(
  "en-US",
  { "numeric": "auto" }
).format( 1, "day" );

// fr-FR: "demain"
new Intl.RelativeTimeFormat(
  "fr-FR",
  { "numeric": "auto" }
).format( 1, "day" );

// ru-RU: "завтра"
new Intl.RelativeTimeFormat(
  "ru-RU",
  { "numeric": "auto" }
).format( 1, "day" );

The following methods are also supported:

  1. .formatToParts()

Returns an array of objects containing period name/value pairs.

  2. .resolvedOptions()

Returns an object with properties reflecting the computed formatting options.
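Intl.RelativeTimeFormat() does not work out the difference between two dates for you; you pass a value and a unit. The sketch below is one possible way to choose them. `relativePeriod()` and the `UNITS` table are hypothetical, and months and years use approximate lengths.

```javascript
// Unit lengths in milliseconds, largest first (month and year are approximations)
const UNITS = [
  ['year', 365 * 24 * 60 * 60 * 1000],
  ['month', 30 * 24 * 60 * 60 * 1000],
  ['day', 24 * 60 * 60 * 1000],
  ['hour', 60 * 60 * 1000],
  ['minute', 60 * 1000],
  ['second', 1000],
];

// Describe `target` relative to `base` using the largest unit that fits
function relativePeriod(target, base = new Date(), locale = []) {
  const rtf = new Intl.RelativeTimeFormat(locale, { numeric: 'auto' });
  const diff = target - base;
  for (const [unit, ms] of UNITS) {
    if (Math.abs(diff) >= ms || unit === 'second') {
      return rtf.format(Math.round(diff / ms), unit);
    }
  }
}

const base = new Date('2022-05-04T12:00:00Z');
relativePeriod(new Date('2022-05-03T12:00:00Z'), base, 'en-US'); // "yesterday"
relativePeriod(new Date('2022-05-04T14:00:00Z'), base, 'en-US'); // "in 2 hours"
relativePeriod(new Date('2022-05-05T12:00:00Z'), base, 'fr-FR'); // "demain"
```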

Number, currency, percentage, and unit localization

The Intl.NumberFormat() object generates numbers, currencies, percentages, and units such as lengths, temperatures, times, storage, and more. This is one of the most-used options and you initialize it with:

  1. A locale object, string, or empty array for the user's current locale, and
  2. An options object.

Use the unit builder tool...

The tool generates code for example units using options properties:

property | description
style | formatting: "decimal" "currency" "percent" "unit". Other properties are set according to this value
notation | type: "standard" "scientific" "engineering" "compact"
numberingSystem | options include "arab" "beng" "deva" "fullwide" "latn" etc.
minimumIntegerDigits | minimum number of integer digits
minimumFractionDigits | minimum number of fraction digits
maximumFractionDigits | maximum number of fraction digits
minimumSignificantDigits | minimum number of significant digits
maximumSignificantDigits | maximum number of significant digits
signDisplay | display the +/- sign: "auto" "never" "always" "exceptZero"
useGrouping | set false to disable thousands separators
compactDisplay | format when using compact notation: "long" "short"
currency | currency code when using currency style: "USD" "EUR" "GBP" etc.
currencyDisplay | currency formatting when using currency style: "symbol" "narrowSymbol" "code" "name"
currencySign | format negative values when using currency style: "standard" "accounting"
unit | a unit type when using unit style: "centimeter" "inch" "hour" "byte" etc.
unitDisplay | unit format: "long" "short" "narrow"

More .format() method examples:

// en-US: "1,234.56"
new Intl.NumberFormat(
  "en-US",
  {}
).format( 1234.56 );

// de-DE: "1.234,56"
new Intl.NumberFormat(
  "de-DE",
  {}
).format( 1234.56 );

// en-CA: "US$1,234.56"
new Intl.NumberFormat(
  "en-CA",
  { "style": "currency", "currency":"USD" }
).format( 1234.56 );

// fr-FR: "123 456 %"
new Intl.NumberFormat(
  "fr-FR",
  { "style":"percent" }
).format( 1234.56 );

// en-GB: "1,234.560°C"
new Intl.NumberFormat(
  "en-GB",
  { "style": "unit", "unit": "celsius", minimumFractionDigits: 3 }
).format( 1234.56 );

The following methods are also supported:

  1. .formatToParts()

Returns an array of objects containing the formatted number in name/value pairs.

  2. .resolvedOptions()

Returns an object with properties reflecting the computed formatting options.
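Returning to the 10000 meter exercise target from earlier, a short sketch of unit, compact, and currency formatting might look like this. Note the API only formats a value: converting meters to miles is your job, and the variable names here are illustrative.

```javascript
const targetMeters = 10000;

// US users: convert to miles yourself, then format
const miles = new Intl.NumberFormat('en-US', {
  style: 'unit', unit: 'mile', maximumFractionDigits: 1
}).format(targetMeters / 1609.344); // "6.2 mi"

// UK users: kilometres, written out in full with the local spelling
const km = new Intl.NumberFormat('en-GB', {
  style: 'unit', unit: 'kilometer', unitDisplay: 'long'
}).format(targetMeters / 1000); // "10 kilometres"

// Compact notation for large counts
const followers = new Intl.NumberFormat('en-US', {
  notation: 'compact'
}).format(1234567); // "1.2M"

// German currency: the space before the symbol is a non-breaking space
const price = new Intl.NumberFormat('de-DE', {
  style: 'currency', currency: 'EUR'
}).format(1234.56); // "1.234,56 €"
```

Unit identifiers such as "kilometer" always use the US spelling, whatever the output locale.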

Language, region, and currency name localization

The Intl.DisplayNames() object generates names for languages, scripts, regions, and currencies. You would typically use it to create language or similar selectors (English, inglés, anglais etc.) Initialize it with:

  1. A locale object, string, or empty array for the user's current locale, and
  2. An options object.

Use the name builder tool...

The tool generates code for example names using options properties:

property | description
type | either "language" "region" "script" "currency"
style | format: "long" "short" "narrow"
fallback | either: "code" "none"

The .of() method returns a string according to a code passed. Examples:

// French language in Italian: "francese (Francia)"
new Intl.DisplayNames(
  "it-IT",
  { "type": "language" }
).of( "fr-FR" );

// Egyptian hieroglyphs in German: "Ägyptische Hieroglyphen"
new Intl.DisplayNames(
  "de-DE",
  { "type": "script" }
).of( "Egyp" );

// Australia in French: "Australie"
new Intl.DisplayNames(
  "fr-FR",
  { "type": "region" }
).of( "AU" );

// British Pounds in Polish: "funt szterling"
new Intl.DisplayNames(
  "pl-PL",
  { "type": "currency" }
).of( "GBP" );

A .resolvedOptions() method is also available which returns an object with properties reflecting the computed formatting options.
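For the language selector use case, a common approach is to show each language in its own language so users can find theirs whatever the current setting. A sketch, assuming a hypothetical `supportedLanguages` list:

```javascript
// Hypothetical list of languages the app has been translated into
const supportedLanguages = ['en', 'es', 'fr', 'de'];

// Name each language in that language itself
const languageOptions = supportedLanguages.map(code => ({
  code,
  label: new Intl.DisplayNames([code], { type: 'language' }).of(code)
}));
// [
//   { code: 'en', label: 'English' },
//   { code: 'es', label: 'español' },
//   { code: 'fr', label: 'français' },
//   { code: 'de', label: 'Deutsch' }
// ]
```

Each entry could then populate an option element in a select control.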

List localization

The Intl.ListFormat() object can format an array of items into a localized list which uses a conjunction (an "and" before the last item in English) or a disjunction (an "or" before the last item in English).

Note: it cannot localize the text of the items in the list.

Initialize the object with:

  1. A locale object, string, or empty array for the user's current locale, and
  2. An options object.

Use the list builder tool...

The tool generates code for example lists using options properties:

property | description
type | output format: "conjunction" (and), "disjunction" (or), "unit" (none)
style | formatting: "long" "short" "narrow"

More .format() method examples:

const browsers = ["Chrome", "Firefox", "Safari"];

// US (and): "Chrome, Firefox, and Safari"
new Intl.ListFormat(
  "en-US",
  { "type": "conjunction" }
).format( browsers );

// Italian (or): "Chrome, Firefox o Safari"
new Intl.ListFormat(
  "it-IT",
  { "type": "disjunction" }
).format( browsers );

// Russian (unit): "Chrome Firefox Safari"
new Intl.ListFormat(
  "ru-RU",
  { "type": "unit" }
).format( browsers );

A .formatToParts() method is also available which returns an array of objects containing list name/value pairs.
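One use for .formatToParts() is wrapping each list item in markup while leaving the localized separators alone. The `listToHtml()` helper below is a hypothetical sketch and assumes the items are trusted strings which need no HTML escaping.

```javascript
// Wrap each list element in <strong> but keep the locale's separators as plain text
function listToHtml(items, locale = 'en-US', type = 'conjunction') {
  return new Intl.ListFormat(locale, { type })
    .formatToParts(items)
    .map(part => part.type === 'element'
      ? `<strong>${part.value}</strong>`
      : part.value)
    .join('');
}

listToHtml(['Chrome', 'Firefox', 'Safari']);
// "<strong>Chrome</strong>, <strong>Firefox</strong>, and <strong>Safari</strong>"
```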

Plural localization

The Intl.PluralRules() object helps determine the pluralization of quantities. Initialize the object with:

  • A locale object, string, or empty array for the user's current locale, and
  • An options object which sets a type property:
    • cardinal: a quantity of items (the default), or
    • ordinal: a ranking of items, e.g. 1st, 2nd, or 3rd in English.

Use the plural builder tool...

The tool generates code for the input number and uses the select() method which returns an English string representing the pluralization. This will either be "zero", "one", "two", "few", "many", or "other". Examples:

// US 0 cardinal: "other"
new Intl.PluralRules("en-US", { type: "cardinal" }).select(0);

// US 1 cardinal: "one"
new Intl.PluralRules("en-US", { type: "cardinal" }).select(1);

// US 2 cardinal: "other"
new Intl.PluralRules("en-US", { type: "cardinal" }).select(2);

// US 3 cardinal: "other"
new Intl.PluralRules("en-US", { type: "cardinal" }).select(3);

// US 0 ordinal: "other"
new Intl.PluralRules("en-US", { type: "ordinal" }).select(0);

// US 1 ordinal: "one"
new Intl.PluralRules("en-US", { type: "ordinal" }).select(1);

// US 2 ordinal: "two"
new Intl.PluralRules("en-US", { type: "ordinal" }).select(2);

// US 3 ordinal: "few"
new Intl.PluralRules("en-US", { type: "ordinal" }).select(3);

In English:

  • if a cardinal value returns "other", you would typically add an "s" to the end of the noun
  • if an ordinal value returns "two", you would add "nd" to the number, i.e. "2nd", "22nd", "122nd", etc.

A .resolvedOptions() method is also available which returns an object with properties reflecting the computed formatting options.
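The select() result is most useful as a key into a table of word forms. The sketch below is illustrative only: `catForms`, `countCats()`, and `ordinal()` are made-up names, and the Polish forms are hard-coded where a real app would read them from its translation files.

```javascript
// Word forms keyed by the category that select() returns
const catForms = {
  en: { one: 'cat', other: 'cats' },
  pl: { one: 'kot', few: 'koty', many: 'kotów', other: 'kota' },
};

function countCats(n, lang) {
  const category = new Intl.PluralRules(lang).select(n);
  const forms = catForms[lang];
  return `${n} ${forms[category] || forms.other}`;
}

countCats(1, 'en'); // "1 cat"
countCats(3, 'en'); // "3 cats"
countCats(3, 'pl'); // "3 koty"
countCats(5, 'pl'); // "5 kotów"

// English ordinal suffixes keyed the same way
const suffixes = { one: 'st', two: 'nd', few: 'rd', other: 'th' };
const ordinalRules = new Intl.PluralRules('en-US', { type: 'ordinal' });

function ordinal(n) {
  return `${n}${suffixes[ordinalRules.select(n)]}`;
}

ordinal(1);  // "1st"
ordinal(22); // "22nd"
ordinal(11); // "11th"
```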

Comparison localization

The Intl.Collator() object enables language-sensitive string comparisons. It's typically used when sorting or searching arrays of strings where you want:

  • a, á, ä to equate - or not equate - to the "same" character, or
  • strings containing numbers to sort so that "1" < "2" < "10" (a standard sort() returns the order ["1", "10", "2"])

Initialize the object with:

  1. A locale object, string, or empty array for the user's current locale, and
  2. An options object.

Use the comparison builder tool...

The tool generates code which compares two values using the compare() method. The options object can set the following properties:

property | description
usage | whether comparison is for "sort" (default) or "search"
sensitivity | "base" "accent" "case" "variant" comparisons
collation | variant collation for certain locales
numeric | set true for numeric comparisons
ignorePunctuation | set true to ignore punctuation
caseFirst | either "upper" or "lower" case first

The method returns a number:

  • 0: the two strings are equivalent
  • a negative value (typically -1): the first string is "less" than the second, or
  • a positive value (typically 1): the first string is "more" than the second.

Examples:

// en-US base letter sort: returns 0 (same)
new Intl.Collator(
  "en-US",
  { "sensitivity": "base" }
).compare( "a", "á" );

// en-US standard sort: returns 1 ("2" > "10")
new Intl.Collator(
  "en-US",
  {}
).compare( "2", "10" );

// en-US numeric sort: returns -1 ("2" < "10")
new Intl.Collator(
  "en-US",
  { "numeric": true }
).compare( "2", "10" );

A .resolvedOptions() method is also available which returns an object with properties reflecting the computed formatting options.
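Because compare() has the same shape as a sort callback, you can pass it straight to Array.prototype.sort(). The following sketch shows numeric sorting, locale-specific ordering, and an accent-insensitive search; the sample data is invented.

```javascript
// Numeric-aware sorting
const versions = ['10', '2', '1'];
const plainSort = [...versions].sort();
// ["1", "10", "2"]
const numericSort = [...versions].sort(
  new Intl.Collator('en-US', { numeric: true }).compare
);
// ["1", "2", "10"]

// The same letters sort differently in German and Swedish
const letters = ['z', 'ä', 'a'];
const german = [...letters].sort(new Intl.Collator('de').compare);  // ["a", "ä", "z"]
const swedish = [...letters].sort(new Intl.Collator('sv').compare); // ["a", "z", "ä"]

// Accent- and case-insensitive search
const search = new Intl.Collator('en-US', { usage: 'search', sensitivity: 'base' });
const matches = ['résumé', 'Resume', 'result']
  .filter(word => search.compare(word, 'resume') === 0);
// ["résumé", "Resume"]
```

Creating the Collator once and reusing it is usually cheaper than constructing one inside the sort callback.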

Localizing an existing app

You can use the Intl API server-side if your app is developed in Node.js or Deno. Other languages will have similar libraries although their output strings may differ.

If your application presentation logic resides client-side using a framework such as React or Vue.js, it may be possible to format values returned from calculations or Ajax requests before they are output.

Finally, if your server returns HTML with unformatted or English-based values, you could dynamically update the DOM with localized strings once the page is ready. Consider the following HTML which shows two hard-coded dates in US English format:

<p>Today: <time datetime="2022-05-04">05/04/2022</time></p>

<p>Last login: <time datetime="2021-09-05">09/05/2021</time></p>

We can use the datetime attributes to reformat date text to the user's current locale:

window.addEventListener('DOMContentLoaded', () => {

  const
    time = document.getElementsByTagName('time'),
    // date-only datetime values are parsed as UTC midnight,
    // so format in UTC to avoid showing the previous day
    dt = new Intl.DateTimeFormat([], { "dateStyle": "long", "timeZone": "UTC" });

  Array.from(time)
    .forEach(t => {
      t.textContent = dt.format(new Date(t.getAttribute('datetime')));
    });

});

Spanish users would then see:

Today: 4 de mayo de 2022

Last login: 5 de septiembre de 2021

View example code on CodePen...

Other text is not translated but the dates become more understandable. You can apply similar DOM updates to numbers and other strings identified in the HTML.
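Numbers could be handled in a similar way. The sketch below assumes the server marks each value with a hypothetical data-number attribute holding the raw number, for example `<span data-number="1234.56">1,234.56</span>`; `localizeNumbers()` is an illustrative name.

```javascript
// Rewrite the text of every element that carries a raw value in data-number.
// `elements` can be any iterable of elements, such as a NodeList.
function localizeNumbers(elements, locale = []) {
  const nf = new Intl.NumberFormat(locale);
  for (const el of elements) {
    const raw = el.dataset.number;
    const value = raw ? Number(raw) : NaN;
    // leave the original text alone if the attribute is missing or invalid
    if (!Number.isNaN(value)) el.textContent = nf.format(value);
  }
}

// In the browser, once the DOM is ready:
// localizeNumbers(document.querySelectorAll('[data-number]'));
```

Keeping the raw value in an attribute means the script never has to guess how the visible text was formatted.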

It's possible to recurse the whole DOM and translate any dates, times, numbers, percentages, currencies, etc. you encounter. Be wary: this could incur significant processing time on larger pages, which may become unresponsive while the updates run after loading.

Conclusion

Internationalization is not easy and the JavaScript Intl API won't solve all your localization issues. However, it will help users read and understand the data displayed on your site or app. You'll be one step closer to global domination!
