DEV Community

What do American developers get wrong about internationalization?

Ben Halpern on January 19, 2017

Kirill Myshkin

English

// 1  : thing
// >1 : things

Russian

// 1        : штука
// 2 to 4   : штуки
// 5 to 20  : штук
// 21       : штука
// 22 to 24 : штуки
// 25 to 30 : штук
// 31       : штука
...
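
The table above can be turned into a small selection function. This is a minimal sketch, assuming the usual three-form rule for Russian cardinals; the names `russian_plural_index`, `FORMS`, and `count_things` are made up for illustration and do not come from any library.

```python
def russian_plural_index(n: int) -> int:
    """Return 0, 1, or 2 for the three Russian plural forms."""
    if n % 10 == 1 and n % 100 != 11:
        return 0  # 1, 21, 31, ... (but not 11)
    if 2 <= n % 10 <= 4 and not 12 <= n % 100 <= 14:
        return 1  # 2-4, 22-24, ... (but not 12-14)
    return 2      # 0, 5-20, 25-30, ...


# Hypothetical message table, indexed by the value returned above.
FORMS = ("штука", "штуки", "штук")


def count_things(n: int) -> str:
    return f"{n} {FORMS[russian_plural_index(n)]}"


print(count_things(1))   # 1 штука
print(count_things(3))   # 3 штуки
print(count_things(11))  # 11 штук
print(count_things(21))  # 21 штука
```

In a real project the rule would normally come from the translation tooling rather than being hand-written.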
 
Gabriel PREDA

Romanian

//   1        : tigru    [ tiger ]
//   2 to  19 : tigri    [ tigers ]
//  20 to 100 : de tigri [ tigers ]
// 101 to 119 : tigri    [ tigers ]
// 120 to 200 : de tigri [ tigers ]
...

The formula for PO files we use is:

Plural-Forms: nplurals=3;plural=(n==1?0:(((n%100>19)||((n%100==0)&&(n!=0)))?2:1))\n
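
For readers who do not speak gettext, here is the same `plural=` expression transcribed into Python. It is only a sketch: `romanian_plural_index`, `FORMS`, and `count_tigers` are illustrative names, and the three strings are taken from the table above.

```python
def romanian_plural_index(n: int) -> int:
    """Python transcription of the Plural-Forms expression above."""
    if n == 1:
        return 0
    if n % 100 > 19 or (n % 100 == 0 and n != 0):
        return 2
    return 1


# Index 0, 1, 2 map to the three forms from the table.
FORMS = ("tigru", "tigri", "de tigri")


def count_tigers(n: int) -> str:
    return f"{n} {FORMS[romanian_plural_index(n)]}"


print(count_tigers(1))    # 1 tigru
print(count_tigers(19))   # 19 tigri
print(count_tigers(20))   # 20 de tigri
print(count_tigers(101))  # 101 tigri
print(count_tigers(120))  # 120 de tigri
```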
 
David Stancu

What about "barosane"?

Ahmed Mahmoud

Arabic

// 1  شيئ
// 2 شيئان
// 3 أشياء

Not to mention that Arabic is a gender-sensitive language.

Renae Jones

I work in web development and my encounters really reflect that.

  1. Assume everything is a translation of English. Create situations where even if you don't need the application in English, you're supposed to enter the English strings, then translate.

  2. Can't cope with an administrative user using a different language for UI than the content they're entering.

  3. Assume one translation per language is enough. This causes problems when multiple countries use the same official language but have small dialect differences, different currencies, etc., e.g. most of South America and everywhere once colonized by Brits.

  4. Assume language dictates location, e.g. only showing locations in Germany to someone with a German system language.

  5. Don't change fonts or font sizes/styling between languages. Reading Chinese or Japanese with styles created for English, it's too small, too close together, and fonts like Arial mostly support the characters but look bad.

  6. Forget to mock up designs in multiple languages. Tabs, navigation that can't wrap, and narrow sidebar menus are usually doomed. Chinese and German words can be very, very long compared to English. French sometimes uses long phrases for simple concepts: a UI link might be one word in English but something like 6 words after being translated to French.

  7. Use flags for anything. Some flags are a political landmine in their country.

  8. Hard-restrict people to the language configured as system language or browser language. This should be a suggestion (i.e. that's the default, but you can change it). If not, it becomes a nightmare to test and also leads to frustrating edge cases for certain people, e.g. someone who prefers German but needs to get a link to the English content for their colleague. Or someone who speaks English trying to use Google Translate to get shop hours while traveling in Japan.
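
Point 8 boils down to an order of precedence: what the user picked by hand beats what the browser suggests, and the browser suggestion is only a default. A minimal sketch of that order, with all names (`resolve_ui_language`, `explicit_choice`, `browser_languages`, `supported`) invented for illustration:

```python
from typing import Optional, Sequence


def resolve_ui_language(
    explicit_choice: Optional[str],
    browser_languages: Sequence[str],
    supported: Sequence[str],
    default: str = "en",
) -> str:
    """Pick a UI language; the user's own choice always wins."""
    # 1. A language the user selected by hand (saved setting, URL, cookie).
    if explicit_choice in supported:
        return explicit_choice
    # 2. Otherwise treat the browser's ordered preferences as a suggestion.
    for tag in browser_languages:
        if tag in supported:
            return tag
        base = tag.split("-")[0]  # "de-CH" -> "de"
        if base in supported:
            return base
    # 3. Nothing matched: fall back to the site default.
    return default


supported = ["en", "de", "ja"]
print(resolve_ui_language("en", ["de-DE"], supported))       # en
print(resolve_ui_language(None, ["de-CH", "fr"], supported)) # de
print(resolve_ui_language(None, ["sv"], supported))          # en
```

Note that location is deliberately not an input here; it can inform shipping or store hours without deciding the language.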

Artem Sapegin

Opposite of 4 is also true: assuming that I speak German only because my IP is German is pretty naive.

Just two examples of why this is wrong:

  1. I live in Germany but don’t speak German and don’t even have it in the Accept-Language list in the browser settings. I keep seeing German versions of sites.

  2. We have a weird network configuration in the office and many sites think I’m in the Netherlands.

Dmitrii 'Mamut' Dimandt

Waze is notorious in this regard. Travel from Sweden to Germany by plane. Turn your phone on: ooooh, let's download German voices while you are roaming. Also, let's not let you switch back to Swedish/English/whatever your previous language was.

Sohayb Hassoun

Same thing is happening with me in Japan when I use Amazon Prime Video app, the region is Japan but the language is English, yet the app is in Japanese and there's no way I can switch languages! (I know 0 Japanese)

Dave Wallace

Yep, or even just working at a distributed site or on vacation, making use of local networks.

Ben Halpern

Wow, number 4 is shocking. That just seems so fundamentally misguided as a way to go about developing for real users of the world.

Renae Jones

It sounds pretty dumb when I generalize it that way, but when I've actually run into it, it made perfect sense how the developers got where they did.

I think it's the technical particulars that lead to this scenario. On the web, people don't like to choose their language or country manually. Also on the web, multiple different applications are often strung together through APIs, etc. But the only two pieces of information available to every one of them are browser language and geoIP/location. A lot of problems come from one app using browser language and another app using location. And even when you make a bad choice about which to use, it still works for 98%+ of users, so it can be a while until you get a complaint.

Ben Halpern

Yeah, I figured there was more to it. There are, of course, always tradeoffs and existing constraints. Really helpful list, either way. 👌

Macpeters

Geofencing makes it almost impossible to access foreign language versions of any of the major sites. I can't get German music in Canada, because I'm directed to the Canadian site, which doesn't sell German music. Same goes for Amazon, Netflix, and a whole list of others.

Artem Sapegin

They may just have no rights to sell German music in Canada. Of course this should be independent from the UI language.

Tamir Bahar

I'll try and give some actual examples I encountered, without naming names.

  1. i18n is applied almost everywhere. So you can use non-Latin letters in content, but not in search or tags.
  2. To support RTL, the devs decided to flip the entire UI. Text included. This makes no sense whatsoever.
  3. Images, or image editors, are flipped to support RTL. Resulting in unusable displays.
  4. System language is used for some parts of the software, while a user-configured language is used for the rest. So RTL layout is used with an LTR language, or vice versa.
  5. Semantic similarity between languages is assumed. So duplicate UI strings are only translated once. This causes really silly UI with words such as "set" and "read", when different tenses are mixed up.
  6. Date-time formats are abused. Both DD/MM/YY vs MM/DD/YY, and 24 vs. 12 hour clocks. Often you can even set those settings, but only for some displays. Being unable to tell which times are shown in different places in the UI is absolutely terrible.
  7. Fonts. If you support a language, support its display as well. And support it all over your program. Also, if your text inputs allow for more than one language, make sure you have a font for every language enabled at the same time. MS Office does a great job at it.
  8. Keyboard shortcuts. It is not uncommon that I have to switch back from Hebrew to English to type a keyboard shortcut. This is especially annoying for Undo. And really, all I care about is key locations. I don't want Ctrl+Z and Ctrl+ז to act differently.

There is one more, but I admit it is more difficult, as there is no obvious solution:

  1. BiDi text. I often mix English and Hebrew. No code editor handles that well.
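
The date ambiguity from point 6 of the first list is easy to reproduce. This is a small sketch using only the standard library; the sample string `03/04/17` is made up for the demonstration.

```python
from datetime import datetime

raw = "03/04/17"  # hypothetical value as it might appear in a UI

# The same characters give two different dates depending on the convention.
as_day_first = datetime.strptime(raw, "%d/%m/%y")    # DD/MM/YY
as_month_first = datetime.strptime(raw, "%m/%d/%y")  # MM/DD/YY

# ISO 8601 output has only one reading, which makes it a safe exchange format.
print(as_day_first.date().isoformat())    # 2017-04-03
print(as_month_first.date().isoformat())  # 2017-03-04
```

Whatever format the user chooses for display, applying it in every view is what removes the guessing.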
 
Ben Halpern

Guilty of a lot of this, but really excited to improve.

Aurélien Hervé

They require a "state" field in their forms.

Oliver Bock

Oh. How I hate this. (I'm from Germany and here no one ever asks this)

Olivier “Ölbaum” Scherler

It’s not limited to American developers, but:

  1. Force (not select the default, force) the language based on the country of the IP address (e.g. eBay). Which means that the 20% of French-speaking citizens of Switzerland are served a page in German, and people on holiday in a country whose language they don’t speak can’t use the site;

  2. Limit the available languages too much based on the country it serves. Until recently, I couldn’t have Amazon.de in anything other than German, even though it’s the preferred country for Swiss users (there’s no Amazon.ch). English would have been useful;

  3. Set 42 cookies to maximise turning visitors into products, but cannot be bothered to remember the language they selected between visits;

  4. Assign the label “special character” to anything that is not A-Z, preventing you from typing in your name or address.

Christian Arnold

Oh, the cookie thing bothers me, too! That's why I wrote a Tampermonkey script to go to the English version of Microsoft KB/Technet/MSDN sites.

Who thinks that an automated translation of a technical document helps?

vovanz

Force (not select the default, force) the language based on the country of the IP address (e.g. eBay).

This is so stupid. Accept-Language header was invented for a reason.
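
For anyone who has not looked at the header itself: it is an ordered, weighted list of language tags. Here is a rough sketch of reading it, assuming a plain `tag;q=value` syntax; `parse_accept_language` is a made-up helper, and a real server framework may already offer this.

```python
from typing import List


def parse_accept_language(header: str) -> List[str]:
    """Return the language tags from an Accept-Language value, best first."""
    weighted = []
    for position, item in enumerate(header.split(",")):
        tag, _, params = item.strip().partition(";")
        tag = tag.strip()
        if not tag:
            continue
        quality = 1.0  # a tag without q= counts as q=1
        params = params.strip()
        if params.startswith("q="):
            try:
                quality = float(params[2:])
            except ValueError:
                quality = 0.0  # unreadable weight: treat as not acceptable
        if quality > 0:
            # Sort by descending quality, then by original position.
            weighted.append((-quality, position, tag))
    return [tag for _, _, tag in sorted(weighted)]


print(parse_accept_language("en-GB,en;q=0.8,de;q=0.5"))
# ['en-GB', 'en', 'de']
print(parse_accept_language("de;q=0.5, ru, en;q=0.9"))
# ['ru', 'en', 'de']
```

The resulting list is a preference order to match against the languages a site offers, not a command.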

Ivan Čurić

1) Translating strings without context. Google is notorious for this. For instance, in the Android Google app, "Search Language" is translated as "Language Searching" in Croatian.

2) Address fields: No, I don't have states.

3) Time/Date formatting: AM/PM + MM / DD / YYYY. Assuming that my week starts on Sunday.

4) Keyboard shortcuts assume you're using a US keyboard. It's not just the Y character position that's the issue, but other characters such as [ ] < > : ; _ and the like, which are on another layer.

5) Forcing languages based on IP address. No, I don't want to read the gimped, half-assed local version of the site. Doing redirects is even worse.

6) Pluralization is dead easy in English, however it's often much more complex in other languages. Same goes for cases, which don't really exist in English.

Ignasi Marimon-Clos

Assume (name) (middle name) (surname) format. Some countries use (name) (father surname) (mother surname).

Assume all women will discard their surname if/when they marry.

Both combined make the question "What was your mother's maiden name?" useless. (Actually, it's the same as it is now, and it is exactly my second surname, which is public record.)

Kit Sunde

Some countries put the last name first (super common in Asia).
A lot of people in Asia don't have last names.
If my country is Singapore, why do I have to select a region and a city? These are all the same thing.

Enno Rehling (恩諾)

Encodings, encodings, encodings. I'm considering "Explain Unicode" as an interview question.

Ruotger Deecke
  • Imperial measurement units
  • MM-DD-YYYY
  • Fahrenheit
  • AM/PM times

Icon design assumptions:

  • Dollar sign for symbolising money
  • American tin can tunnel style mailboxes
  • yellow note pad paper
 
John Daniel

I think most of these comments are from really technical people doing really good international work and are annoyed when they see other good people make some minor errors. Most of these complaints seem based on experiences from the best-of-the-best with regards to international support. My list is more basic. This is what all developers, not just Americans, should do:

1) Use Unicode. It is 2017, please stop using Windows code pages. Please use software that at least supports Unicode. Americans aren't the worst culprits here.
2) This one is for the Americans - don't assume all countries have US-style addresses. Just changing "state" to "state/province" and "zip code" to "postal code" is not adequate.
3) For you GIS people, don't assume all countries have a strict US-style administrative boundary hierarchy. Even the US doesn't have the administrative boundaries you think it does.
4) Again for you GIS people, please review point 1 above.
5) For you "data scientists", please review point 1 above. Don't just change your original data to make it ASCII. And if you do, don't give the result to the GIS people to geocode.
6) For other "researchers", please don't refer to these characters as "junk" or "garbage data" in meetings or presentations. You are offending both non-English speakers and people who speak both English and Unicode.
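
Points 1 and 5 can be seen in a few lines. This is a sketch with a made-up sample value (`Zürich`, written with an escape so the code point is unambiguous); it shows what forcing data into ASCII throws away and why UTF-8 does not have that problem.

```python
import unicodedata

place = "Z\u00fcrich"  # "Zürich", with ü as the single code point U+00FC

# Forcing the text into ASCII silently destroys information.
ascii_only = place.encode("ascii", errors="ignore").decode("ascii")
print(ascii_only)  # Zrich

# UTF-8 round-trips every character unchanged.
round_tripped = place.encode("utf-8").decode("utf-8")
print(round_tripped == place)  # True

# The same visible text may also arrive decomposed (u + combining diaeresis).
# Normalizing before comparing or geocoding avoids false mismatches.
decomposed = unicodedata.normalize("NFD", place)
print(decomposed == place)                                # False
print(unicodedata.normalize("NFC", decomposed) == place)  # True
```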

Phil Thomas

Interesting comments thus far, they certainly shine light onto something that appears important. Internationalization has been on the back-burner for me and it is clear that this is something that requires a decent level of investment to implement and execute correctly.

Throughout several years as a developer, I've worked for companies that employ workers that are fluent in the US English language, so supporting internationalization never had any "value" to the business. Any legacy projects that I have worked on that were used by a non-English speaking team were poorly internationalized. It would have been cheaper to start from scratch than attempt to add multiple language support in correctly.

Great question!

Stuart j Stuple

I believe the codes used to report "language" add to this. BCP 47 should be language and dialect but often gets interpreted as language and locale. The word "region" is so ambiguous in that context. And some mobile devices deliberately push the locale into the dialect slot to allow a single look-up value.
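
To make the "slots" concrete: a BCP 47 tag is a sequence of subtags, and the one that usually causes the confusion is the optional region subtag. Below is a rough sketch that handles only the simple `language[-Script][-REGION]` shape; `split_language_tag` is a hypothetical helper, and variants, extensions, and private-use subtags are deliberately ignored.

```python
from typing import Dict


def split_language_tag(tag: str) -> Dict[str, str]:
    """Split a simple BCP 47 tag into language, script, and region subtags."""
    subtags = tag.split("-")
    parts = {"language": subtags[0].lower()}
    for subtag in subtags[1:]:
        is_script = len(subtag) == 4 and subtag.isalpha()
        is_region = (len(subtag) == 2 and subtag.isalpha()) or (
            len(subtag) == 3 and subtag.isdigit()
        )
        if is_script and "script" not in parts and "region" not in parts:
            parts["script"] = subtag.title()
        elif is_region and "region" not in parts:
            parts["region"] = subtag.upper()
        else:
            break  # anything else is out of scope for this sketch
    return parts


print(split_language_tag("de-CH"))
# {'language': 'de', 'region': 'CH'}
print(split_language_tag("zh-Hant-TW"))
# {'language': 'zh', 'script': 'Hant', 'region': 'TW'}
print(split_language_tag("es-419"))
# {'language': 'es', 'region': '419'}
```

Keeping the pieces separate makes it harder to treat "where the device is" and "which variety of the language" as the same value by accident.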

aschei

MSG_NOMSG=You have no emails.
MSG_ONEMAIL=You have one mail.
MSG_MAILS=You have {number} mails.

Read en.wikipedia.org/wiki/Grammatical_.... There is more than singular and plural in other languages.

mvnrepository.com/artifact/com.ibm... has a solution for that.

Dragan Atanasov
  1. Assume that everyone is using the imperial system.
  2. Assume that everyone is in the same timezone - this applies to all devs.
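
For the timezone point, the usual remedy is to store one unambiguous instant and convert only when displaying. A minimal sketch, assuming fixed UTC offsets to stay dependency-free (real code would use named zones so daylight saving time is handled); `show`, `utc_minus_5`, and `utc_plus_1` are illustrative names.

```python
from datetime import datetime, timedelta, timezone

# Store the instant once, in UTC.
meeting_utc = datetime(2017, 1, 19, 15, 0, tzinfo=timezone.utc)

# Assumption: fixed offsets stand in for the viewers' real time zones.
utc_minus_5 = timezone(timedelta(hours=-5))
utc_plus_1 = timezone(timedelta(hours=1))


def show(instant: datetime, zone: timezone) -> str:
    """Render the same instant as local wall-clock time for one viewer."""
    return instant.astimezone(zone).isoformat()


print(show(meeting_utc, utc_minus_5))  # 2017-01-19T10:00:00-05:00
print(show(meeting_utc, utc_plus_1))   # 2017-01-19T16:00:00+01:00
```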
 
naor volkovich

They should either support RTL layout from day 1 or not support it at all! When you use a program and then an update comes out that supports RTL layout, it's just the worst! Everything you used to click on the right is now on the left and vice versa! We prefer to keep the LTR layout rather than get used to a new one!

Ghost

1.) They forget that non-QWERTY keyboard layouts exist

2.) RTL-languages usually break the UI completely or at least look very out of place

3.) They forget that some cultures don't use , for thousands or . for decimals.
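
Point 3 in a few lines: the separators are data, not constants. This is only a sketch with a hypothetical helper `format_number`; it assumes groups of three digits, which is itself not universal (Indian-style grouping, for example, is different), so real applications are better served by a locale-aware library.

```python
def format_number(value: float, thousands: str, decimal: str, places: int = 2) -> str:
    """Format a number with caller-supplied thousands and decimal separators."""
    us_style = f"{value:,.{places}f}"  # e.g. "1,234,567.89"
    # Swap via a placeholder so the two separators cannot collide.
    return (
        us_style.replace(",", "\x00")
        .replace(".", decimal)
        .replace("\x00", thousands)
    )


amount = 1234567.891
print(format_number(amount, ",", "."))  # 1,234,567.89
print(format_number(amount, ".", ","))  # 1.234.567,89
print(format_number(amount, " ", ","))  # 1 234 567,89
print(format_number(amount, "'", "."))  # 1'234'567.89
```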

Michael Scott Shappe

We don't do it at all!

Benjamin Köppchen

Assuming adding Internationalization is mostly a technical problem.

"[Django|Rails|Whatever] templates support i18n, so we're basically done"

Many of the problems in this thread (ignorance of LTR/RTL, address forms, weird validation rules) have their origin in this frame of thought, imho.

And this is not limited to Americans. For all intents and purposes, German seems to be English with longer words and weird decimals. Even our umlauts and the ß are in Latin-1. We even have federal states to put in the address fields, if they're not select boxes ;)

Patrick Tingen

They forget that in some languages most words are longer than in English

Bo Bendtsen

Assume first and last names are at least 3 letters long when validating form inputs.
-Bo
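
A validator that avoids this trap can be very small. The sketch below is one possible rule, not a standard: it only asks for at least one letter in any script and sets no minimum length and no A-Z restriction. `is_plausible_name` is a made-up helper name.

```python
def is_plausible_name(name: str) -> bool:
    """Accept any non-empty name containing at least one letter of any script."""
    stripped = name.strip()
    # str.isalpha() is Unicode-aware, so č, ø, and CJK characters all count.
    return any(ch.isalpha() for ch in stripped)


print(is_plausible_name("Bo"))     # True
print(is_plausible_name("Čurić"))  # True
print(is_plausible_name("恩諾"))    # True
print(is_plausible_name("   "))    # False
```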

Mike Simons

They assume all languages pluralize like English.
unicode.org/cldr/charts/29/supplem...

Bryan Baldwin

They forget to do it, until they try to take it into other countries.

Sohayb Hassoun

Well, platform vendors make it hard sometimes. It's not that straightforward to change the app locale in-app only (talking about Android, iOS).

Jon Stødle

I don't know about Android, but it's not that hard to override on iOS. You literally write a valid iOS language code to a dictionary.

Sohayb Hassoun

I deleted my comment by mistake! I saw it duplicated, when I deleted the second one, the first one with your reply was deleted as well :/
Anyway, I know it's provided on both platforms, but, IMHO, it's not straightforward.