 

English

// 1  : thing
// >1 : things

Russian

// 1        : штука
// 2 to 4   : штуки
// 5 to 20  : штук
// 21       : штука
// 22 to 24 : штуки
// 25 to 30 : штук
// 31       : штука
...
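A minimal sketch of the Russian table above as code, assuming the usual rule based on the last one and two digits of the count. The function name `russian_form` is made up for illustration.

```python
def russian_form(n: int) -> str:
    """Pick the form of 'штука' that agrees with the count n."""
    last_two = n % 100
    last = n % 10
    # 1, 21, 31, ... but not 11
    if last == 1 and last_two != 11:
        return "штука"
    # 2-4, 22-24, ... but not 12-14
    if 2 <= last <= 4 and not 12 <= last_two <= 14:
        return "штуки"
    # everything else: 0, 5-20, 25-30, ...
    return "штук"


for n in (1, 3, 5, 11, 21, 22, 25, 31):
    print(n, russian_form(n))
```

A plain `if n == 1 ... else ...` check, which is all English needs, cannot express this.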
 

Romanian

//   1        : tigru    [ tiger ]
//   2 to  19 : tigri    [ tigers ]
//  20 to 100 : de tigri [ tigers ]
// 101 to 119 : tigri    [ tigers ]
// 120 to 200 : de tigri [ tigers ]
...

The formula for PO files we use is:

Plural-Forms: nplurals=3;plural=(n==1?0:(((n%100>19)||((n%100==0)&&(n!=0)))?2:1))\n
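To check the formula against the table, here is a direct Python transcription of the `plural=` expression. The names `romanian_plural_index`, `FORMS` and `describe` are hypothetical; only the arithmetic comes from the header above.

```python
# Index 0, 1, 2 as selected by the Plural-Forms expression.
FORMS = ("tigru", "tigri", "de tigri")


def romanian_plural_index(n: int) -> int:
    """Transcription of: n==1 ? 0 : ((n%100>19 || (n%100==0 && n!=0)) ? 2 : 1)"""
    if n == 1:
        return 0
    if n % 100 > 19 or (n % 100 == 0 and n != 0):
        return 2
    return 1


def describe(n: int) -> str:
    return f"{n} {FORMS[romanian_plural_index(n)]}"


for n in (1, 2, 19, 20, 100, 101, 119, 120, 200):
    print(describe(n))
```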
 
 

Arabic

// 1  شيئ
// 2 شيئان
// 3 أشياء

Not to mention that Arabic is a gender-sensitive language.

 

I work in web development and my encounters really reflect that.

  1. Assume everything is a translation of English. Create situations where even if you don't need the application in English, you're supposed to enter the English strings, then translate.

  2. Can't cope with an administrative user using a different language for UI than the content they're entering.

  3. Assume one translation per language is enough. This causes problems when multiple countries use the same official language but have tiny dialect differences, different currencies, etc., e.g. most of South America and everywhere once colonized by Brits.

  4. Assume language dictates location, e.g. only showing locations in Germany to someone with a German system language.

  5. Don't change fonts or font sizes/styling between languages. When you read Chinese or Japanese with styles created for English, the text is too small and too close together, and fonts like Arial mostly support the characters but look bad.

  6. Forget to mock up designs in multiple languages. Tabs, navigation that can't wrap, and narrow sidebar menus are usually doomed. Chinese and German words can be very, very long compared to English. French sometimes uses long phrases for simple concepts--a UI link might be one word in English but something like 6 words once translated to French.

  7. Use flags for anything. Some flags are a political landmine in their country.

  8. Hard-restrict people to the language configured as system language or browser language. This should be a suggestion (i.e. that's the default but you can change it). If not, it becomes a nightmare to test and also leads to frustrating edge cases for certain people, e.g. someone who prefers German but needs to get a link to the English content for their colleague, or someone who speaks English trying to use Google Translate to get shop hours while traveling in Japan.
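Point 8 boils down to an order of precedence: an explicit choice by the user wins, the browser or system language is only a default, and there is a fallback when neither is supported. A rough sketch, where `pick_language` and its parameters are invented for illustration and not taken from any framework:

```python
from typing import Optional, Sequence


def pick_language(user_choice: Optional[str],
                  browser_languages: Sequence[str],
                  supported: Sequence[str],
                  default: str = "en") -> str:
    """Return the UI language, treating browser settings as a suggestion only."""
    # 1. Whatever the user explicitly picked (e.g. stored in their profile) wins.
    if user_choice in supported:
        return user_choice
    # 2. Otherwise take the first browser/system language we can serve.
    for lang in browser_languages:
        if lang in supported:
            return lang
    # 3. Otherwise fall back to the site default.
    return default


supported = ["en", "de", "ja"]
print(pick_language("en", ["de"], supported))   # explicit choice overrides the browser
print(pick_language(None, ["de"], supported))   # no choice yet: use the suggestion
print(pick_language(None, ["sv"], supported))   # nothing matches: default
```

Note that location does not appear anywhere in the function. It can drive things like currency or shop listings, but that is a separate decision.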

 

The opposite of 4 is also true: assuming that I speak German only because my IP is German is pretty naive.

Just two examples of why this is wrong:

  1. I live in Germany but don’t speak German and don’t even have it in the Accept-Language list in my browser settings. I keep seeing German versions of sites.

  2. We have a weird network configuration in the office and many sites think I’m in the Netherlands.

 

The same thing happens to me in Japan with the Amazon Prime Video app: the region is Japan but my language is English, yet the app is in Japanese and there's no way to switch languages! (I know zero Japanese.)

 

Waze is notorious in this regard. Travel from Sweden to Germany by plane. Turn your phone on: ooooh, let's download German voices while you are roaming. Also, let's not let you switch back to Swedish/English/whatever your previous language was.

 

Yep, or even just working at a distributed site or on vacation, making use of local networks.

 

Wow, number 4 is shocking. That just seems so fundamentally misguided as a way to go about developing for real users of the world.

 

Geofencing makes it almost impossible to access foreign-language versions of any of the major sites. I can't get German music in Canada, because I'm directed to the Canadian site, which doesn't sell German music. The same goes for Amazon, Netflix, and a whole list of others.

They may just have no rights to sell German music in Canada. Of course this should be independent from the UI language.

 

It sounds pretty dumb when I generalize it that way, but when I've actually run into it, it made perfect sense how the developers got where they did.

I think it's technical particulars that lead to this scenario. On the web, people don't like to choose their language or country manually. Also on the web, there will be multiple different applications strung together through APIs, etc. But the only two pieces of information available to everyone are browser language and geoIP/location. A lot of problems come from one app using browser language and another app using location. And even when you make a bad choice about which to use, it still works for 98%+ of users, so it can be a while until you get a complaint.

Yeah, I figured there was more to it. There are, of course, always tradeoffs and existing constraints. Really helpful list, either way. 👌

 

I'll try and give some actual examples I encountered, without naming names.

  1. i18n is applied almost everywhere. So you can use non-Latin letters in content, but not in search or tags.
  2. To support RTL, the devs decided to flip the entire UI. Text included. This makes no sense whatsoever.
  3. Images, or image editors, are flipped to support RTL. Resulting in unusable displays.
  4. System language is used for some parts of the software, while a user-configured language is used for the rest. So RTL layout is used with an LTR language, or vice versa.
  5. Semantic similarity between languages is assumed. So duplicate UI strings are only translated once. This causes really silly UI with words such as "set" and "read", when different tenses are mixed up.
  6. Date-time formats are abused. Both DD/MM/YY vs MM/DD/YY, and 24 vs. 12 hour clocks. Often you can even set those settings, but only for some displays. Being unable to tell which times are shown in different places in the UI is absolutely terrible.
  7. Fonts. If you support a language, support its display as well. And support it all over your program. Also, if your text inputs allow for more than one language, make sure you have a font for every language enabled at the same time. MS Office does a great job at it.
  8. Keyboard shortcuts. It is not uncommon that I have to switch back from Hebrew to English to type a keyboard shortcut. This is especially annoying for Undo. And really, all I care about is key locations. I don't want Ctrl+Z and Ctrl+ז to act differently.
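On item 6: the trouble with DD/MM/YY vs MM/DD/YY is that the same string is valid in both and means different days. A small sketch with the standard library; the sample string is made up:

```python
from datetime import datetime

raw = "03/04/21"

# The same text, read with the two conventions from item 6.
day_first = datetime.strptime(raw, "%d/%m/%y")    # 3 April 2021
month_first = datetime.strptime(raw, "%m/%d/%y")  # 4 March 2021

print(day_first.date().isoformat())    # 2021-04-03
print(month_first.date().isoformat())  # 2021-03-04

# Both parses succeed, so nothing warns you when the wrong one is picked.
print(day_first == month_first)        # False
```

Storing and exchanging dates in an unambiguous form such as ISO 8601 and only formatting them at display time, with one setting applied everywhere in the UI, avoids guessing which convention a given screen uses.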

There is one more, but I admit it is more difficult as there is no obvious solution

  1. BiDi text. I often mix English and Hebrew. No code editor handles that well.
 

Guilty of a lot of this, but really excited to improve.

 
 

Oh. How I hate this. (I'm from Germany and here no one ever asks this)

 

It’s not limited to American developers, but:

  1. Force (not select the default, force) the language based on the country of the IP address (e.g. eBay). Which means that the 20% of French-speaking citizens of Switzerland are served a page in German, and people on holiday in a country whose language they don’t speak can’t use the site;

  2. Limit the available languages too much based on the country it serves. Until recently, I couldn’t have Amazon.de in anything other than German, even though it’s the preferred country for Swiss users (there’s no Amazon.ch). English would have been useful;

  3. Set 42 cookies to maximise turning visitors into products, but cannot be bothered to remember the language they selected between visits;

  4. Assign the label “special character” to anything that is not A-Z, preventing you from typing in your name or address.

 

Oh, the cookie thing bothers me, too! That's why I wrote a Tampermonkey script to go to the English version of Microsoft KB/Technet/MSDN sites.

Who thinks that an automated translation of a technical document helps?

 

"Force (not select the default, force) the language based on the country of the IP address (e.g. eBay)."

This is so stupid. Accept-Language header was invented for a reason.
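For anyone who has not looked at the header: it is a comma-separated list of language tags, each with an optional `q=` weight. A simplified sketch of reading it in order of preference; `parse_accept_language` is a hypothetical helper, and a real application would normally rely on its web framework for this.

```python
from typing import List


def parse_accept_language(header: str) -> List[str]:
    """Return the language tags from an Accept-Language header, best first."""
    weighted = []
    for position, item in enumerate(header.split(",")):
        tag, _, params = item.strip().partition(";")
        tag = tag.strip()
        if not tag:
            continue
        quality = 1.0  # no q parameter means full preference
        params = params.strip()
        if params.startswith("q="):
            try:
                quality = float(params[2:])
            except ValueError:
                quality = 0.0  # malformed weight: treat as not acceptable
        if quality > 0:
            # Sort by weight (descending), then by original position.
            weighted.append((-quality, position, tag))
    return [tag for _, _, tag in sorted(weighted)]


print(parse_accept_language("en-GB,en;q=0.8,de;q=0.5"))
print(parse_accept_language("de;q=0.5, fr;q=0.9, *;q=0"))
```

The result is still only a ranked suggestion: it tells you what the browser is configured for, not what the person in front of it wants right now.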

 

1) Translating strings without context. Google is notorious for this. For instance, in the Android Google app, "Search Language" is translated as "Language Searching" in Croatian.

2) Address fields: No, I don't have states.

3) Time/Date formatting: AM/PM + MM / DD / YYYY. Assuming that my week starts on Sunday.

4) Keyboard shortcuts assume you're using a US keyboard. It's not just the Y character position that's the issue, but other characters such as [ ] < > : ; _ and the like, which are on another layer.

5) Forcing languages based on IP address. No, I don't want to read the gimped, half-assed local version of the site. Doing redirects is even worse.

6) Pluralization is dead easy in English, but it's often much more complex in other languages. The same goes for cases, which don't really exist in English.

 

Assume (name) (middle name) (surname) format. Some countries use (name) (father surname) (mother surname).

Assume all women will discard their surname if/when they marry.

Both combined make the question "What was your mother's maiden name?" useless. (Actually it's the same as now, and it is exactly my second surname, which is public record.)

 

Some countries put the last name first (super common in Asia).
A lot of people in Asia don't have last names.
If my country is Singapore, why do I have to select a region and a city? These are all the same thing.

 

Encodings, encodings, encodings. I'm considering "Explain Unicode" as an interview question.
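A tiny demonstration of the two things I would hope a candidate mentions: text is not bytes until you pick an encoding, and the same visible character can be stored as different code point sequences. The sample strings are arbitrary.

```python
import unicodedata

text = "Müller"
utf8_bytes = text.encode("utf-8")

print(len(text), len(utf8_bytes))      # 6 characters, 7 bytes

# Decoding with the wrong encoding does not fail, it silently produces mojibake.
print(utf8_bytes.decode("latin-1"))    # MÃ¼ller
print(utf8_bytes.decode("utf-8"))      # Müller

# The same visible "é" can be one code point or two.
composed = "\u00e9"        # é as a single code point
decomposed = "e\u0301"     # e followed by a combining acute accent
print(composed == decomposed)                                # False
print(unicodedata.normalize("NFC", decomposed) == composed)  # True
```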

 

MSG_NOMSG=You have no emails.
MSG_ONEMAIL=You have one mail.
MSG_MAILS=You have {number} mails.

Read en.wikipedia.org/wiki/Grammatical_.... There is more than singular and plural in other languages.

mvnrepository.com/artifact/com.ibm... has a solution for that.

 

I believe the codes used to report "language" add to this. BCP 47 should be language and dialect but often gets interpreted as language and locale. The word "region" is so ambiguous in that context. And some mobile devices deliberately push the locale into the dialect slot to allow a single look-up value.
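To make the slots concrete, here is a deliberately simplified split of a tag into its language, script and region subtags. It ignores extended language subtags, variants, extensions and private use, so treat `parse_tag` and `LanguageTag` as an illustrative sketch and not a conforming BCP 47 parser.

```python
from typing import NamedTuple, Optional


class LanguageTag(NamedTuple):
    language: str
    script: Optional[str]
    region: Optional[str]


def parse_tag(tag: str) -> LanguageTag:
    """Split e.g. 'zh-Hant-TW' into language, script and region subtags."""
    # Some platforms report 'en_US' with an underscore; accept both.
    parts = tag.replace("_", "-").split("-")
    language = parts[0].lower()
    script = region = None
    for part in parts[1:]:
        # Script subtags are four letters and come before the region.
        if len(part) == 4 and part.isalpha() and script is None and region is None:
            script = part.title()
        # Region subtags are two letters or three digits.
        elif region is None and (
            (len(part) == 2 and part.isalpha()) or (len(part) == 3 and part.isdigit())
        ):
            region = part.upper()
    return LanguageTag(language, script, region)


print(parse_tag("zh-Hant-TW"))
print(parse_tag("es-419"))
print(parse_tag("en_us"))
```

The region subtag in `en-US` qualifies the variety of the language. It says nothing reliable about where the user is, which currency they pay in, or which timezone they are in.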

 

Interesting comments thus far, they certainly shine light onto something that appears important. Internationalization has been on the back-burner for me and it is clear that this is something that requires a decent level of investment to implement and execute correctly.

Throughout several years as a developer, I've worked for companies that employ workers who are fluent in US English, so supporting internationalization never had any "value" to the business. Any legacy projects that I have worked on that were used by a non-English-speaking team were poorly internationalized. It would have been cheaper to start from scratch than to attempt to add multi-language support correctly.

Great question!

 
  • Imperial measurement units
  • MM-DD-YYYY
  • Fahrenheit
  • AM/PM times

Icon design assumptions:

  • Dollar sign for symbolising money
  • American tin can tunnel style mailboxes
  • yellow note pad paper
 

I think most of these comments are from really technical people doing really good international work who are annoyed when they see other good people make some minor errors. Most of these complaints seem based on experiences from the best of the best with regard to international support. My list is more basic. This is what all developers, not just Americans, should do:

1) Use Unicode. It is 2017, please stop using Windows code pages. Please use software that at least supports Unicode. Americans aren't the worst culprits here.
2) This one is for the Americans - don't assume all countries have US-style addresses. Just changing "state" to "state/province" and "zip code" to "postal code" is not adequate.
3) For you GIS people, don't assume all countries have a strict US-style administrative boundary hierarchy. Even the US doesn't have the administrative boundaries you think it does.
4) Again for you GIS people, please review point 1 above.
5) For you "data scientists", please review point 1 above. Don't just change your original data to make it ASCII. And if you do, don't give the result to the GIS people to geocode.
6) For other "researchers", please don't refer to these characters as "junk" or "garbage data" in meetings or presentations. You are offending both non-English speakers and people who speak both English and Unicode.
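To illustrate point 5, here is the kind of "make it ASCII" cleanup that tends to show up in data pipelines, and what it does to place names. `naive_ascii` is a hypothetical helper written to show the damage, not something to copy.

```python
import unicodedata


def naive_ascii(text: str) -> str:
    """Decompose accents, then drop every character that is not ASCII."""
    decomposed = unicodedata.normalize("NFKD", text)
    return decomposed.encode("ascii", "ignore").decode("ascii")


for place in ("Málaga", "Łódź", "Straße", "Москва"):
    print(repr(place), "->", repr(naive_ascii(place)))
```

Only the first name survives in a recognizable form. "Łódź" becomes "odz", "Straße" becomes "Strae", and "Москва" becomes an empty string, so there is nothing left for the GIS people to geocode.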

 

They should either support RTL layout from day 1 or not support it at all! When you use a program and then an update comes out that supports RTL layout, it's just the worst! Everything you used to click on the right is now on the left and vice versa! We'd prefer to keep the LTR layout rather than get used to a new one!

 
  1. Assume that everyone is using the imperial system.
  2. Assume that everyone is in the same timezone - this applies to all devs.
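For the timezone point, the usual approach is to store one unambiguous instant in UTC and convert it per user when displaying. A sketch using fixed UTC offsets to stay self-contained; the offsets and the meeting time are assumptions for the example, and real code would use a timezone database (for instance the standard `zoneinfo` module) so that daylight saving changes are handled.

```python
from datetime import datetime, timedelta, timezone

# One instant, stored once, in UTC.
meeting_utc = datetime(2017, 6, 1, 15, 0, tzinfo=timezone.utc)

# Fixed offsets chosen for illustration only.
tokyo = timezone(timedelta(hours=9), "JST")
new_york_summer = timezone(timedelta(hours=-4), "EDT")

print(meeting_utc.astimezone(tokyo).isoformat())            # 2017-06-02T00:00:00+09:00
print(meeting_utc.astimezone(new_york_summer).isoformat())  # 2017-06-01T11:00:00-04:00
```

The same meeting falls on different calendar days for the two users, which a timestamp without an offset cannot express.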
 

1.) They forget that non-QWERTY keyboard layouts exist

2.) RTL-languages usually break the UI completely or at least look very out of place

3.) They forget that some cultures don't use , for thousands or . for decimals.
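A small sketch of point 3. The `SEPARATORS` table below is hand-written for illustration and is not taken from any locale database; a real application would get these conventions from its platform's locale data instead of hard-coding them.

```python
# (thousands separator, decimal separator) per locale tag. Illustrative values only.
SEPARATORS = {
    "en-US": (",", "."),
    "de-DE": (".", ","),
    "fr-FR": ("\u00a0", ","),  # a non-breaking space between groups
}


def format_number(value: float, locale_tag: str, decimals: int = 2) -> str:
    """Format value with the grouping and decimal marks of the given locale."""
    group, decimal = SEPARATORS[locale_tag]
    us_style = f"{value:,.{decimals}f}"  # e.g. 1,234,567.89
    # translate() swaps both characters in one pass, so they cannot clobber each other.
    return us_style.translate(str.maketrans({",": group, ".": decimal}))


for tag in SEPARATORS:
    print(tag, format_number(1234567.891, tag))
```

The reverse direction matters just as much: "1.234" typed by a German user is one thousand two hundred thirty-four, not a little over one.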

 
 

Assume first and last names are at least 3 letters long when validating form inputs.
-Bo
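A permissive check along these lines, as a sketch only: `is_plausible_name` is a made-up helper, and the only assumptions it makes are that a name is not empty and contains at least one letter in some script.

```python
def is_plausible_name(name: str) -> bool:
    """Accept any non-empty name that contains at least one letter."""
    stripped = name.strip()
    # No minimum length beyond one character, no A-Z restriction.
    return len(stripped) >= 1 and any(ch.isalpha() for ch in stripped)


for candidate in ("Bo", "Ng", "李", "O'Neil", "Zoë", "", "   "):
    print(repr(candidate), is_plausible_name(candidate))
```

`str.isalpha()` is true for letters in any script, so this also avoids rejecting names with so-called special characters.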

 

Assuming adding Internationalization is mostly a technical problem.

"[Django|Rails|Whatever] templates support i18n, so we're basically done"

Many of the problems in this thread (ignorance of LTR/RTL, address forms, weird validation rules) have their origin in this frame of thought, imho.

And this is not limited to Americans. For all intents and purposes, German seems to be English with longer words and weird decimals. Even our umlauts and the ß are in Latin-1. We even have federal states too in the address fields, if they're not select boxes ;)

 

They forget that in some languages most words are longer than in English

 
 

They forget to do it, until they try to take it into other countries.

 

Well, platform vendors make it hard sometimes. It's not that straightforward to change the app locale in-app only (talking about Android, iOS).

 

I don't know about Android, but it's not that hard to override on iOS. You literally write a valid iOS language code to a dictionary.

 

I deleted my comment by mistake! I saw it duplicated, when I deleted the second one, the first one with your reply was deleted as well :/
Anyway, I know it's provided on both platforms, but, IMHO, it's not straightforward.
