Facebook AI is introducing M2M-100, the first multilingual machine translation (MMT) model that translates between any pair of 100 languages without relying on English data.
Earlier multilingual models relied on English data because it is the most widely available. To translate Chinese to French, for example, the previous best models train on Chinese-to-English and English-to-French data. M2M-100 instead trains directly on Chinese-to-French data to better preserve meaning. It outperforms English-centric systems by 10 points on BLEU, a widely used metric for evaluating machine translation.
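To make the "preserve meaning" point concrete, here is a toy sketch of what can be lost when a translation passes through English. It is not how M2M-100 or any real system works internally: the three word-level lexicons and both function names are invented for illustration. It only shows that English "you" merges the Chinese informal and formal pronouns, so an English pivot cannot recover the French tu/vous distinction.

```python
# Toy word-level lexicons, invented for illustration only (not real model data).
ZH_TO_EN = {"你": "you", "您": "you"}   # English has one second-person pronoun
EN_TO_FR = {"you": "tu"}                # the pivot step must pick a single French form
ZH_TO_FR = {"你": "tu", "您": "vous"}   # direct data keeps the informal/formal split


def translate_via_english(word: str) -> str:
    """Hypothetical pivot translation: Chinese -> English -> French."""
    return EN_TO_FR[ZH_TO_EN[word]]


def translate_direct(word: str) -> str:
    """Hypothetical direct translation: Chinese -> French."""
    return ZH_TO_FR[word]


for word in ("你", "您"):
    print(word, translate_via_english(word), translate_direct(word))
```

In this toy setup the formal pronoun 您 comes out as "tu" through the English pivot but as "vous" when translated directly.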
M2M-100 is trained on a total of 2,200 language directions, 10 times more than the previous best English-centric multilingual models. Deploying M2M-100 will improve the quality of translations for billions of people, especially those who speak low-resource languages.
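The direction counts can be checked with a little arithmetic. The sketch below assumes 100 languages with one of them standing in for English, and uses placeholder language codes. The 198 figure is only an upper bound for a model that always has English on one side over those same 100 languages; the announcement does not say exactly which directions earlier models covered.

```python
from itertools import permutations

NUM_LANGUAGES = 100          # from the announcement
TRAINED_DIRECTIONS = 2200    # from the announcement

# Placeholder language codes; lang0 stands in for English (an assumption).
languages = [f"lang{i}" for i in range(NUM_LANGUAGES)]
ENGLISH = languages[0]

# A "direction" is an ordered (source, target) pair of distinct languages.
all_directions = list(permutations(languages, 2))

# Directions available when English must be the source or the target.
english_centric = [(src, tgt) for src, tgt in all_directions if ENGLISH in (src, tgt)]

print(len(all_directions))                                    # 9900
print(len(english_centric))                                   # 198
print(round(TRAINED_DIRECTIONS / len(english_centric), 1))    # 11.1
```

Under these assumptions, 2,200 directions is roughly 11 times the English-centric maximum of 198, which is in line with the "10 times" figure, and it covers a little under a quarter of the 9,900 possible directions.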
You can read about the details here.