Massively multilingual

It seems like only yesterday (2009, actually) that Google Translate reached 50 languages.

And for the past few years it appeared that Google had largely plateaued in the low 100s. It currently supports 133 languages, having most recently added Dogri, Ewe and Bambara (and, yes, these languages are new to me as well).

Make no mistake, 133 is a lot of languages. Most corporate websites are mired below 10 languages. And even the best global websites support an average of just about 34 languages.

Which is why this recent development from the translation team at Google is big news. According to The Verge:

Google has announced an ambitious new project to develop a single AI language model that supports the world’s “1,000 most spoken languages.” As a first step towards this goal, the company is unveiling an AI model trained on over 400 languages, which it describes as “the largest language coverage seen in a speech model today.”

At 400 languages, you’re roughly on par with Wikipedia (no doubt a benchmark). See: Wikipedia and the internet language chasm.

When will we see Google Translate at 400 languages? Hopefully soon.

Language access is one of the great barriers to the Internet. I realize machine translation can be messy and sometimes laughable. But I’ll take messy over nothing at all, and I suspect I’m not alone.
