Google Translate: Ten Years Later

I remember when Google Translate went live. Hard to believe it was 10 years ago.

I remember thinking that this relatively new technology, known as Statistical Machine Translation (SMT), was going to change everything.

At the time, many within the translation community were dismissive of Google Translate. Some viewed it as a passing phase. Very few people said that machine translation would ever amount to much more than a novelty.

But I wasn’t convinced that this was a novelty. As I wrote in 2007, I believed that the technologists were taking over the translation industry:

SMT is not by itself going to disrupt the translation industry. But SMT, along with early adopter clients (by way of the Translation Automation Users Society), and the efforts of Google, are likely to change this industry in ways we can’t fully grasp right now.

Here’s a screen grab of Google Translate from 2006, back when Chinese, Japanese, Korean and Arabic were still in BETA:

google_translate_May2006

Growth in languages came in spurts, but roughly at a pace of 10 languages per year.

google_translate_growth

And here is a screen grab today:

google_translate_May2016

Google Translate has some impressive accomplishments to celebrate:

  • 103 languages supported
  • 100 billion words translated per day
  • 500 million users around the world
  • Most common translations are between English and Spanish, Arabic, Russian, Portuguese and Indonesian
  • Brazilians are the heaviest users of Google Translate
  • 3.5 million people have made 90 million contributions through the Google Translate Community

The success of Google Translate illustrates that we will readily accept poor to average translations over no translations at all.

To be clear, I’m not advocating that companies use machine translation exclusively. Machine translation can go from utilitarian to ugly when it comes to asking someone to purchase something. If anything, machine translation has shown to millions of people just how valuable professional translators truly are. 

But professional translators simply cannot translate 100 billion words per day.
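To put that number in perspective, here is a rough back-of-the-envelope calculation. The 100 billion figure comes from Google; the daily output of a professional translator (2,500 words) is purely my assumption for illustration, and real throughput varies widely by language and subject matter.

```python
# Back-of-the-envelope estimate. The per-translator throughput is an
# assumption for illustration, not a figure from Google.
GOOGLE_WORDS_PER_DAY = 100_000_000_000   # 100 billion words per day
TRANSLATOR_WORDS_PER_DAY = 2_500         # assumed output of one full-time professional

translators_needed = GOOGLE_WORDS_PER_DAY // TRANSLATOR_WORDS_PER_DAY
print(f"{translators_needed:,} full-time translators")  # 40,000,000 full-time translators
```

Even if the assumed throughput is off by a factor of several, the answer stays in the tens of millions of people working every single day.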

Many large companies now use machine translation, some translating several billion words per month.

Companies like Intel, Microsoft, Autodesk, and Adobe now offer consumer-facing machine translation engines. Many other companies are certain to follow.

Google’s investment in languages and machine translation has been a key ingredient in its consistent position as the best global website according to the annual Report Card.

Google Translate has taken translation “to the people.” It has opened doors and eyes and raised language expectations around the world.

I’m looking forward to the next 10 years.

In search of a better translation icon

A few years ago I wrote about the translation icon and its many variations at that point in time.

I thought now would be a good time to revisit this icon.

Let’s start with Google Translate. This icon has not changed in substance over the years but it has been streamlined a great deal.

Here is the icon used for its app:

google-translate-icon

Microsoft uses a similar icon across its website, apps, and APIs:

microsoft_translate

I’m not a fan of this icon, despite how prevalent it has become.

Before I go into why exactly, here is another app icon I came across:

another-translate-icon

These first three icons display specific language pairs, which could be interpreted as showing preference for a given language pair. This is what I find problematic.

Why can’t a translate icon be language agnostic?

Here is how SDL approaches the translation icon:

sdl_translation

Although the icon is busy, I’m partial to what SDL is doing here, as this icon does not display a given script pair.

Here is another icon, a globe, from the iTranslate app:

iTranslate_app

The counter-argument to a globe icon is this: It is used EVERYWHERE. And this is true. Facebook, for example, uses the globe icon for notifications, which I’ve never understood. Nevertheless, the globe icon can successfully deliver different messages depending on context. In the context of a mobile app icon, I think a globe icon works perfectly well.

So the larger question here is whether or not a language pair is required to communicate “translation.” 

Google and Microsoft certainly believe that a language pair is required, which is where we stand right now. I’d love to see this change. I think we can do better.

The humans behind machine translation

Google Translate is the world’s most popular machine translation tool.

And, despite predictions by many experts in the translation industry, the quality of Google Translate has improved nicely over the past decade. Not so good that professional translators are in any danger of losing work, but good enough that many of these translators will use Google Translate to do a first pass on their translation jobs.

But even the best machine translation software can only go so far on its own. Eventually humans need to assist.

Google has historically been averse to any solution that required lots and lots of in-person human input — unless these humans could interact virtually with the software.

Behind Google’s machine translation software are humans.

In the early days of Google Translate, there were very few humans involved. The feature that identified languages based on a small snippet of text was in fact developed by one employee as his 20% project.

Google Translate is a statistical machine translation engine, which means it relies on algorithms that digest millions of human-translated text pairs and learn, statistically, which words and phrases tend to correspond across languages. These algorithms, over time, have greatly improved the quality of Google Translate.
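To give a feel for what "statistical" means here, below is a minimal sketch of the classic textbook approach (a simplified IBM Model 1, without the NULL word). It is not Google's actual system, which is far more sophisticated; the four-sentence English/Spanish corpus and all function names are invented for illustration.

```python
from collections import defaultdict

# Toy English/Spanish parallel corpus, invented for this illustration.
corpus = [
    ("the house", "la casa"),
    ("the book", "el libro"),
    ("a house", "una casa"),
    ("a book", "un libro"),
]
pairs = [(src.split(), tgt.split()) for src, tgt in corpus]


def train_model1(pairs, iterations=10):
    """Estimate t(target | source) with simplified IBM Model 1 EM (no NULL word)."""
    target_vocab = {f for _, tgt_words in pairs for f in tgt_words}
    # Start uniform over every word pair that co-occurs in some sentence pair.
    t = {(f, e): 1.0 / len(target_vocab)
         for src_words, tgt_words in pairs
         for f in tgt_words for e in src_words}
    for _ in range(iterations):
        count = defaultdict(float)  # expected (target, source) alignment counts
        total = defaultdict(float)  # expected alignment counts per source word
        # E-step: share each target word among the source words in its sentence.
        for src_words, tgt_words in pairs:
            for f in tgt_words:
                norm = sum(t[(f, e)] for e in src_words)
                for e in src_words:
                    frac = t[(f, e)] / norm
                    count[(f, e)] += frac
                    total[e] += frac
        # M-step: renormalise so probabilities for each source word sum to 1.
        t = {(f, e): c / total[e] for (f, e), c in count.items()}
    return t


def best_translation(t, source_word):
    """Return the target word with the highest estimated probability."""
    candidates = {f: p for (f, e), p in t.items() if e == source_word}
    return max(candidates, key=candidates.get)


t = train_model1(pairs)
for word in ("house", "book"):
    print(word, "->", best_translation(t, word))
# house -> casa
# book -> libro
```

Nobody tells the program that "house" means "casa." It works that out because the two words keep showing up together across sentence pairs, which is also why more translated text tends to mean better output.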

But algorithms can only take machine translation so far.

Eventually humans must give these algorithms a little help.

Google Translate Community

So it’s worth mentioning that Google relies on “translate-a-thons” to recruit people to help improve translation quality.

According to Google, more than 100 of these events have been held, resulting in the addition of more than 10 million words:

It’s made a huge difference. The quality of Bengali translations are now twice as good as they were before human review. While in Thailand, Google Translate learned more Thai in seven days with the help of volunteers than in all of 2014.

Of course, Google has long relied on a virtual community of users to help improve translation and search results. But actual in-person events are a relatively new level of outreach for the company, and I’m glad to see it.

This type of outreach will keep Google Translate at the forefront of the MT race.

If you want to get involved, join Google’s Translate Community.