Will Google Kill the Translation Industry?

Last week I received a “Factory Tour” invite from Google but didn’t give it much thought. I wish I had, because I missed a preview of the company’s ambitious machine translation (MT) efforts.

Thankfully, Philipp Lenssen includes a great recap of the Webcast on his site, Google Blogoscoped. It’s worth a read.

Apparently Google is taking massive libraries of source and target text and dumping them into a database where the relationships between source and target text are analyzed and memorized. This database is then leveraged to translate new source text. Philipp explains it better than I…

This is the Rosetta Stone approach of translation. Let’s take a simple example: if a book is titled “Thus Spoke Zarathustra” in English, and the German title is “Also sprach Zarathustra”, the system can begin to understand that “thus spoke” can be translated with “also sprach”. (This approach would even work for metaphors; surely Google researchers will take the longest available phrase that has high statistical matches across different works.) All it needs is someone to feed the system the two books and to teach it the two are translations from language A to language B, and the translator can create what Franz Och called a “language model.” I suspect it’s crucial that the body of text is immensely large, or else the system in its task of translating would stumble upon too many unlearned phrases. Google used the United Nations Documents to train their machine, and in all fed it 200 billion words. This is brute force AI, if you want: it works on statistical learning theory only and has not much real “understanding” of anything but patterns.

This sure is brute force MT. I’ll be very interested to know just how long a string of text Google can effectively translate. More important, how will Google handle the flood of brand names, oddball terms, and local slang?
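To make the phrase-matching idea Philipp describes a bit more concrete, here is a minimal sketch in Python. It is not Google's system; the three-sentence corpus, the function names, and the scoring rule (a Dice coefficient over sentence-level co-occurrence, with ties broken in favor of the longest phrase) are all assumptions chosen for illustration.

```python
from collections import Counter
from itertools import product

# Hypothetical toy corpus of (English, German) sentence pairs.
CORPUS = [
    ("thus spoke zarathustra", "also sprach zarathustra"),
    ("thus spoke the teacher", "also sprach der lehrer"),
    ("the teacher laughs", "der lehrer lacht"),
]


def ngrams(tokens, max_len):
    """Return the set of all word n-grams of length 1..max_len."""
    return {
        " ".join(tokens[i:i + n])
        for n in range(1, max_len + 1)
        for i in range(len(tokens) - n + 1)
    }


def train(parallel_pairs, max_len=2):
    """Count in how many sentence pairs each phrase and phrase pair occurs."""
    src_count, tgt_count, pair_count = Counter(), Counter(), Counter()
    for src, tgt in parallel_pairs:
        src_phrases = ngrams(src.lower().split(), max_len)
        tgt_phrases = ngrams(tgt.lower().split(), max_len)
        src_count.update(src_phrases)
        tgt_count.update(tgt_phrases)
        pair_count.update(product(src_phrases, tgt_phrases))
    return src_count, tgt_count, pair_count


def best_translation(phrase, model):
    """Pick the target phrase with the highest Dice score.

    Ties are broken by preferring the longer target phrase, loosely
    mirroring the "longest available phrase" idea in the quote above.
    Returns None if the phrase was never seen in training.
    """
    src_count, tgt_count, pair_count = model
    candidates = []
    for (s, t), together in pair_count.items():
        if s != phrase:
            continue
        dice = 2 * together / (src_count[s] + tgt_count[t])
        candidates.append((dice, len(t.split()), t))
    if not candidates:
        return None
    return max(candidates)[2]


model = train(CORPUS)
print(best_translation("thus spoke", model))   # also sprach
print(best_translation("the teacher", model))  # der lehrer
```

With only three sentence pairs, the sketch already gets things wrong: asking for “zarathustra” alone returns “sprach zarathustra”, because the data is too thin to separate the two words. That is the small-corpus problem in miniature, and presumably why a training set on the order of 200 billion words matters.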

But let’s just assume that Google does make this ambitious project a success; how will this affect the translation industry in general and Web globalization in particular?

Assuming this all does work moderately well, companies will be incented to pull all text out of graphics to make the most of this free translation service. After all, if Google is providing users in Vietnam a free translation of your company’s Web site, why not do what you can to make everything translatable?

This would also be yet another blow to Macromedia Flash, as if the emergence of AJAX weren’t doing enough damage already.

But what about the impact on translation vendors? I don’t think they have much to worry about, yet. The need for high-quality, human-edited translation isn’t going away anytime soon. Long term, however, all bets are off. Google should be on every translation vendor’s radar; this company has lots of money, lots of smarts, and lots of incentive to provide the world’s text in all the world’s languages.

Is Google Losing Its Global Edge?

Suw Charman says “I still think there’s a fundamental mental block regarding the rest of the world from a lot of American companies and developers.” She points out the new Google Maps application and its glaring lack of any country besides the US of A.

I was thinking the same thing myself the other day when I first tried it out.

escondido.jpg

First, I looked up my home, as I imagine most people do, then I scrolled west and west and west, thinking “Shouldn’t Japan be coming up pretty soon?” But it didn’t, just lots of blue water…
maps_google.jpg

In Google’s defense, I’ll assume this is another one of their “beta” projects. It is pretty nifty and I do hope other countries are forthcoming.

Google Is Best Global Web Site (Again)

All Web sites are, by default, global. But which Web sites do the best job of truly speaking to the world? That is, which Web sites support the most languages, make navigation effortless for non-English speakers, and provide Web users around the world with fast-loading Web pages?

These are the questions I began asking a few years ago when my firm produced the first report on this topic, The Web Globalization Report Card. We studied 121 Web sites, ranging from Amazon to GE to Sony.

Google emerged as the best site overall.

Yesterday we published the 2005 Web Globalization Report Card and, sure enough, Google is tops once again. Frankly, I wasn’t surprised to see Google at the top of the list. It’s not a perfect Web site, but it does a great many things right — from providing users around the world with a fast-loading Web page (much faster than Yahoo!) to using a consistent, global interface to supporting 97 different languages. As I’ve said before, Google is arguably the most global commercial Web site yet developed.

But it is not the only successful global Web site out there. Here are the top 10 Web sites:

1. Google
2. HP
3. American Express
4. Philips
5. Skype
6. Ericsson
7. Procter & Gamble
8. Cisco Systems
9. IBM
10. E*TRADE

Companies like Wal-Mart, Coca-Cola, Qualcomm and Disney did not fare so well. All finished near the bottom of our rankings. Being a global company or having a global brand does not ensure a successful global Web site.

If your company is planning to dive into the Web globalization waters, I encourage you to take the time to review these 10 Web sites.

A Closer Look at Google’s Global Trajectory

Google says that half of its Internet traffic emanates from outside the US. While this is significant, what really matters to Google is where the revenues emanate from.

Now that Google is on the verge of going public, it has finally coughed up some numbers. In 2003, roughly 25% of Google’s revenues came from outside the US, shown here:

google_revenues_net.jpg

Judging by 2004 numbers thus far, I would predict that international revenues will surpass US revenues by Q1 of 2006. This trend becomes more apparent when you view geographic revenues as percentages of the whole, shown below:

google_revenues_geo.jpg

It’s not hard to see the international column surpassing the US column fairly quickly. As I’ve written before, Google is probably the most global commercial Web site ever created; it offers more than 90 localized Web sites. Every one of these sites is a potential source of advertising revenue. So it is not a question of if international revenues will surpass US revenues, but when.

Googling China

The search engine war in China has long been heated, but Yahoo! recently upped the stakes with the launch of a new search portal: www.yisou.com.

yahoo_china_yisou.jpg

It sure looks a lot like Google’s search portal, underscoring the dramatic success Google has enjoyed in this market over the past few years.

Consider these impressive stats from The Miami Herald:

China is currently second to the United States in Internet users (at 80 million in 2003 compared to our 185 million) but will surpass the United States within five years, according to Forbes Global. On any given day, nine of the world’s 25 busiest websites are situated in China. Yahoo! and eBay are coming on strong in competition with locally entrenched portals. Even without China-based offices, Google attracts 40 percent of China’s search users.

Clearly, the search portal that wins in China will have the lead in users globally. While Google has the lead today, I suspect that an entirely new search engine, likely based in China itself, may be that leader five years from now.

The Globalization of Google

Amy Campbell alerted me to a very interesting graphic on the Google Zeitgeist page. It tracks the languages used to access Google over the past two years:

google_lang.gif

Google handles more than 200 million queries a day from around the world. Increasingly, these queries are not in English. Over the past few years, Google has aggressively localized its search engine for more than 60 languages. These language-specific search engines are very important to Google’s continued growth, since the majority of new Internet users are not native English speakers.

Keep a close eye on that tiny purple streak representing Chinese; it’s sure to expand. While there are only about 100 million German speakers in the world, there are well over a billion Chinese speakers. Also expect to see Arabic (200 million speakers) make an entrance in a few years.

Google began in 1998 as an English-language search engine. My, how times — and the Internet — have changed. And, if you’re interested, Google is looking for an International Webmaster.