iPhone app globalization: Ready for take-off

The WSJ has an article about iPhone developers taking their apps global.

It’s very early days, but it’s safe to say that localization vendors are drooling over the possibilities. Although many apps aren’t going to present much in the way of translation revenue, the localization engineering work can be quite substantial.

I’m currently aware of two vendors that have been doing a good job of specializing in this area.

Some app developers I’ve spoken with still question the degree to which they must localize their apps. After all, many report significant sales in markets around the world WITHOUT any localization investment on their part. So they naturally want to know what additional sales they’re going to get for their investment. There are many factors to consider. The ROI of a 99-cent app could be tough to achieve if you’ve got to completely internationalize your app. If your app is already internationalized, the ROI is much easier to achieve.
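As a back-of-the-envelope illustration of that ROI gap, here’s a quick break-even sketch. All dollar figures are hypothetical, not industry data; the only hard number assumed is the App Store’s standard split, which leaves the developer 70% of the sale price.

```python
# Back-of-the-envelope break-even for a localization spend.
# All cost figures below are hypothetical illustrations.
import math

def breakeven_units(localization_cost, price, developer_share=0.70):
    """Additional unit sales needed to recoup a localization spend.

    developer_share is the fraction of the sale price the developer
    keeps after the App Store's cut (the standard split leaves 70%).
    """
    net_per_sale = price * developer_share
    return math.ceil(localization_cost / net_per_sale)

# Full internationalization + translation vs. translation alone
# (costs are invented for illustration):
print(breakeven_units(10_000, 0.99))  # app still needs internationalization work
print(breakeven_units(1_500, 0.99))   # app is already internationalized
```

At 99 cents, every extra thousand dollars of engineering pushes the break-even point out by well over a thousand additional unit sales, which is why already-internationalized apps clear the bar so much more easily.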

But China and Japan, as noted in the WSJ article, could be what pushes more and more developers into finally opening their checkbooks.

Here’s what one iPhone developer says:

“We definitely have plans to get all our games localized,” said Andrew Stein, PopCap’s director of mobile business development. “We may see more than half of our sales come from outside of the U.S.” PopCap’s $2.99 “Plants vs. Zombies” tower defense game is currently No. 1 in China, according to App Store rankings.

The article stresses that few apps are currently localized — and I will second that. In fact, the only apps that I’m aware of that support more than 20 languages are Apple’s own default apps. Outside of Apple, PayPal and Google apps appear to be the most global overall.

Here’s a rough tally of what I’ve seen so far:

  • PayPal Mobile: 15 languages
  • Google Mobile: 15 languages
  • Facebook Mobile: 7 languages
  • Monopoly: 6 languages

What am I missing here?

When machine translation and volunteer translators collide: A YouTube/TED case study

Google recently announced a rather nifty feature in YouTube: Auto-translation of auto-generated video captions.

So not only is Google automatically transcribing the text of its videos, it’s also providing translations — via machine translation. Now I just need a “machine reader” so I can process all of this new content — as I’m running out of hours in a day.

Google’s blog post notes:

In the next few months we expect over 150,000 YouTube channels to implement auto-captioning with translation. This is just the beginning and we hope that all YouTube content will soon be enjoyed by all YouTube users, regardless of what language they speak.

One of the examples cited is a TED talk by author Elizabeth Gilbert, shown here:

Here’s how you enable the auto-translation — hover your mouse over the Closed Caption icon and click the Translate Captions link.

I found the language-selection overlay (shown below) challenging to scroll through. But I suspect this feature will be automated eventually, similar to how Google’s Chrome browser has automated translation based on your language setting.

What I find interesting about the Gilbert talk is that TED has recruited its own army of translators — human translators — to do the same thing but in higher quality.

Here is the TED-translated version of the same talk:

I think it’s safe to assume that the volunteers are going to offer a much higher-quality translation of the video. But TED does not (yet) support the breadth of languages that Google supports. So while TED has the advantage in quality, Google has the advantage in languages.

But the larger question is to what extent Google will make the TED-translated video as easy to find as its own YouTube version.

I did a Google search today and both videos emerged at the top of the results:

I believe this scenario raises a few interesting issues that will need to be addressed in the years ahead:

  • How to easily differentiate between content that has been machine translated vs. human translated
  • How to quickly discover which content is available in which languages
  • Will the crowd continue to be as enthused about translating content by hand when Google provides the same service, albeit at lower quality, for free?

Is Google the best machine translation engine? It depends…

Two weeks ago, I introduced Ethan Shen and his project to analyze the three major free machine translation (MT) engines — Google, Microsoft, and Yahoo! Babelfish — by relying on translator reviews.

Ethan has provided me with a mid-point summary of results, which I’ve included below. I was surprised to find that Microsoft and Babelfish are beating Google on some language pairs, as well as on shorter text strings. Although Google is emerging as the overall winner — and receiving some much-deserved attention from the media — it’s nice to see some healthy competition.

That said, quality is only one piece of the puzzle. The other piece — perhaps much more important — is usability. Now that Google has embedded its MT engine into Gmail and Reader — and now its Chrome client — I find I’m using Google exclusively as my MT engine.

Here are Ethan’s findings so far (emphasis mine):

At the highest level, it appears that survey participants prefer Google Translate’s results across the board.

In a few languages (Arabic, Polish, Dutch) the preference is overwhelming, with votes for Google doubling its nearest competitor.

However, once you remove voters who have self-defined their fluency in the source or target language as “limited,” the contest becomes closer for some of the heavily trafficked languages. For example:

  • Microsoft Bing Translator leads in German
  • Yahoo! Babelfish leads in Chinese
  • Google maintains its lead in Spanish, Japanese, and French

Observing only the self-defined “limited fluency” voter reveals a strong brand bias. If your fluency in the target translation language is limited, it would stand to reason your ability to assess the quality of the translation is very limited. And yet…

  • Limited-fluency voters chose Google over Bing by 2 to 1
  • They also chose Google over Yahoo! Babelfish by 5 to 1

As I had guessed, Yahoo! and Microsoft’s hybrid rules-based MT model performed better on shorter text passages.

For phrases below 50 characters, Google’s lead in Spanish, Japanese, and French disappears. And Microsoft’s lead in German widens.

Beyond 50 characters, Google’s relative performance seems to improve across the board.

For passages that are only one sentence, the same effect is seen, though to a lesser extent than under 50 characters.

On March 4th, we made a few changes to our survey — hiding the brands and randomizing the positions of the text results before voting. Since then, we have not yet collected enough data to draw conclusions, but Babelfish seems to be receiving the biggest boost, perhaps showing the effects of the recent neglect of that tool.
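The fluency- and length-bucketed comparisons in Ethan’s findings could be tallied from raw votes along these lines. This is purely a hypothetical sketch of the analysis, not Ethan’s actual code; the field names and sample data are invented.

```python
# Hypothetical sketch: tallying per-engine preference votes, filtering
# out self-reported "limited" fluency voters and bucketing by passage
# length. Field names and sample data are invented for illustration.
from collections import Counter

votes = [
    {"winner": "Google",    "fluency": "fluent",  "chars": 120},
    {"winner": "Microsoft", "fluency": "fluent",  "chars": 34},
    {"winner": "Babelfish", "fluency": "native",  "chars": 45},
    {"winner": "Google",    "fluency": "limited", "chars": 80},
]

def tally(votes, exclude_fluency=("limited",), max_chars=None):
    """Count wins per engine, dropping excluded fluency levels and,
    optionally, passages longer than max_chars."""
    kept = (v for v in votes
            if v["fluency"] not in exclude_fluency
            and (max_chars is None or v["chars"] <= max_chars))
    return Counter(v["winner"] for v in kept)

print(tally(votes))                # fluent/native voters, all lengths
print(tally(votes, max_chars=50))  # short passages only, per the 50-char cut
```

Slicing the same vote list two ways — once unrestricted, once at the 50-character cutoff — is all it takes to surface the kind of shift Ethan reports, where Google’s overall lead narrows on short phrases.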

Clearly, Ethan needs more data to arrive at more concrete conclusions. If you’re a translator and you want to lend a hand, here is the voting site.

PS: Here’s an interview with Google’s MT guru Franz Josef Och.