Amazon announced earlier this week that it had made its home-grown Amazon Translate service generally available.
As with other Amazon Web Services (AWS) offerings, you can use the service across websites, apps, and text-to-speech workflows. I should stress that this is a “neural” machine translation service, an approach that has proven surprisingly effective and whose output has grown more natural sounding over time. Google and others are also investing heavily in neural MT.
And you can give it a free test drive; according to AWS, “The first 2 million characters in each monthly cycle will be free for the first 12 months starting the day you first use the service.”
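For readers who want to try that test drive from code, here is a minimal sketch in Python. It assumes a client that behaves like `boto3.client("translate")`, whose `translate_text` call takes `Text`, `SourceLanguageCode`, and `TargetLanguageCode`. The helper name `translate_with_budget` and the simple character counter are my own illustration, and counting with `len(text)` is an assumption that may not match exactly how AWS meters usage.

```python
FREE_CHARS_PER_MONTH = 2_000_000  # free-tier figure quoted from AWS above


def translate_with_budget(client, text, target_lang, used_chars=0, source_lang="en"):
    """Translate `text` and return (translation, characters used this month).

    `client` is assumed to behave like boto3.client("translate").
    `used_chars` is a running total that the caller keeps (hypothetical bookkeeping).
    Raises ValueError instead of sending a request that would pass the free allowance.
    """
    new_total = used_chars + len(text)  # assumption: one character = one billed unit
    if new_total > FREE_CHARS_PER_MONTH:
        raise ValueError("request would exceed the free monthly allowance")
    response = client.translate_text(
        Text=text,
        SourceLanguageCode=source_lang,
        TargetLanguageCode=target_lang,
    )
    return response["TranslatedText"], new_total


# Usage with real credentials configured (not run here):
#   import boto3
#   client = boto3.client("translate")
#   spanish, used = translate_with_budget(client, "Hello, world", "es")
```

Passing the client in, rather than creating it inside the function, keeps the sketch easy to try out with a stand-in object before pointing it at a real account.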
The major limitation right now is languages: it supports English into just six languages, which feels rather retro compared to MT services from SDL and Google. Google Translate, by comparison, is 12 years old and supports 100+ languages (of varying degrees of quality).
And more languages are coming. I can’t comment on the quality of the translation but would love to hear what others have experienced so far.
As I’ve noted in the 2018 Web Globalization Report Card, machine translation continues to gain fans among global brands, not just internally but externally. That is, visitors to websites can translate content themselves, a feature I have long recommended for a number of reasons.
So it’s great to see another machine translation service available at scale for organizations of all sizes.
PS: Interesting to see a recommendation from Lionbridge on the home page — a happy Amazon Translate client.
I’m excited to announce the publication of The 2018 Web Globalization Report Card. This is the most ambitious report I’ve written so far and it sheds light on a number of new and established best practices in website globalization.
First, here are the top-scoring websites from the report:
Regular readers of this blog will notice that Google was unseated this year by Wikipedia. Wikipedia, with support for an amazing 298 languages, improved its global navigation over the past year, which pushed it into the top spot. And because Wikipedia is completely user-supported, its language count indicates that there is great demand for languages on the Internet, a demand to which very few companies have yet responded in kind.
Google could still stand to improve in global navigation, as could Facebook.
Other highlights from the top 25 list include:
Consumer goods companies such as Pampers and Nestlé are a sign that non-tech companies are making positive strides in improving their website globalization skills.
As a group, the top 25 websites support an average of more than 80 languages (up from 54 last year); but note that we added a few websites that made a big impact on that average.
Luxury brands such as Gucci and Ralph Lauren continue to lag in web globalization — from poor support for languages to inadequate localization.
The average number of languages supported by all 150 global brands is now 32.
The data underlying the Report Card is based on studying the leading global brands and world’s largest companies — 150 companies across more than 20 industry sectors. I began tracking many of the companies included in this report more than a decade ago and am happy to share insights into what works and what doesn’t.
I’ll have much more to share in the weeks and months ahead. If you have any questions about the report, please let me know.
Congratulations to the top 25 companies and to the people within these companies who have long championed web globalization.
I remember when Google Translate went live. Hard to believe it was 10 years ago.
I remember thinking that this relatively new technology, known as Statistical Machine Translation (SMT), was going to change everything.
At the time, many within the translation community were dismissive of Google Translate. Some viewed it as a passing phase. Very few people said that machine translation would ever amount to much more than a novelty.
But I wasn’t convinced that this was a novelty. As I wrote in 2007, I believed that the technologists were taking over the translation industry:
SMT is not by itself going to disrupt the translation industry. But SMT, along with early adopter clients (by way of the Translation Automation Users Society), and the efforts of Google, are likely to change this industry in ways we can’t fully grasp right now.
Here’s a screen grab of Google Translate from 2006, back when Chinese, Japanese, Korean and Arabic were still in BETA:
Growth in languages came in spurts, but roughly at a pace of 10 languages per year.
Most common translations are between English and Spanish, Arabic, Russian, Portuguese and Indonesian
Brazilians are the heaviest users of Google Translate
3.5 million people have made 90 million contributions through the Google Translate Community
The success of Google Translate illustrates that we will readily accept poor to average translations versus no translations at all.
To be clear, I’m not advocating that companies use machine translation exclusively. Machine translation can go from utilitarian to ugly when it comes to asking someone to purchase something. If anything, machine translation has shown to millions of people just how valuable professional translators truly are.
But professional translators simply cannot translate 100 billion words per day.
Many large companies now use machine translation, some translating several billion words per month.
Companies like Intel, Microsoft, Autodesk, and Adobe now offer consumer-facing machine translation engines. Many other companies are certain to follow.
Google’s investment in languages and machine translation has been a key ingredient to its consistent position as the best global website according to the annual Report Card.
Google Translate has taken translation “to the people.” It has opened doors and eyes and raised language expectations around the world.
It has been a decade since Google Translate took machine translation to the masses — a topic for a future post.
But most companies will not be using Google Translate anytime soon to power their machine translation efforts. They want more control over customizing the engine, leveraging existing translation memories, and other capabilities that Google doesn’t yet offer. So they turn to vendors such as Microsoft, SDL, and SYSTRAN, a company that pioneered machine translation decades ago.
SYSTRAN was acquired by a Korean machine translation company in 2014 and earlier this year launched an online machine translation platform called SYSTRAN.io. This platform allows companies to leverage machine translation (and other services) via API. In other words, you don’t have to purchase an expensive enterprise license or host any software — you just connect your software to SYSTRAN’s engine. And, perhaps best of all, SYSTRAN has allowed anyone to take a free test drive of roughly a million characters of translation per month.
To learn more, here’s a Q&A I recently conducted with the company:
What are the benefits/solutions that SYSTRAN.io provides?
SYSTRAN.io allows software developers, customer experience (CX) companies, multi-national marketing departments, social media and marketing technology companies, and online gaming developers to access the same software to develop multilingual applications that were once only available to large, international companies.
How many language pairs are supported?
There are up to 50 languages supported, depending on the particular module.
What is the most popular usage model (so far) for SYSTRAN.io? In terms of volume of user queries:
So far, the most popular usage is for language translation on mobile devices.
In terms of numbers of solutions built:
Language translation within customer support forums is strong because companies and customer-support, software-as-a-service agencies can translate large numbers of documents in their FAQ knowledge base. This helps decrease call volume (the highest operational cost of customer support) and increase customer satisfaction scores because users can find their answers faster.
How do you leverage the platform to conduct “sentiment analysis” of user-generated content?
The number of available media (social media, review sites, blogs, support forums) that are part of the user experience is growing every day, and companies are receiving unstructured commentary across these platforms in many different languages. Developers using a combination of SYSTRAN.io’s modules can mine that content, in any language, for positive or negative comments, then categorize those comments and generate responses in the language of the user. Imagine 50,000 comments, where 20,000 rank negative, but 500 are extremely negative and defacing. Those are the ones you want to reply to first.
For example, with the Olympics coming up, imagine a brand is sponsoring an athlete and he gets caught the night before the big race for using performance-enhancing drugs or for cheating on his wife – it hits social media fast. How do you respond if you don’t know about it because the comments are in multiple languages? Or, on the opposite end of the spectrum, imagine many fans see an athlete wearing a particular shoe or clothing item and they want to know where they can get it – and they are asking on Twitter. Right now, there are many eCommerce sites and marketing agencies that are “listening” for those tweets in multiple languages and selling to customers online. SYSTRAN.io can make it easier for developers to make apps that listen in multiple languages and then respond in those same languages.
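The "reply to the worst comments first" triage described in this answer can be sketched in a few lines of Python. This is only an illustration, not SYSTRAN.io's API: the `Comment` type, the `sentiment` score in the range -1.0 to 1.0, and both thresholds are hypothetical, and the scores are assumed to come from whatever sentiment module you use.

```python
from dataclasses import dataclass


@dataclass
class Comment:
    text: str
    language: str     # e.g. "fr", "ja"; assumed to be detected upstream
    sentiment: float  # hypothetical score in [-1.0, 1.0]; lower means more negative


def triage(comments, negative_below=-0.2, urgent_below=-0.8):
    """Split comments into (urgent, negative) using hypothetical thresholds.

    `negative` holds every comment scoring below `negative_below`.
    `urgent` is the subset scoring below `urgent_below`, most negative first,
    so the reply queue starts with the most damaging comments.
    """
    negative = [c for c in comments if c.sentiment < negative_below]
    urgent = sorted(
        (c for c in negative if c.sentiment < urgent_below),
        key=lambda c: c.sentiment,
    )
    return urgent, negative
```

Because each comment keeps its language tag, the urgent queue can be handed to a translation step so that each reply goes out in the language of the person who wrote the comment.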
Explain “anonymization” as a feature of your service
Because of laws such as the safe harbor act, law firms, financial publishers, and many other multi-national firms are required to remove “personal information” such as names, addresses, and social security numbers from any information they send overseas to their counterparts at another office. In this scenario, companies need to remove this information, or “anonymize” it, from the large data set. They can then send a different packet or code with the personal information, so their teammates can receive and assemble the data safely.
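To make the "remove it, then send it separately" idea concrete, here is a minimal Python sketch. It is not SYSTRAN's implementation: it handles only U.S. social security numbers written as `123-45-6789`, and the placeholder format `[ID_n]` and the function names are hypothetical. Names and addresses would need real entity recognition rather than a regular expression.

```python
import re

# Assumption: SSNs appear in the common 3-2-4 digit form with hyphens.
SSN_PATTERN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")


def anonymize(text):
    """Replace each SSN with a placeholder; return (redacted text, mapping).

    The mapping is the "different packet" from the answer above: it travels
    separately from the redacted text.
    """
    mapping = {}

    def _swap(match):
        token = f"[ID_{len(mapping) + 1}]"
        mapping[token] = match.group(0)
        return token

    return SSN_PATTERN.sub(_swap, text), mapping


def restore(redacted, mapping):
    """Reassemble the original text once both packets have arrived."""
    for token, value in mapping.items():
        redacted = redacted.replace(token, value)
    return redacted
```

The redacted text is what would go to the translation engine or the overseas office; only someone holding the mapping can put the personal details back.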
How is SYSTRAN.io different than SYSTRAN’s Enterprise platform?
SYSTRAN.io is based on the same language translation and NLP technology that powers SYSTRAN’s enterprise offering used by Symantec, Cisco, Airbus, Ford, Toyota, BNP Paribas, Daimler, Barclays, defense and security organizations such as the U.S. intelligence community, NATO, Interpol and language service providers. It is equally robust, but the security responsibility falls on the developer of the particular application for anything beyond what is already built-in for the SYSTRAN.io aspect. The enterprise server, on the other hand, offers increased security as it can be installed behind an organization’s firewall. Also, the enterprise server offers 130 language pairs.
How does SYSTRAN.io compete against existing web service offerings from competitors?
We don’t know of any pure language technology companies that are offering free usage of multilingual development APIs to developers, do you? We’ve seen technology companies attempt to enter the language translation technology space but they do not have the content necessary to accomplish viable translations. Language translation technology is easy to talk about but extremely difficult to accomplish.
For SYSTRAN, this has been our 100 percent focus for nearly half a century. Now we are opening those decades of language translation content to developers. These databases have been contributed from linguistic and intelligence knowledge workers who have compiled learnings and optimizations from trillions of translations served dating back to our first client – the US Air Force in 1968 during the Cold War – to today. Our translation databases are deep and robust.
I believe you are offering a million characters of translation for free per month – is that correct? How long will this offer last?
That is correct. Once you sign up you have a million free characters (plus free usage levels for the other APIs) per month, every month; we want to encourage people to use our tools and not burden them with a cost at the development stage. The end date is open for now.