In the August issue of Global By Design, in the article Machine Translation: The Next Generation, I introduced statistical machine translation (SMT):
SMT is a data-driven translation technology. Rather than relying on a dictionary of translations and rules, it starts with data in the form of lots and lots of source and target text. The statistical process involves analyzing this data and identifying patterns. By analyzing millions and millions of words, the software gets pretty good at “guessing” how to translate a given text string. “We’re not really translating,” said Language Weaver CEO Bryce Benjamin. “What we’re really doing is a probability forecast.”
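For readers curious what a “probability forecast” might look like in practice, here is a minimal sketch using IBM Model 1, a textbook word-alignment method. It is not Language Weaver's actual system, which the company has not described here. The three-sentence French-English corpus and the function names `train_model1` and `best_translation` are made up for illustration; a real engine trains on millions of words and models whole phrases, not single words.

```python
def train_model1(pairs, iterations=20):
    """Estimate t[(target_word, source_word)] with IBM Model 1 (EM).

    `pairs` is a list of (source_tokens, target_tokens) sentence pairs.
    Toy illustration only; not any vendor's real implementation.
    """
    tgt_vocab = {tw for _, tgt in pairs for tw in tgt}

    # Start with a uniform guess for every word pair that co-occurs.
    t = {}
    for src, tgt in pairs:
        for tw in tgt:
            for sw in src:
                t[(tw, sw)] = 1.0 / len(tgt_vocab)

    for _ in range(iterations):
        count = {}  # expected number of times sw translates to tw
        total = {}  # expected number of times sw translates to anything
        for src, tgt in pairs:
            for tw in tgt:
                # Share the credit for tw among the source words present.
                norm = sum(t[(tw, sw)] for sw in src)
                for sw in src:
                    frac = t[(tw, sw)] / norm
                    count[(tw, sw)] = count.get((tw, sw), 0.0) + frac
                    total[sw] = total.get(sw, 0.0) + frac
        # Re-estimate the probabilities from the expected counts.
        for (tw, sw), c in count.items():
            t[(tw, sw)] = c / total[sw]
    return t


def best_translation(t, source_word):
    """Return the target word with the highest estimated probability."""
    candidates = {tw: p for (tw, sw), p in t.items() if sw == source_word}
    return max(candidates, key=candidates.get)


# Hypothetical three-sentence corpus (French source, English target).
corpus = [
    (["la", "maison"], ["the", "house"]),
    (["la", "fleur"], ["the", "flower"]),
    (["une", "fleur"], ["a", "flower"]),
]

model = train_model1(corpus)
for word in ["la", "maison", "fleur", "une"]:
    print(word, "->", best_translation(model, word))
```

No dictionary is supplied anywhere: the word pairings emerge only from which words keep appearing together across sentence pairs, which is the “guessing” described above.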
Language Weaver has been one of the pioneers in SMT but has so far focused only on the government sector, primarily serving intelligence agencies.
This week, Language Weaver launched the “Customizer,” targeted at large enterprises and government bodies. What sets this tool apart is that a company can quickly adapt it to its specific industry, and the software will continue to improve in quality as more translations are processed.
According to Bryce Benjamin, “The Customizer allows each customer to create, within just a few hours, a unique set of translation engines that cannot be duplicated by anybody else without access to the same data resources.”
However, the Customizer is not for everyone just yet. For starters, the software currently only supports the following language pairs:
-> French <-> English
-> Spanish <-> English
-> Arabic <-> English
-> Chinese <-> English
-> Hindi -> English
-> Somali -> English
The other two obstacles are pricing and the minimum database of translated content required to get started. For a large enterprise, these obstacles are easily overcome, but small businesses will need to wait until a low-end product is launched, or until Google launches its free SMT product, possibly as early as 2006.
I’m glad to see Language Weaver going after enterprises and I think they will find takers, though a good deal of education will be required. Machine translation is still widely viewed as not-ready-for-prime-time technology. I do believe that SMT, over time, will be a very positive development for Web globalization, helping companies publish a great deal more content for local markets, increasing sales, and better serving customers.
I’ll have more to say on Language Weaver in the November issue of Global By Design, due out later this week.
PS: Here’s another interesting article on the next wave of machine translation.