By Edward Nawotka
The New York Times covers Google’s translation technology, which can now translate among 52 languages. Taking advantage of its immense computing power, the company fed its computers a few hundred billion English words — compared with an average of one billion words for its rivals — and has far surpassed most other attempts at automated translation.
From the piece:
Creating a translation machine has long been seen as one of the toughest challenges in artificial intelligence. For decades, computer scientists tried using a rules-based approach — teaching the computer the linguistic rules of two languages and giving it the necessary dictionaries.
But in the mid-1990s, researchers began favoring a so-called statistical approach. They found that if they fed the computer thousands or millions of passages and their human-generated translations, it could learn to make accurate guesses about how to translate new texts.
Like its rivals in the field, most notably Microsoft and I.B.M., Google has fed its translation engine with transcripts of United Nations proceedings, which are translated by humans into six languages, and those of the European Parliament, which are translated into 23. This raw material is used to train systems for the most common languages.
But Google has scoured the text of the Web, as well as data from its book scanning project and other sources, to move beyond those languages. For more obscure languages, it has released a “tool kit” that helps users with translations and then adds those texts to its database.
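The statistical approach the Times describes — learning translations from human-translated passages rather than from hand-written rules — can be illustrated with a deliberately tiny sketch. Real systems use far more sophisticated alignment models; this toy version, with invented data, simply counts how often words co-occur across aligned sentence pairs and picks the most frequent pairing:

```python
from collections import Counter, defaultdict

# Toy parallel corpus: (English, Spanish) sentence pairs.
# All data here is invented for illustration.
parallel = [
    ("the house", "la casa"),
    ("the book", "el libro"),
    ("the green house", "la casa verde"),
    ("a green book", "un libro verde"),
]

# Count how often each source word appears alongside each
# target word in the aligned sentence pairs.
cooccurrence = defaultdict(Counter)
for english, spanish in parallel:
    for e_word in english.split():
        for s_word in spanish.split():
            cooccurrence[e_word][s_word] += 1

def best_translation(word):
    """Guess a translation: the target word seen most often with `word`."""
    if word not in cooccurrence:
        return word  # unknown words pass through untranslated
    return cooccurrence[word].most_common(1)[0][0]

print(best_translation("green"))  # co-occurs with "verde" in both green sentences
```

With only four sentence pairs the guesses are crude (common words like "the" stay ambiguous), which is exactly why the article stresses the scale of Google's corpus: the statistics only become reliable with millions of aligned passages.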
Smart and scary. Still, one must assume that the human-generated translation is better. But, like Kasparov playing a computer at chess, how long will humans keep the advantage? Could Google-assisted translation increase the number of books published in English, particularly non-fiction, where writing style tends to require less artistry? If Google gets things right, could it eventually put translators out of business?
Read the full story here.