Quoc Le is no stranger to the indignity of translation. Whenever the Google research scientist in Mountain View, California, visits his native Vietnam, he laughs with his parents over mistranslations in the very system he is helping shape, the 10-year-old online service Google Translate. Most errors are tiny—not important enough to remember. But together, they tell a larger story: “Translation is not a solved problem,” he says. Right now, it’s less about finding the perfect one-to-one translation and more about “avoiding embarrassment.”
But that may all change soon. Today, Quoc and his colleagues at Google rolled out a new translation system that uses massive amounts of data and increased processing power to build more accurate translations. The new system, a deep learning model known as neural machine translation, effectively trains itself—and reduces translation errors by up to 87%. “This … demonstrates like never before the power of neural machine translation,” says Yoshua Bengio, a computer scientist at the University of Montreal in Canada, who helped invent one of the critical components of the new system several years ago, but who was not involved in the current work.
Neural machine translation has been a latecomer to the game of deep learning, a method of making predictions about everything from effective marketing pitches to potential drug candidates by feeding large sets of data through layers of interconnected processors. The processors—modeled after the brain’s networks of neurons—are first trained by humans on actual translations and then let loose on new sets of data. Well-calibrated processors can pick up on subtle cues in the data, transform them, and send them to the next level for further processing and translation. Deep learning is what allows Apple’s “personal assistant,” Siri, to pick up on (most) human speech, and it’s what lets Facebook’s image recognition software use small visual cues to pick out things like individual faces.
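The layered processing described above can be sketched in a few lines. This is a minimal toy illustration, not the architecture of any real system: each “layer” combines its inputs with weights and applies a nonlinearity before handing the result to the next layer. The weights here are random stand-ins rather than trained values.

```python
import numpy as np

rng = np.random.default_rng(0)

def layer(x, weights, bias):
    """One dense layer: a weighted sum of inputs followed by a nonlinearity."""
    return np.tanh(x @ weights + bias)

# Three stacked layers transforming a 4-dimensional input into a 2-dimensional output.
x = rng.standard_normal(4)
w1, b1 = rng.standard_normal((4, 8)), np.zeros(8)
w2, b2 = rng.standard_normal((8, 8)), np.zeros(8)
w3, b3 = rng.standard_normal((8, 2)), np.zeros(2)

hidden1 = layer(x, w1, b1)       # first level of processing
hidden2 = layer(hidden1, w2, b2) # cues transformed, sent to the next level
output = layer(hidden2, w3, b3)  # final 2-dimensional result
print(output.shape)
```

In a real system, “training by humans on actual translations” means adjusting those weight matrices so the final output matches known-good answers; here they stay random.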
But many people, Quoc says, think that translating language requires deeper cognitive abilities. “For instance, it takes us a fraction of a second to recognize an image or understand … audio. But it takes more than 1 second even for me to translate an English sentence to Chinese.”
So for years, most automated translators were stuck with a system known as phrase-based translation. Like neural machine translation, phrase-based translation needs a large set of training data before it’s ready to roll. Once it’s up and running, the system divides sentences into phrases, which it translates individually. Then, the whole string of phrases has to go through another layer of processing to ensure the correct word order. Quality is variable. “There’s something wrong about it,” Quoc says. He finds that the grammar is often incorrect or words are mistranslated—sometimes wildly. “It makes people laugh.”
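The phrase-based pipeline Quoc describes—segment into phrases, translate each independently, then fix word order—can be caricatured in a toy example. The phrase table and sentences below are invented for illustration and bear no relation to Google’s actual data; real systems learn millions of phrase pairs with probabilities.

```python
# Invented two-entry "phrase table" mapping English phrases to French ones.
PHRASE_TABLE = {
    "the red car": "la voiture rouge",
    "is fast": "est rapide",
}

def phrase_based_translate(sentence, table):
    """Toy phrase-based translation: segment, translate phrases, join."""
    words = sentence.split()
    phrases, i = [], 0
    # 1. Greedily segment the sentence into the longest known phrases.
    while i < len(words):
        for j in range(len(words), i, -1):
            chunk = " ".join(words[i:j])
            if chunk in table:
                phrases.append(chunk)
                i = j
                break
        else:
            phrases.append(words[i])  # unknown word passes through untranslated
            i += 1
    # 2. Translate each phrase independently.
    translated = [table.get(p, p) for p in phrases]
    # 3. A real system would run a separate reordering pass here; in this toy,
    #    the phrase table already encodes order ("red car" -> "voiture rouge").
    return " ".join(translated)

print(phrase_based_translate("the red car is fast", PHRASE_TABLE))
# la voiture rouge est rapide
```

Step 3 is where the errors Quoc laughs about tend to creep in: because each phrase is translated in isolation, grammar across phrase boundaries can come out wrong.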
The new method, reported today on the preprint server arXiv, uses a total of 16 processors to first transform words into a value known as a vector. What is a vector? “We don’t know exactly,” Quoc says. But it represents how related one word is to every other word in the vast dictionary of training materials (2.5 billion sentence pairs for English and French; 500 million for English and Chinese). For example, “dog” is more closely related to “cat” than to “car,” and the name “Barack Obama” is more closely related to “Hillary Clinton” than to the name of the country “Vietnam.” The system uses vectors from the input language to come up with a list of possible translations that are ranked based on their probability of occurrence.
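The “dog is closer to cat than to car” idea can be made concrete with a toy example. The three-number vectors below are hand-picked for illustration—in a trained system they emerge from the billions of sentence pairs—and cosine similarity is one standard way to measure how related two vectors are.

```python
import numpy as np

# Hand-made illustrative vectors, NOT learned values.
vectors = {
    "dog": np.array([0.9, 0.8, 0.1]),
    "cat": np.array([0.85, 0.75, 0.2]),
    "car": np.array([0.1, 0.2, 0.95]),
}

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors: 1.0 means maximally related."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

sim_dog_cat = cosine_similarity(vectors["dog"], vectors["cat"])
sim_dog_car = cosine_similarity(vectors["dog"], vectors["car"])
print(sim_dog_cat > sim_dog_car)  # True: "dog" is closer to "cat" than to "car"
```

A translation system exploits the same geometry: words and phrases that behave alike end up near each other, which is what lets the model rank candidate translations by probability.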
Other features include a system of cross-checks that further increases accuracy and a special set of computations that speeds up processing time.
When compared with Google’s previous system, the neural machine translation system scores well with human reviewers. It was 58% more accurate at translating English into Chinese, and 87% more accurate at translating English into Spanish (see table, below). As a result, the company is planning to slowly replace the system underlying all of its translation work—one language at a time. Today, Google Translate will start using the system for Chinese to English translations, in part because it’s a notoriously difficult language pair, Quoc says. The other reason? Many of the researchers on his team are Chinese.
“It is quite amusing to see how fast a new research development is being transferred to industry and adopted in a product,” says Kyunghyun Cho, a computer and data scientist at New York University in New York City, who was not part of the new work. “This general trend of fast transfer from research to production is one of the core strengths of deep learning, and perhaps the reason why industry is heavily investing in it.”
But several researchers commented that the new system doesn’t represent a scientific breakthrough so much as an engineering one. “A lot of the inspiration for this paper came from [the fields of] speech and computer vision,” says Thang Luong, a graduate student at Stanford University in Palo Alto, California, who has built neural machine translation systems for Google in the past. “It’s a synthesis of many years of work.” Neural networks have been used for machine translation since at least 2010, and other features of the system have been employed in other models in the last several years. But this is the first time that any one group has deployed all of these advances together.
That means more than just corporate kudos for Quoc. “I can’t wait to see how happy my parents are going to be.”
*Correction, 30 September, 10:25 a.m.: This article has been updated to reflect that advances in the field of computer vision helped lead to the new system.