Updated Sept. 30 at 2pm in Hong Kong with comments from Google.
Google has recently overhauled Google Translate. The new AI-powered translator, “Google Neural Machine Translation,” can cut down errors by 80% compared with the current algorithm, making it nearly identical to human translators, the company said this week. The new method is currently only available for Chinese to English, but judging from the examples Google gave, it appears to do well at translating standard journalistic writing.
Here at Quartz, where language is an obsession, we decided to take matters into our own hands. As we did with Skype’s instant translator, we put Google’s new translator through a realistic Chinese stress test. The highest score available is five light bulbs:
First, we picked up a piece of “daily conversation” intended for beginner English students from ESLfast, a free online language website. We translated the dialogue to Mandarin ourselves, and then put it through Google’s new translator to check how well it converted it back to the original English.
Here’s the original (and yes, it is sexist):
A: I’ve got a date for you. B: Oh, really? A: Are you interested? B: Maybe. What is she like? A: She’s got a great personality. B: Uh-oh. That means that she’s fat and ugly. A: She’s cute. B: OK, so she’s not ugly; she’s just fat. A: She weighs 98 pounds. B: OK, she’s not fat. So what’s the problem with her? A: Who said there is a problem with her? B: The problem is she has no problems—she’s too good for me!
And here’s what Google came up with:
A: I arranged an appointment for you. B: Oh, really? A: Interested? B: Maybe, how about her? A: Her character is super good. B: God, that means she is fat and ugly. A: She is very cute. B: OK, that means she’s not ugly, but fat. A: She’s 98 pounds. B: So it seems she is not fat, that she has any problem? A: Who says she has a problem? B: The problem is that she has no problem, I am not worthy of her.
It did quite well. The only failure is changing the meaning of “What is she like?” to “How about her?” We translated “What is she like” into “她人怎么样” in Mandarin, which in word-for-word translation says “How is she?” but means more “How is her personality?” Because Google did a word-for-word translation to put it back into English, it came up with “How about her?” which doesn’t convey its original meaning anymore.
Stage 1 score: 💡💡💡💡
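For readers who want to try a similar round-trip check themselves, here is a rough sketch in Python. It is not how Quartz scored the translations (that was done by hand); the function name `word_similarity` and the word-overlap measure are assumptions made for illustration. Word overlap is a crude signal that cannot judge meaning, so a low score only flags sentences worth a closer human look.

```python
import difflib
import re


def word_similarity(original: str, back_translated: str) -> float:
    """Return a 0-to-1 ratio of how closely the word sequences line up.

    Hypothetical helper for illustration only: it compares words,
    not meaning, so it cannot replace a human reader.
    """
    # Lowercase and keep only word characters so punctuation is ignored.
    a = re.findall(r"\w+", original.lower())
    b = re.findall(r"\w+", back_translated.lower())
    return difflib.SequenceMatcher(None, a, b).ratio()


# One line from the dialogue above and its round-trip version.
original = "Who said there is a problem with her?"
round_trip = "Who says she has a problem?"

score = word_similarity(original, round_trip)
print(f"word overlap: {score:.2f}")
```

A pair like the one above scores well below 1.0 even though the meaning survived, which is a reminder of why the light-bulb ratings in this test rely on human judgment.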
Then, we took it to the next level, with some more complicated material likely to be used by students.
Here’s an excerpt from a study from the US National Academy of Sciences about the brain size and cognitive power of birds, which might be useful to a zoology student:
Birds are remarkably intelligent, although their brains are small. Corvids and some parrots are capable of cognitive feats comparable to those of great apes. How do birds achieve impressive cognitive prowess with walnut-sized brains? We investigated the cellular composition of the brains of 28 avian species, uncovering a straightforward solution to the puzzle: brains of songbirds and parrots contain very large numbers of neurons, at neuronal densities considerably exceeding those found in mammals. Because these “extra” neurons are predominantly located in the forebrain, large parrots and corvids have the same or greater forebrain neuron counts as monkeys with much larger brains. Avian brains thus have the potential to provide much higher “cognitive power” per unit mass than do mammalian brains.
Quartz translated the paragraph into Chinese and then put it back through Google’s translator:
Although the heads of birds are small, they are very clever. Some crows and parrots can even be cognitively comparable to apes. Why do these birds walnut-sized brain, can have such amazing cognitive ability? We have 28 species of birds brain cell nerve composition investigation, solved the answer to this answer. The answer is straightforward: both the songbird and the parrot have large numbers of neurons in the brain that are denser than the known mammalian neuron density. Coupled with the large number of neurons concentrated in the forebrain, large parrots and songbirds have the same number, or more neurons, than monkeys with much larger heads than these birds. Thus birds have a higher brain specific energy than mammals
Here’s how Google translated the definition of “bubble economy” from a Chinese-language page of Wikipedia:
Bubble economy, refers to the value of assets beyond the real economy can bear the degree, it is easy to lose the sustainable development of the macroeconomic status.
It is arguably still kind of intelligible. But Google is having a problem with sentence structure. Because word order varies a lot in Chinese and English, Google failed to translate the Chinese definition in the right order to make it clear in English. It should be “the asset values that are beyond the degree the real economy can bear” and “a macroeconomic status that is easy to lose sustainability.” For the latter phrase to make even more sense in English, it should say something like “a macroeconomic status that is easily unsustainable.”
Run-on sentences are OK in Chinese, and the subject of a sentence can usually be omitted if it is mentioned in previous ones. Here’s how Google translated a paragraph about Zika’s origin, again from the Chinese-language Wikipedia. As the sentences run on, it becomes hard for Google to capture the logic behind them:
The virus first in 1947 in Uganda Zika forest of macaque isolated from the body, hence the name. According to genotypes are divided into Asian and African type two types, in Central Africa, Southeast Asia and India have found records. In the past, only a small number of human cases were reported, until 2007, in the Federated States of Micronesia, Yabu outbreak of cluster epidemic, only this disease have more awareness.
Stage 2 score: 💡💡
With its linguistic, literary, aesthetic, and cultural differences, poetry may be the ultimate challenge for any human translator. So how about Google’s AI? We tested it with “Thoughts on a still night,” a poem by Li Bai (born in 701 AD), a poet of the Tang Dynasty. It’s simple, at five characters per line. Here’s a decent English translation:
Before my bed, the moon is shining bright,
I think that it is frost upon the ground.
I raise my head and look at the bright moon,
I lower my head and think of home.
And here’s Google’s version:
Moonlight before bed,
Suspected to be frost on the ground.
Raised his head and looked at the moon,
Bow thinking hometown.
“Suspected to be frost on the ground” means the poem loses all its aesthetic feeling. You can tell Google also struggled to find a subject for the third line, so it became “raised his head.”
Stage 3 score: 💡💡
Google can translate some basic Chinese swear words correctly into English, like “f—” and “f— your mother.” But it failed to capture the more colloquial phrase 我操, which literally means “I f—,” but in use is equivalent to “f— you” in English. Here, Google offered up some scrubbed translations instead, which could be dangerous for anyone who mistakenly uses them.
A euphemism for “I f—” in Chinese was also sanitized:
Bonus round score: 💡
A Google spokesman, who responded to questions after the story was published, said, “While it was nice of Quartz to say it was ‘nearly identical to human translators,’ it’s not a comparison we’d make or made. What we would say is that this neural network incredibly rapidly made sense of millions of Chinese sentences that the old system had garbled.” No, the spokesman said, it does not get poetry, but it had “suddenly made Google Translate get millions of sentences right that the previous system garbled.”
An earlier version of this article mistakenly said that Google’s new translator also translated English to Chinese. It only works on Chinese to English right now.