We’re thinking about the Turing Test all wrong

In 1950, five years before computer scientist John McCarthy would coin the term “artificial intelligence,” mathematician Alan Turing famously asked: Can machines think?

To answer, Turing devised a simple test known as the “Imitation Game”: a machine passes as “intelligent” if, during a text-based chat, it can fool us into believing it’s human. Since then, his eponymous Turing Test has inspired countless competitions, fierce philosophical debates, media frenzies, and epic sci-fi plots from Westworld to Ex Machina to Her, not to mention copious criticism from academia.

But Turing Test detractors who believe that “winning” the Imitation Game has “little practical significance for artificial intelligence” are missing the finer point contained in Turing’s premise:

That the fundamental defining feature of human intelligence is language.

Turing and Aristotle were bedfellows in this regard. As observed in the Greek philosopher’s tome The Politics, “Man alone of the animals possesses speech.” Words, in other words, are key to what makes us special—and our intellect unique—relative to other species.

… and to software.

Today, as chatbots proliferate and the tech giants’ digital assistants continue to infiltrate our smartphones, vehicles, and homes, we are increasingly chatting with machines. They are, however, still struggling to sensibly talk back. (Just ask Siri or that annoying customer-service chatbot apologizing for its repeated failure to understand you.)

I therefore agree with Turing. Show me a computer that can truly converse and I will stipulate it has achieved humanlike intelligence—even Artificial General Intelligence (aka, AGI), which is today’s buzzword for machines that can learn and, per Turing, think like we do.

But we’re still a while off. Despite rampant claims to the contrary, no computer has ever passed Turing’s Test. In fact, so-called “AI” has a serious hype problem:

Watson, it turns out, cannot cure cancer.
Google Duplex, while impressive at the highly constrained task of reservation booking, is nowhere near worthy of hysterical headlines like “Google Duplex beat the Turing Test: Are we doomed?”
Sophia, a mash-up of humanoid hardware and open-source software, has been decried as a parlor trick, a socio-political minefield, and “complete b——t;” she is not, as her creator boldly claimed, “basically alive,” nor anywhere close to sentient. (This didn’t stop the company from raising $36 million in 60 seconds to build AGI).
And Alexa, though adept at setting timers and re-ordering toilet paper, isn’t much of a cocktail conversationalist in spite of a multimillion dollar incentive from Amazon.

The industry itself suffers from “silver bulletism,” which is the tendency to latch onto the latest methodologies as be-all end-all breakthroughs. Even John McCarthy, father of the field and inventor of the foundational AI programming language Lisp, was forced to admit that “AI is harder than we thought” after asserting that language and other still-elusive problems could be solved in a mere summer with the right group of scientists.

Today’s silver bullets are “Big Data” and a branch of machine learning called “deep learning,” which trains artificial neural networks to learn by example without being pre-programmed with specific rules. These techniques have yielded significant progress on discrete AI domains like computer vision, image recognition, machine translation, and speech recognition and synthesis.

Yet the language problem remains unsolved.

In fact, machine learning not only hasn’t worked for natural language understanding and generation, but, when unsupervised, actually makes conversational systems worse. Chatbots that learn from chatters unchecked, particularly on the troll-infested internet, quickly veer into “Hitler-loving sex robot” territory, à la Microsoft’s Tay.

The alternative entails laboriously scripting responses (“rules”) to cover every possible anticipated input. This is how most chatbots function today, not through machine learning alone. Companies, from startups to the tech giants, quietly employ writers known as “conversation designers” to fill in where the computer fails. Their engineers hope that machine learning will “solve” conversation and obviate the need for human-authored repartee in a few decades. But creatives—or others more skilled at the art of conversation than computers and their programmers—may ultimately prove better suited to breaking the banter barrier. Just as artificial neural nets resulted from the cross-functional application of math, physics, and neuroscience, AI researchers should collaborate with other language-focused disciplines like the humanities and cognitive sciences.
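To make the scripted approach concrete, here is a minimal sketch of a rule-based responder in Python. It is an illustration only, not any company's actual system: the patterns, the canned replies, and the names `RULES`, `FALLBACK`, and `reply` are all hypothetical.

```python
import re

# Hypothetical hand-authored rules: each pairs a pattern with a scripted reply.
# A conversation designer would write and maintain entries like these.
RULES = [
    (re.compile(r"\b(hi|hello|hey)\b", re.IGNORECASE),
     "Hello! How can I help you today?"),
    (re.compile(r"\b(refund|money back)\b", re.IGNORECASE),
     "I can help with refunds. What is your order number?"),
    (re.compile(r"\b(hours|open)\b", re.IGNORECASE),
     "We are open 9am to 5pm, Monday through Friday."),
]

# Used whenever no rule anticipates the input.
FALLBACK = "Sorry, I didn't understand that. Could you rephrase?"


def reply(message: str) -> str:
    """Return the first scripted reply whose pattern matches, else the fallback."""
    for pattern, response in RULES:
        if pattern.search(message):
            return response
    return FALLBACK


if __name__ == "__main__":
    for text in ["Hello there", "I want my money back", "What is the meaning of life?"]:
        print(f"> {text}\n{reply(text)}")
```

Anything the authors did not anticipate falls through to the apology, which is exactly the failure mode of the customer-service chatbot described earlier.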

And even then, we’re only at the start of our understanding. Ironically, like much of the human mind, how language works remains a mystery to neuroscientists. However, we do know speech is acquired socially and communication has nonverbal components. This raises the question of whether an AI would need to see us, read facial expressions, and emote (perhaps with a face of its own) in order to converse in a way that would pass the Turing Test.

That said, hopefully we won’t need to understand how we understand language before we can teach machines. (After all, the answer to artificial flight was aerodynamics, not copying birds.) But the industry does need to admit that we not only don’t know the answers, but may be approaching “human-centered” conversational AI with the wrong people, not the wrong algorithms.

Historically, humans have been forced to adopt communication protocols that computers can comprehend, from the command line to keyboard to search bar. A breakthrough in computer conversation would mean that, finally, technology is adapting to serve us by speaking our language, and not the other way around. It might also mean the singularity is nigh, when our minds could live forever by melding with machines.

But until then, as Shakespeare knew, language is our only immortality. Or as Horace wrote, Exegi monumentum aere perennius. (Words are “the monuments more lasting than bronze.”) An AI trained on a corpus of The Complete Works may be able to generate Shakespeare-ish sounding sonnets, but only when a computer can comprehend that Sonnets 18, 19, 55, 65, 81, 107, and 123 share a common theme will it truly pass Turing’s Test.

“Deep-speare”

python sonnet_gen.py -m trained_model/ -d 2

Temperature = 0.6 – 0.8

01 [0.44] shall i behold him in his cloudy state

02 [0.00] for just but tempteth me to stop and pray

03 [0.00] a cry: if it will drag me, find no way

04 [0.40] from pardon to him, who will stand and wait
