Six questions to ask yourself when reading about AI

You’re being fed artificial intelligence about artificial intelligence.

Hardly a week goes by without some breathless bit of AI news touting a “major” new discovery or warning us we are about to lose our jobs to the newest breed of smart machines.

Rest easy. As two scientists who have spent our careers studying AI, we can tell you that a large fraction of what’s reported is overhyped.

Consider this pair of headlines from last year describing an alleged breakthrough in machine reading: “Robots Can Now Read Better than Humans, Putting Millions of Jobs at Risk” and “Computers Are Getting Better than Humans at Reading.” The first, from Newsweek, is a more egregious exaggeration than the second, from CNN, but both wildly oversell minor progress.

To begin with, there were no actual robots involved, and no actual jobs were remotely at risk. All that really happened was that Microsoft made a tiny bit of progress and put out a press release saying that “AI…can read a document and answer questions about it as well as a person.”

That sounded much more revolutionary than it really was. Dig deeper, and you would discover that the AI in question was given one of the easiest reading tests you could imagine—one in which all of the answers were directly spelled out in the text. The test was about highlighting relevant words, not comprehending text.

Suppose, for example, that we hand you a piece of paper with this short passage:

Two children, Chloe and Alexander, went for a walk. They both saw a dog and a tree. Alexander also saw a cat and pointed it out to Chloe. She went to pet the cat.

The Microsoft system was built to answer questions like “Who went for a walk?” in which the answer (“Chloe and Alexander”) is directly spelled out in the text. But if you were to ask it a simple question like “Did Chloe see the cat?” (which she must have, because she went to pet it) or “Was Chloe frightened by the cat?” (which she must not have been, because she went to pet it), it would not have been able to find the answers, as they weren’t spelled out in the text. Inferring what isn’t said is at the heart of reading, and it simply wasn’t tested.
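To make the distinction concrete, here is a toy sketch in Python of a "highlighter" that answers a question by returning whichever sentence shares the most words with it. This is our own illustration, not Microsoft's system, which was far more sophisticated; the function names and the word-overlap scoring are assumptions made for the example.

```python
import re

# The passage from the example above.
PASSAGE = (
    "Two children, Chloe and Alexander, went for a walk. "
    "They both saw a dog and a tree. "
    "Alexander also saw a cat and pointed it out to Chloe. "
    "She went to pet the cat."
)

def words(text):
    """Lowercased set of the alphabetic words in text."""
    return set(re.findall(r"[a-z]+", text.lower()))

def sentences(text):
    """Split text into sentences at ., ! or ? followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]

def highlight(question, passage=PASSAGE):
    """Hypothetical 'reader': return the sentence that shares the most
    words with the question. It can only point at existing text."""
    q = words(question)
    return max(sentences(passage), key=lambda s: len(q & words(s)))

print(highlight("Who went for a walk?"))
print(highlight("Was Chloe frightened by the cat?"))
```

For the first question, the highlighted sentence happens to contain the answer. For the second, the program can only hand back another sentence from the passage. It has no way to conclude "no," because that answer appears nowhere in the text.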

Microsoft didn’t make that clear, and neither did Newsweek or CNN.

Practically every time one of the tech titans puts out a press release, we get a reprise of this same phenomenon: minor progress portrayed as revolution. In another example, two years ago Facebook created a proof-of-concept AI program that could read a five-line summary of The Lord of the Rings and answer questions about where people and things ended up. (“Where was the Ring? At Mount Doom.”) The result was a slew of ridiculously overenthusiastic articles, explaining how reading fantasy literature was the key to getting AIs to read, with headlines like Slate’s “Facebook Thinks It Has Found the Secret to Making Bots Less Dumb.” (It hadn’t.)

The consequence of this kind of over-reporting in the media? The public has come to believe that AI is much closer to being solved than it really is. So whenever you hear about a supposed success in AI, here’s a list of six questions you should ask:

  1. Stripping away the rhetoric, what did the AI system actually do here? (Does a “reading system” really read, or does it just highlight relevant bits of text?)
  2. How general is the result? (For example, does an alleged reading task measure all aspects of reading, or just a tiny slice of it? If it was trained on fiction, can it read the news?)
  3. Is there a demo where I can try out my own examples? (If you can’t, you should be worried about how robust the results are.)
  4. If the researchers—or their press people—allege that an AI system is better than humans, then which humans, and how much better? (Was the comparison with college professors, who read for a living, or bored Amazon Mechanical Turk workers getting paid a penny a sentence?)
  5. How far does succeeding at the particular task actually take us toward building genuine AI? (Is it an academic exercise, or something that could be used in the real world?)
  6. How robust is the system? Could it work just as well with other data sets, without massive amounts of retraining? (For example, would a driverless car system that was trained during the day be able to drive at night, or in the snow, or if there was a detour sign not listed on its map?)

AI really is coming, eventually, but it is further away than most people think. To get a realistic picture, take everything you read about dramatic progress in AI with a healthy dose of skepticism, and rejoice in your (for now) uniquely human ability to do so.

This essay is adapted from Rebooting AI, published this month by Pantheon.