AI researchers are trying to combat how AI can be used to lie and deceive

A 20th century horse has become an apt example of AI’s naivety.
Image: Karl Krall, Denkende Tiere, Leipzig 1912, Tafel 2

Among researchers studying how AI can be used to lie and manipulate the world, there’s a feeling that 2017 has been the calm before the storm.

The past few years have brought example after example from research labs of how AI can generate videos of politicians saying literally anything, or potentially trick self-driving cars into speeding past stop signs. But nobody, to the field’s knowledge, has actually used the technology for malicious purposes.

In an effort to get in front of this perceived threat, hundreds of AI researchers will gather today (Dec. 8) at Neural Information Processing Systems (NIPS), AI’s largest and most influential conference, to acknowledge the potential deceptive powers of artificial intelligence, and discuss countermeasures against them.

During a workshop on machine deception, attendees will focus on four main topics:

  • Synthetic media, where AI is used to fake video footage or audio clips of someone speaking
  • Fooling the machine, where adversarial examples are used to trick AI into seeing something that isn’t there
  • Deceptive agents, AI-powered bots meant to sow disinformation and propaganda
  • Policy and ethics, or how to advise regulators and shape ethics for pursuing this kind of research
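The "fooling the machine" idea can be sketched in a few lines. Below is a minimal, hypothetical illustration of the fast gradient sign method (the attack popularized by Goodfellow and colleagues) against a toy linear classifier; the weights, input, and perturbation budget are all made up for the example, not drawn from any real system:

```python
import numpy as np

# Toy "classifier": sigmoid(w . x + b). Everything here is hypothetical.
rng = np.random.default_rng(0)
w = rng.normal(size=100)   # model weights (made up for illustration)
b = 0.0
x = rng.normal(size=100)   # a "clean" input the model scores

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def predict(x):
    # Probability the model assigns to class "1"
    return sigmoid(w @ x + b)

# Fast gradient sign method: nudge every input feature a tiny step in
# the direction that most changes the score. For this linear model the
# gradient of the score with respect to x is just w, so we step against
# sign(w) to push the prediction toward class "0".
epsilon = 0.25                      # perturbation budget (hypothetical)
x_adv = x - epsilon * np.sign(w)

print(predict(x), predict(x_adv))   # the adversarial score is lower
```

The point the workshop organizers stress is that the perturbation is tiny per feature, yet it reliably moves the model's output; on image classifiers the same trick changes pixels imperceptibly while flipping the predicted label.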

Tim Hwang, director of ethics and governance at the Reid Hoffman-backed AI Initiative and co-organizer of the workshop, says he realized this area needed more attention after a 2016 research project called Face2Face showed that AI could be used to realistically imitate politicians like Donald Trump and Vladimir Putin.

“I was really surprised, and it got me thinking… when are these techniques going to be used, and what do we think their impact will be?” Hwang said. “Especially if you think the inputs to do machine learning are getting lower and lower over time.”

Hwang notes that for all the insidious AI demonstrated in academia and industry, we’ve seen little of it in the real world: Most digitally altered images are created manually, by someone skilled in software like Photoshop. But just as the debate over fake news zeroed in on manual disinformation, Hwang expects another debate to emerge over how to thwart disinformation powered by AI. The question will be whether larger groups of people will be able to tell reality from fiction, or whether technological authentication of media will become completely necessary to trust anything online.


Researchers have compared modern AI to Clever Hans, a performing horse in the early 1900s who could allegedly do arithmetic. The horse’s trainer would ask him a simple equation, like 2+2, and Hans would tap his hoof the appropriate number of times. It was later revealed that Hans couldn’t actually do math, but was instead reading his trainer’s face to see when to stop tapping and receive a reward. In other words, Hans was giving correct answers, without understanding why.

As modern AI systems begin to power more services, such as malware detection and automatic insurance quoting, some are beginning to question whether such systems themselves “understand” enough to avoid deception.

“These are the consequences of systems that are trained to exhibit features of human intelligence, but are fundamentally different in terms of how they process information,” says Bryce Goodman, a co-organizer of the NIPS workshop. “We’re trying to show what hacks are possible and make it public.”

Goodman says tech companies are incentivized to eke out small accuracy gains in research papers, and then bring their flagship, state-of-the-art algorithms into production doing critical tasks, like recognizing abusive imagery on a social media platform. But such an algorithm has typically been optimized only for accuracy, not for other factors, like its security against being tricked.

Many technology companies also share such code with the rest of the world by releasing it online under open-source licenses, meaning anyone can modify and reuse the algorithms. If the system’s security has not been tested, that raises its own concerns.

“Open source has allowed people to make huge progress, but one of the challenges is that the vulnerabilities in those networks get passed around,” Goodman says.

Creating metrics that indicate how resistant algorithms are to malicious attacks could be a way of building a community of AI engineers who take security into consideration, Goodman says, but no such metrics yet exist. He hopes that the workshop’s speakers and co-organizers—among them big names like Google’s Ian Goodfellow, who has pioneered much of the work in machine-learning security—can push the field toward prioritizing such metrics, and with them, security.

“These are the who’s who of deep learning, period,” Goodman says. “I think it’s a topic that’s resonating really deeply. What we’re dealing with is an opportunity to get a better insight into how algorithms are working, and therefore improving overall. By breaking something, you come to understand it much better than putting it on a pedestal or waiting for it to fail.”