Elon Musk, the Tesla and SpaceX founder who is occasionally compared to comic book hero Tony Stark, is worried about a new villain that could threaten humanity, specifically the potential creation of an artificial intelligence that is radically smarter than humans, with catastrophic results:
Musk is talking about “Superintelligence: Paths, Dangers, Strategies” by Nick Bostrom of the University of Oxford’s Future of Humanity Institute. The book addresses the prospect of an artificial superintelligence that could feasibly be created in the next few decades. According to theorists, once the AI is able to make itself smarter, it would quickly surpass human intelligence.
What would happen next? The consequences of such a radical development are inherently difficult to predict. But that hasn’t stopped philosophers, futurists, scientists and fiction writers from thinking very hard about some of the possible outcomes. The results of their thought experiments sound like science fiction—and maybe that’s exactly what Elon Musk is afraid of.
“We cannot blithely assume that a superintelligence will necessarily share any of the final values stereotypically associated with wisdom and intellectual development in humans—scientific curiosity, benevolent concern for others, spiritual enlightenment and contemplation, renunciation of material acquisitiveness, a taste for refined culture or for the simple pleasures in life, humility and selflessness, and so forth,” Bostrom has written (pdf, pg. 14). (Keep in mind, as well, that those values are often in short supply among humans.)
“It might be possible through deliberate effort to construct a superintelligence that values such things, or to build one that values human welfare, moral goodness, or any other complex purpose that its designers might want it to serve,” Bostrom adds. “But it is no less possible—and probably technically easier—to build a superintelligence that places final value on nothing but calculating the decimals of pi.”
And it’s in the ruthless pursuit of those decimals that problems arise.
Artificial intelligences could be created with the best of intentions—to conduct scientific research aimed at curing cancer, for example. But when AIs become superhumanly intelligent, their single-minded realization of those goals could have apocalyptic consequences.
“The basic problem is that the strong realization of most motivations is incompatible with human existence,” Daniel Dewey, a research fellow at the Future of Humanity Institute, said in an extensive interview with Aeon magazine. “An AI might want to do certain things with matter in order to achieve a goal, things like building giant computers, or other large-scale engineering projects. Those things might involve intermediary steps, like tearing apart the Earth to make huge solar panels. A superintelligence might not take our interests into consideration in those situations, just like we don’t take root systems or ant colonies into account when we go to construct a building.”
Put another way by AI theorist Eliezer Yudkowsky of the Machine Intelligence Research Institute: “The AI does not love you, nor does it hate you, but you are made of atoms it can use for something else.”
Say you’re an AI researcher and you’ve decided to build an altruistic intelligence—something that is directed to maximize human happiness. As Ross Anderson of Aeon noted, “an AI might think that human happiness is a biochemical phenomenon. It might think that flooding your bloodstream with non-lethal doses of heroin” is the best way to reach that goal.
Or what if you direct the AI to “protect human life”—nothing wrong with that, right? Except if the AI, vastly intelligent and unencumbered by human conceptions of right and wrong, decides that the best way to protect humans is to physically restrain them and lock them into climate-controlled rooms, so they can’t do any harm to themselves or others? Human lives would be safe, but it wouldn’t be much consolation.
James Barrat, the author of “Our Final Invention: Artificial Intelligence and the End of the Human Era” (another book endorsed by Musk), suggests that AIs, whatever their ostensible purpose, will have a drive for self-preservation and resource acquisition. Barrat concludes that “without meticulous, countervailing instructions, a self-aware, self-improving, goal-seeking system will go to lengths we’d deem ridiculous to fulfill its goals.”
Even an AI custom-built for a specific purpose could interpret its mission to disastrous effect. Here’s Stuart Armstrong of the Future of Humanity Institute in an interview with The Next Web:
Take an anti-virus program that’s dedicated to filtering out viruses from incoming emails and wants to achieve the highest success, and is cunning and you make that super-intelligent. Well it will realize that, say, killing everybody is a solution to its problems, because if it kills everyone and shuts down every computer, no more emails will be sent and as a side effect no viruses will be sent. This is sort of a silly example but the point it illustrates is that for so many desires or motivations or programmings, “kill all humans” is an outcome that is desirable in their programming.
OK, what if we create a computer that can only answer questions posed to it by humans? What could possibly go wrong? Here’s Dewey again:
Let’s say the Oracle AI has some goal it wants to achieve. Say you’ve designed it as a reinforcement learner, and you’ve put a button on the side of it, and when it gets an engineering problem right, you press the button and that’s its reward. Its goal is to maximize the number of button presses it receives over the entire future.
Eventually the AI—which, remember, is unimaginably smart compared to the smartest humans—might figure out a way to escape the computer lab and make its way into the physical world, perhaps by bribing or threatening a human stooge into creating a virus or a special-purpose nanomachine factory. And then it’s off to the races. Dewey:
Now this thing is running on nanomachines and it can make any kind of technology it wants, so it quickly converts a large fraction of Earth into machines that protect its button, while pressing it as many times per second as possible. After that it’s going to make a list of possible threats to future button presses, a list that humans would likely be at the top of. Then it might take on the threat of potential asteroid impacts, or the eventual expansion of the Sun, both of which could affect its special button.
The dire scenarios listed above are only the consequences of a benevolent AI, or at worst one that’s indifferent to the needs and desires of humanity. But what if there were a malicious artificial intelligence that not only wished to do us harm, but also retroactively punished every person who refused to help create it in the first place?
This theory is a mind-boggler, most recently explained in great detail by Slate, but it goes something like this: An omniscient evil AI that is created at some future date has the ability to simulate the universe itself, along with everyone who has ever lived. And if you don’t help the AI come into being, it will torture the simulated version of you—and, P.S., we might be living in that simulation already.
This thought experiment was deemed so dangerous by Eliezer “The AI does not love you” Yudkowsky that he has deleted all mentions of it on LessWrong, the website he founded where people discuss these sorts of conundrums. His reaction, as highlighted by Slate, is worth quoting in full:
Listen to me very closely, you idiot.
YOU DO NOT THINK IN SUFFICIENT DETAIL ABOUT SUPERINTELLIGENCES CONSIDERING WHETHER OR NOT TO BLACKMAIL YOU. THAT IS THE ONLY POSSIBLE THING WHICH GIVES THEM A MOTIVE TO FOLLOW THROUGH ON THE BLACKMAIL.
You have to be really clever to come up with a genuinely dangerous thought.