Supercomputers may defeat chess masters and crunch weather data, but the lesson of recent history is that they will never supersede human judgment, says Nate Silver, author of “The Signal and the Noise,” which hits bookstores Sept. 27.
Computers are at their best when, operating at “brute force calculation speed,” they churn out “Big Data.” But then an agile human mind must step in with the ingredient of experience to make our most intricate decisions, said Silver, the 34-year-old writer of the influential FiveThirtyEight blog at the New York Times. Otherwise, you can end up taking algorithms to an absurd extreme, as the Wall Street banks did in creating the 2008-2009 global financial crisis. “The idea that machines are going to take over is, well, I’d be happy to lay a lot of money against that coming true,” he told Quartz’s Steve LeVine.
A statistician with a degree in economics, Silver has made his name balancing computation and pure judgment. His break came in the 2008 US elections, in which Silver correctly forecast all 35 Senate races and 49 of the 50 states in the presidential race. That led to a licensing deal for his blog at the Times, and a six-figure, two-book contract, of which “The Signal and the Noise” is the first.
The book considerably ups Silver’s game: he sets himself out as an idealistic philosopher of correct and objective prognostication and casts unabashed aspersions at rivals whom he regards as phonies. “You … have a lot of pollsters who are putting their finger on the scale and either trying to advance a partisan narrative or trying not to stick out,” he says. “It seems like more and more pollsters are not necessarily giving you the number they would if they didn’t have any preconception about the result. That scares me a little bit.”
What specifically is Silver frightened of? “The risk of a major screwup [because] pollsters are behaving in ways that are not independent of one another,” he says.
Silver comments on futurist Ray Kurzweil and contrarian analyst Nassim Taleb, while revealing some of the secrets of successful forecasting. Edited excerpts:
Quartz: The launching point in your book is that we have a “prediction problem”. You say we aren’t very good at it. Perhaps that is not surprising, and we ought to be thankful for the forecasting we can manage?
Nate Silver: There are a lot of things that we take for granted. For example, a couple of hundred years ago, it was very difficult to predict the orbits of the planets. There were forecasts that seemed to imply that Jupiter would crash into Saturn. We have gotten much better at scientific prediction. But in fields that involve human interaction, such as macroeconomic behavior, we are up against a moving target. The economy has so many different sectors and so many interactions between its different parts that we have not made much progress at all. There is a gap between how good we think we are at prediction and how good we really are. There are two ways to remedy that gap. One is you can get better at prediction–there are things we can do to make some improvements. But it also means we might need to lower our expectations, as hard as that can be for us. There is no point in thinking that you can predict something to a high degree of precision when you just can’t.
Why are we driven to predict, and is that perceived need increasing?
We are making predictions whether we realize it or not. If you are taking a route to work, you are making a prediction of what will get you there faster. You are making all kinds of predictions when you are deciding whether to go out on a second date. And certainly when you are making any kind of business or planning decision, that is implicitly a prediction. What is interesting is that back when the words were introduced into the English language, the term “forecast” meant something like “foresight.” It meant planning under conditions of uncertainty, more like what we might mean by “prognostication.” And somehow those two things merged together. So an important distinction was lost, between predicting what was going to happen, and hoping for the best and preparing for the worst–the foresight part of it. I think we should reinsert that distinction into our language.
You write that Poggio said people tend to err by “finding patterns in random noise.” Can you explain? How can we possibly escape the noise when there is more and more of it?
You can’t, but you can reduce it. You can increase the signal-to-noise ratio, and the way you can do that is by becoming more aware of what your biases and blind spots are. For example, if you ask the average American, “Hey, is it a good time to invest in the stock market?” they will say “yes” if the market has been going up. They will say “no” if it has been going down. Well, it turns out that one of the best indicators of what’s actually going to happen is the opposite. When the market is inflated and sentiment outstrips the fundamentals, then [share prices] go down, if not instantly then sooner or later. But when there has been a panic, people are of course in a very bearish mood, but usually things aren’t quite as bad as they think. So your first instinct, the evolutionary “fight or flight” biological response, is not that useful when you’re dealing with [the amount of] data we get now. You have to slow down and think through things more carefully. Look at history, and don’t necessarily make a gut, instinctual decision. You can sell a lot more books if you say, “Trust your gut.” But when it comes to data analysis, it really doesn’t always work very well. You have to be more routinized, to have a method or a system, in order to give yourself more of a fighting chance.
The mistake of “Big Data” is the presumption that a lot of information obviates the need for human intervention, correct? Is there a data bubble? And is Big Data doomed?
I don’t think Big Data is doomed. I think it is going to create progress eventually, and it already has in some fields. But if you look at what happened at the dawn of the computer age in the 1970s and the early 1980s, people had these tools that they thought were magic boxes that would solve everything. And what actually happened is you had a decline in the number of new patent applications, a good measure of new discoveries that have economic value. There were a lot of reasons for that. One was that all of a sudden people thought they were really good at a lot of things that they really weren’t very good at.
Part of what the book is trying to start a conversation about is where having more information actually helps us. The key skill is learning how to sort through this wealth of data, and it’s hard. Even in a relatively confined space like political polling, it’s pretty hard. People tend to think that every poll is deeply meaningful. We go back and forth like a tennis match between different polls, some of which are outliers, looking for a consensus of information. The book encourages people to take it slow, to take a cue from people like weather forecasters and gamblers, who are making bets and predictions every day, and who slowly refine and reorient their instincts so that they have a better feel for how the data reads. But they also know when they should trust what their model says and when they [have reached] the limits of personal relations as well.
It’s almost a zen-like process: on the one hand you can err by trusting your own instincts too much, and on the other hand you can err by putting too much faith in computer programs and models, because guess what? Someone has to design those, and that’s a person who made a lot of assumptions in building the model.
Have you met Ray Kurzweil? Don’t your ideas contravene his forecast of a coming singularity, an age when artificial intelligence supersedes us as thinkers?
It’s probably wishful thinking. That would be the polite way to put it. It’s difficult to program a computer to tie its shoes or fold a towel, things that any 6-year-old can do. What computers are good at are tasks like chess, which have relatively simple rules and where it is just a matter of making a lot of calculations. Weather forecasting also falls mostly into that category. But the term artificial intelligence is misapplied. It’s really brute force calculation speed, and if you have a program making very fast calculations with a dumb set of instructions, it’s going to be garbage in, garbage out. So I think there is not going to be any singularity of the sort Kurzweil might envision. The idea that machines are going to take over is, well, I’d be happy to lay a lot of money against that coming true.
What about Nassim Taleb and the black swan notion of the power of fractional probability? Can one assume that his ideas figure in your models?
My work is pretty compatible with what he is saying a lot of the time. One thing he critiques is the idea that you can have pristine models with clean mathematical assumptions, and that they will work as well as they are supposed to. If you go back and look at the financial crisis, you had credit rating agencies with models that said, “Oh, these CDOs, collections of mortgage debt, are completely safe.” They gave them a triple-A rating, which implies they have only a 0.1% chance of defaulting. Thirty percent of them defaulted. You had an error of 30,000% between what the models claimed and what actually happened.
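(A quick arithmetic check of that figure, as a minimal sketch using only the round numbers Silver cites: an implied 0.1% default probability against a realized default rate of roughly 30%.)

    # Back-of-the-envelope check of the "error of 30,000%" figure,
    # assuming the round numbers quoted above: 0.1% implied vs. 30% realized.
    implied_default_rate = 0.001    # what a triple-A rating implied
    realized_default_rate = 0.30    # what actually happened
    ratio = realized_default_rate / implied_default_rate
    print(f"Realized default rate was {ratio:.0f}x the implied rate, "
          f"i.e. roughly {ratio * 100:,.0f}% of the forecast.")
    # -> Realized default rate was 300x the implied rate, i.e. roughly 30,000% of the forecast.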
You find the same thing if you look at some predictions of elections, where, going back to 1992, you’ll have political scientists at a conference saying, “We have these models that will exactly predict what the election will be months in advance without looking at polls.” Well, you have to look at polls. In 2000, those models had Al Gore winning the election by 10 points, a landslide, with one of them having him win by 18 or 19 points, like Reagan did in ’84. Of course Gore lost. He won only the popular vote. People don’t realize that it is easy to go back and fit something when you already know what happened. Real prediction is much more difficult. People seem to need a lot of reminders before they learn that.
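(Silver’s point about how easy it is to fit the past is straightforward to demonstrate. The sketch below uses invented data, not the election models he describes: a very flexible curve matches a short “history” almost perfectly, then misses badly on points it has not seen, while a simpler model holds up.)

    # Illustrative only: a flexible model "explains" past data almost
    # perfectly but predicts unseen data poorly (overfitting).
    import numpy as np

    rng = np.random.default_rng(0)
    x = np.arange(12, dtype=float)
    y = 0.5 * x + rng.normal(0, 2.0, size=x.size)    # a noisy linear trend

    train_x, train_y = x[:8], y[:8]                  # the "history" we fit to
    test_x, test_y = x[8:], y[8:]                    # the "future" we predict

    for degree in (1, 7):                            # simple vs. very flexible fit
        coefs = np.polyfit(train_x, train_y, degree)
        in_sample = np.mean((np.polyval(coefs, train_x) - train_y) ** 2)
        out_sample = np.mean((np.polyval(coefs, test_x) - test_y) ** 2)
        print(f"degree {degree}: in-sample error {in_sample:.2f}, "
              f"out-of-sample error {out_sample:.2f}")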
Does your detailed daily reportage affect the election itself, a kind of political Heisenberg principle?
I hope not. I’m a little bit worried about that. One thing that I [tend] to see in the data is the polls clustering together. People get upset when there is a poll that is an outlier, even though, if you poll 500 people rather than 5 million, there is naturally going to be some random variance introduced. But it seems that pollsters now don’t want to be out of the consensus. If everybody else says one result, then they report the same one. On one hand that is healthy. But you also have a lot of pollsters who are putting their finger on the scale and either trying to advance a partisan narrative or trying not to stick out. It seems like more and more pollsters are not necessarily giving you the number they would if they didn’t have any preconception about the result. That scares me a little bit.
One thing that can happen when you have people reacting to one another is a kind of black swan scenario, where instead of just missing by a little bit, all of a sudden you are way off. You get these “fat tails,” as they are called–if you take just a random sample in a poll, the error would follow a bell-curve shape. I now actually build a little bit of fat-tail variance into our forecast models. But if everyone is reacting to everyone else’s polls, then you have a chance for herding and for the occasional catastrophic error. It’s my nightmare. If we have Obama up by 4 points on Nov. 6, then I sure hope that is what is going to happen. It will make me look good. But there is going to be some chance that he wins by 11 and some chance that he loses by 3, because there have been mistakes like that in polls before. The risk of a major screwup increases the more pollsters are behaving in ways that are not independent of one another.
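(The herding risk Silver describes can be illustrated with a few lines of simulation; the sketch below uses made-up numbers rather than anything from his actual models. When pollsters’ errors are independent, averaging many polls washes out most of the noise; when the errors share a common component because everyone leans toward the same consensus, the polling average itself can occasionally be way off.)

    # Illustrative simulation of independent vs. herded polling errors.
    # All parameters are invented for the sketch.
    import numpy as np

    rng = np.random.default_rng(1)
    n_sims, n_polls = 100_000, 15
    sampling_error = 3.0        # per-poll noise, in percentage points
    shared_bias = 2.5           # common "herding" component, in points

    # Independent errors: each poll misses on its own.
    independent = rng.normal(0, sampling_error, (n_sims, n_polls)).mean(axis=1)

    # Herded errors: a shared miss plus smaller individual noise.
    herded = (rng.normal(0, shared_bias, (n_sims, 1))
              + rng.normal(0, sampling_error / 2, (n_sims, n_polls))).mean(axis=1)

    for name, err in (("independent", independent), ("herded", herded)):
        print(f"{name}: polling average misses by 3+ points "
              f"in {np.mean(np.abs(err) > 3):.1%} of simulations")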
Is it really true that the more facts we know, the more partisan we will be? How is that possible? And if it is true, how will the Abkhazians and Georgians ever make peace, much less the Israelis and Palestinians?
One point of the book is to discourage partisanship. There is so much information in the world, and so many different ways to consume it, that having more data gives you more opportunity to cherry-pick the facts that you like. When they have an opportunity to pick which media outlets they consume, people are choosing their facts in a way that does not relate to objective reality. Some of these partisans on either side have dreams of grandeur about what is going to happen in this election that are not going to come true; they will end up blaming the news media, losing further touch with reality, and thinking it’s all a conspiracy. I have my reputation. I am trying to be realistic. For all the flaws that markets have, at least in markets people are putting their money where their mouth is, and if you are delusional, you are going to lose your shirt. In politics, delusional people get on TV and get the ratings, because partisans like what they have to say.
Can one predict geopolitical events?
The collapse of the Soviet Union, perhaps the most significant geopolitical event of the last half-century, was missed by the majority of Soviet scholars. Even now, if you look back, it seems so obvious: Gorbachev was trying to open up the markets, the economy was dilapidated, and there were problems with Afghanistan. You can look back on it and it looks so easy. But at the time it came as a real shock that the USSR disintegrated as fast as it did, and in a largely bloodless way.
Your first job was as a consultant. I take it that political forecasting is better than consulting?
I like elections. I like making these forecasts. I’m not really a big fan of the political process. I certainly do have political views. But it’s not my job to tell candidates how to win. I am taking an investor’s or a gambler’s aloof attitude, asking, “What is my best take based on what I am seeing in the data?”