The Australian election produced a winner no pollster predicted: Prime Minister Scott Morrison’s ruling coalition remained in power, despite all expectations to the contrary.
After the surprise outcomes of the Brexit referendum and the election of Donald Trump as US president, what now feels like the most predictable outcome of any election is that the pollsters will be wrong. There are many reasons why they keep getting it wrong, including confirmation bias, with journalists and pollsters seeking out data that validate their prior beliefs. It could also be that some of these events are so unusual they were impossible to predict.
Forecasts rely on data from the past, and while we now have better data than ever—and better techniques and technology with which to measure it—when it comes to forecasting, in many ways, data has never been more useless. And as data becomes more integral to our lives and the technology we rely upon, we must take a harder look at the past before we assume it tells us anything about the future.
To some extent, the weaknesses of data have always existed. Data are, by definition, information about what has happened in the past. Populations and technology are constantly changing, altering how we respond to incentives, policy, the opportunities available to us, and even social cues. This undermines the accuracy of everything we try to forecast: elections, financial markets, even how long it will take to get to the airport.
But there is reason to believe we are experiencing more change than before. The economy is undergoing a major structural shift as it becomes more globally integrated, which increases some risks while reducing others, and technology has changed how we transact and communicate. I’ve written before about how it’s now impossible for the movie industry to forecast hit films. Review-aggregation site Rotten Tomatoes undermines traditional marketing plans, and the rise of the Chinese market means filmmakers must account for different tastes. Meanwhile, streaming has changed how movies are consumed and who watches them. All these changes mean data from 10, or even five, years ago tell producers almost nothing about movie-going today.
We are in the age of big data, which promises more accurate predictions. Yet in some of the most critical aspects of our lives, data has never been more wrong.
Take this explanation from Nate Silver about what went wrong in the 2016 election, in which he gave Trump a 29% chance of winning, much better odds than almost anyone else offered.
FiveThirtyEight’s probabilities are based on the accuracy of polling averages in presidential elections dating back to 1972. That is, our models are based on how accurate polls have or haven’t been historically, instead of making idealized assumptions about them.
This is the best practice. You need to ground your projections in something, and data from the past is the best we’ve got. Election forecasts are estimates based on polling a small subset of potential voters. Models make predictions about who will actually turn out and how undecided and swing voters will vote. But those projections are based on how voters behaved in the past, often in elections held in a completely different climate with two fairly standard candidates. Many models failed in 2016 because they assumed the election would look somewhat like 2012 or 2008 or earlier contests. Normally that would be a fair assumption, but not in 2016, nor with Brexit, nor in Australia.
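To make the calibration idea concrete, here is a minimal sketch (my own illustration, not FiveThirtyEight’s actual model): a forecaster estimates a win probability by asking how often polling errors of historical size would erase the leader’s edge. If the true error spread in an unusual year is larger than history suggests, the model overstates its certainty.

```python
# Hypothetical sketch of calibrating a forecast on historical polling error.
# Not FiveThirtyEight's model; the numbers below are made up for illustration.
import math

def win_probability(poll_lead, historical_error_sd):
    """P(leader wins), assuming polling error is normal with the given spread."""
    # Standard normal CDF built from the error function (no external libraries).
    return 0.5 * (1 + math.erf(poll_lead / (historical_error_sd * math.sqrt(2))))

lead = 3.0  # the leader's margin, in points, in the polling average

# If this year resembles past elections, a 3-point lead looks very safe...
print(win_probability(lead, historical_error_sd=2.0))  # roughly 0.93

# ...but if polls misfire more than history suggests, the race is a toss-up-ish.
print(win_probability(lead, historical_error_sd=6.0))  # roughly 0.69
```

The point of the sketch: the probability depends entirely on the assumed error spread, which is estimated from past elections, so an election unlike the past breaks the estimate.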
This isn’t just undermining polling. Economic policymakers also operate under the assumption that the present will be like the past. In a video on the website of the National Bureau of Economic Research, economist Laura Veldkamp discussed the low-interest-rate environment in the years following the financial crisis. She argues the world has changed since the crisis: people are now more risk averse and will pay more for risk-free assets, which she speculates is why interest rates have stayed so low. Interest rates are a market price, set at auction. When more investors want low-risk assets, like government bonds, there are more bidders driving prices higher, and rates lower.
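The price-to-rate mechanics can be sketched with a toy zero-coupon bond (the prices here are hypothetical, not from Veldkamp’s work): more demand bids the auction price up, and a higher purchase price mechanically implies a lower annual yield.

```python
# Illustrative sketch: the inverse relationship between a bond's auction
# price and its implied interest rate. All numbers are made up.
def implied_annual_yield(price, face_value=100.0, years=10):
    """Annualized yield of a zero-coupon bond bought at `price` today."""
    return (face_value / price) ** (1 / years) - 1

# More demand for safe assets bids the auction price up...
normal_demand_price = 70.0
risk_averse_price = 85.0

# ...and a higher price means a lower rate.
print(implied_annual_yield(normal_demand_price))  # about 3.6%
print(implied_annual_yield(risk_averse_price))    # about 1.6%
```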
If it’s true that investors fear risk more and will pay a higher price for safety, the old relationships around risk-taking will no longer apply. Take the Federal Reserve’s policy of trying to boost the economy by cutting rates, based on the hope that lower rates will induce investors to speculate with their money. But if investors are more risk-averse, they’ll tolerate lower rates without taking more chances. This means monetary policy tools will be less predictable, and less effective, for the foreseeable future.
These challenges will become more apparent in the coming years as we become more dependent on data. Big data—enormous sets of information based on many observations—can be a valuable tool, shaping how corporations make business decisions and how they market their products and ideas. Data also powers artificial intelligence, allowing it to learn, update, and become more accurate. Data may be heralded as the answer to everything, but even if the information is newer and bigger, how do you know whether it provides useful insights about the future?
The answer may come from the same place it did in the past: theory. Data rarely speaks for itself; it needs a story to guide it. In the case of long-term interest rates, for example, relying on all previous interest-rate data might lead you to assume rates will rise soon. But if you had a theory telling you investors now value low-risk assets more, you’d know to put a larger weight on more recent data. The successful data users of the future—pollsters, economists, or even pure data scientists—will require more than training in how to use that information; they will need an intellectual framework to understand and make sense of it.
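That theory-led weighting can be expressed in a stylized way (my own illustration, with made-up numbers): instead of averaging all history equally, exponentially down-weight older observations when theory says the world has changed.

```python
# Hypothetical sketch: weight recent data more heavily when theory suggests
# a structural break. The rates below are stylized, not real series.
def weighted_forecast(observations, decay=0.7):
    """Average with exponentially smaller weights on older data (newest last)."""
    n = len(observations)
    weights = [decay ** (n - 1 - i) for i in range(n)]
    return sum(w * x for w, x in zip(weights, observations)) / sum(weights)

# A high-rate past followed by a low-rate present.
rates = [5.0, 5.2, 4.8, 1.5, 1.2, 1.0]

naive = sum(rates) / len(rates)        # treats the high-rate era like today
theory_led = weighted_forecast(rates)  # leans on the recent, low-rate data
print(naive, theory_led)               # the weighted forecast comes out lower
```

The decay parameter is where the "story" enters: how strongly you believe the world has changed determines how fast old data loses its vote.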