Not that long ago, the concept of “Big Data” was pretty abstract. Few companies considered it feasible to sift through huge sets of data looking for speculative insights. The hurdles to collecting and analyzing information at scale were large, tied to the cost of setting up a data warehouse and buying expensive analysis software. Also, data to supplement company-owned information was expensive and hard to come by.
Cloud computing from companies like Amazon and Microsoft eliminates the need to build a data warehouse. Powerful, free data analysis tools like R and Python make number crunching cheap. Tons of free datasets are available from governments and companies like Google and Kaggle. Easy-to-use machine learning algorithms are freely available through libraries like TensorFlow and Caffe.
This is a boon for companies, but many have yet to take full advantage of the possibilities. According to Hal Varian, Google’s chief economist, that’s because there simply aren’t enough qualified people to help companies work with all the data now at their disposal. In a recent working paper (pdf), Varian explains:
“In my experience, the problem is not lack of resources, but is lack of skills. A company that has data but no one to analyze it is in a poor position to take advantage of that data. If there is no existing expertise internally, it is hard to make intelligent choices about what skills are needed and how to find and hire people with those skills. Hiring good people has always been a critical issue for competitive advantage. But since the widespread availability of data is comparatively recent, this problem is particularly acute.”
Varian goes on to argue that while a car company might be good at hiring engineers, it may not know how to identify good data scientists. He believes this leads to big variations in how well companies are able to harness their data. As skilled data scientists become more common, these variations should be reduced, and more companies will use data to become more productive.
Varian is certainly right that the dearth of data analysts is acute; IBM refers to it as a “quant crunch.” According to data from job website Indeed, data scientist job postings in the US jumped by 75% from January 2015 to January 2018. The Bureau of Labor Statistics expects that statistician will be among the fastest-growing non-health-related occupations in the 10 years to 2026.
It’s not just a US problem. A McKinsey Global Institute report found that demand for data scientists outstrips supply across the world. It concluded that there are not nearly enough graduates in data science to fill vacancies. As a result, salaries for data scientists are climbing unusually quickly, and companies with top quantitative talent are being acquired at surprisingly high valuations.
In response, the number of data science programs offered at universities is rising, and more people are acquiring these skills through coding bootcamps. If Varian is right, this once-obscure profession is key to determining the future of the global economy.