How much is your data worth?

Cash for likes?
Image: Reuters/Dado Ruvic/Illustration

Today’s wealth lies in data. It’s how relatively young companies like Facebook or Google have grown to be among the biggest and most profitable in the world. It’s the fuel that drives countless industries, allowing companies to make informed business decisions. And as lawmakers turn their attention to how people’s data is managed, a central question arises: how much is consumer data worth, and should the companies benefitting from that information share the immense wealth our data has afforded them?

While consumers get to use services like Instagram or YouTube ostensibly for free, they are actually paying for them by providing copious amounts of their personal (and often sensitive) data to the tech companies that run them. In effect, the consumers benefit from what is also a massive surveillance system that is spying on them and enriching someone else. And those companies often fail to safeguard the data users provide them with. This can lead to massive data breaches, like the Yahoo Mail hacks that exposed a billion or so people, and to data being shared with third parties without users’ consent, as was the case in Facebook’s Cambridge Analytica scandal. In 2018, Facebook generated $55 billion in advertising revenue, while Google posted more than $116 billion in the same period.

Mark Warner, a senator from Virginia, and Josh Hawley, a senator from Missouri, introduced a bill on June 24 that would force social media companies to disclose what data they collect from consumers and how they benefit from it, something they’ve historically been reluctant to do. It would require the US Securities and Exchange Commission (SEC) to come up with a way to calculate the value of consumer data for companies whose services have more than 100 million users, and to ascertain whether the companies were engaging in any anti-competitive behavior.

“It’s been done before,” Rachel Cohen, communications director for Warner’s office, told Quartz. She pointed out that people once argued it would be impossible to assign a value to the wireless spectrum, but ultimately it was done. “The vision would be to do something similar.”

The bill itself would not make companies pay users for their data, but other proposals in the US are considering an approach called a data dividend, which would, in one way or another, compensate consumers.

California’s governor, Gavin Newsom, has proposed a data dividend, explicitly talking about “sharing the wealth” created from consumer data, and lawmakers in the state are also exploring the issue. The efforts are still in the early stages.

“We are each giving away a lot of data—and therefore value—and so finding a way to make sure the customer has their piece of the pie seems pretty important at this stage,” Chris Hansen, a member of the Colorado House of Representatives who is also exploring consumer data policies, told Quartz. His vision would allow consumers to retain control and ownership of data, allowing the consumer to manage it or sell it as they please, from tech companies or otherwise. He mentioned grocery stores and retailer loyalty clubs that collect data on everything you buy from them. “The consumer right now doesn’t have any access or ownership of that data, and these companies make a massive amount of money,” Hansen said. And that’s just one example—Hansen added that it’s important to implement such solutions across the entire economy.

How do you calculate the value of user data?

The US Senate bill also doesn’t suggest a method of calculating the value of data, instead leaving that to the SEC to figure out. Cohen told Quartz that multiple calculations will probably have to be developed, considering companies in different sectors harvest different kinds of data. However, a number of solutions to establishing that value and redistributing it have been raised in the last few years.

One proposal is a simple tax solution, as Chris Hughes, a co-founder of Facebook and newfound critic of the company, outlined last year. He argued that a 5% tax on companies that use consumer data (whether they’re a Silicon Valley giant, a bank, or a retailer) could generate at least $100 billion per year. If the tax funded a data dividend, every American adult would receive a check for about $400 per year. He compared his idea to the way revenue from oil extracted in Alaska is distributed to the state’s citizens, amounting to about $1,500 per person per year. “Unlike oil, this data is not an exhaustible resource, enabling the fund to disburse the total revenues each year,” Hughes wrote, noting that the check amount could increase over time.
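The arithmetic behind those figures can be checked in a few lines. This is only a back-of-the-envelope sketch: the 5% rate, the $100 billion in revenue and the $400 check come from Hughes's proposal as described above, while the adult population of roughly 250 million is an assumption implied by dividing one figure by the other, not a number he is quoted on here.

```python
# Back-of-the-envelope check of the data dividend figures.
TAX_RATE = 0.05                      # 5% tax, from the proposal
TAX_REVENUE_USD = 100_000_000_000    # "at least $100 billion per year"
US_ADULTS = 250_000_000              # assumption: roughly 250 million US adults

# Revenue base that a 5% tax would need in order to raise $100 billion.
implied_tax_base_usd = TAX_REVENUE_USD / TAX_RATE

# Annual check if the full revenue were split evenly among adults.
annual_check_usd = TAX_REVENUE_USD / US_ADULTS

print(f"Implied taxable base: ${implied_tax_base_usd:,.0f}")
print(f"Annual check per adult: ${annual_check_usd:,.0f}")
```

Under these assumptions, the tax would need to apply to about $2 trillion in revenue, and an even split works out to $400 per adult.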

Another idea is to base the data calculation on the metrics that the companies themselves provide, like taking a share of the average revenue per user (ARPU), which for US Facebook users was $30 in the last quarter, or about $10 per month. Recode recently calculated that if you simply divide all digital advertising revenue in the US by the adult population, an ad-free internet would cost every US adult about $35 per month. That’s less than the cost of many live TV streaming services.

Economists have also considered the amount of money that people could have made had they been working instead of consuming ad-based media, such as scrolling through Facebook or reading the news. One rough calculation estimated that this method would yield about $6,600 per user per year, or about $550 per month per person. About half of internet users surveyed by economists in 2017 also said they would forgo services like Facebook for about $40 per month.

Some companies are getting ahead of any potential legislation. UBDI (Universal Basic Data Income), a data exchange company, aims to let users benefit from selling insights from their aggregated data to companies or researchers. UBDI currently pays users with points that will be converted into digital currency in the future, its website says. Shane Green, UBDI’s co-founder and chairman, says he has been involved in conversations with the Senate regarding Warner and Hawley’s bill, and that he prefers what he considers a simpler method of calculating the value of consumer data. His idea is to look at a company’s ARPU as well as its stock price, which he says represents what the market considers the value of the consumer data.

“It’s not just the revenue that [companies like Facebook] made for this year, it’s how spectacularly successful they’ve been at getting investors and Wall Street to understand the value of that kind of proprietary [data] over time and the new kinds of products they can create, and the new kinds of targeting they can do,” Green told Quartz.

How companies would actually pay their users

Glen Weyl, an economist working with lawmakers in California, Colorado, Canada and the EU, has devised a broad, systemic approach to paying consumers for their data, using intermediaries. According to his idea, outlined in the Harvard Business Review (paywall), there would be independent actors he calls “mediators of individual data” (MIDS), which can be most closely compared to labor unions. They would “help people bargain for the value of their data, rather than just have everything be determined [by] technical protocols,” he said.

Weyl also noted that there needs to be a distinction between two kinds of data. While it might be relatively straightforward to calculate the effectiveness of ads (which the companies do themselves), the issue becomes more complex when we’re talking about data that’s used to train machine-learning models.

Weyl pointed Quartz to the Stanford computer-science researcher James Zou, who outlined his method for attributing value to machine learning data in a research paper. Zou uses an established concept used in economic game theory called the Shapley value, which essentially calculates the weight of everyone’s contribution to an outcome in order to fairly divide the bonus that is received at the end. Zou and his colleague, Amirata Ghorbani, extended this concept to data—so if two datasets are used to train a machine learning model, the Shapley value helps calculate how much each of these datasets contributed to the final performance of the machine learning model. The model allows for inputting massive amounts of data. However, it’s just an initial idea, and Zou says more research is needed to make it useful in real-world situations.

So when is one person’s data more valuable than someone else’s? Here’s how Zou explains it: “If I’m on Facebook and there are many other people who are very similar to me on Facebook, then my data is actually not that valuable, because there are many other people who could be substitutes.” So it’s about how unique your data is, and whether you actually buy anything from the ads you see online.
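A toy calculation makes the substitution point concrete. The sketch below computes exact Shapley values by averaging each person's marginal contribution over every possible ordering, which is only practical for a handful of participants. It is not the researchers' implementation, and the names, profiles and accuracy gains are all hypothetical, chosen so that two users supply interchangeable data and one supplies something unique.

```python
from itertools import permutations

def shapley_values(players, value):
    """Average each player's marginal contribution over all join orders."""
    totals = {p: 0.0 for p in players}
    orderings = list(permutations(players))
    for order in orderings:
        coalition = frozenset()
        for p in order:
            with_p = coalition | {p}
            # Marginal contribution: how much the score rises when p joins.
            totals[p] += value(with_p) - value(coalition)
            coalition = with_p
    return {p: t / len(orderings) for p, t in totals.items()}

# Hypothetical users: Alice and Bob have similar data, Carol's is unique.
PROFILE = {"alice": "commuter", "bob": "commuter", "carol": "night-shift"}
# Hypothetical accuracy gain from having at least one user of each kind.
GAIN = {"commuter": 0.30, "night-shift": 0.20}

def model_score(coalition):
    """Toy stand-in for model performance: each kind of data counts once."""
    kinds = {PROFILE[p] for p in coalition}
    return sum(GAIN[k] for k in kinds)

values = shapley_values(list(PROFILE), model_score)
for name, v in values.items():
    print(f"{name}: {v:.2f}")
```

In this made-up example, commuter data is worth more in total, but Alice and Bob are substitutes and split its value, so each ends up with a smaller share than Carol, whose data no one else can supply. The shares also add up to the score of the full dataset, which is the "fair division" property the Shapley value is known for.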

Of course, not everyone agrees that putting a price on consumer data is a valuable exercise at all, arguing that regular people would still be giving up their privacy, even if they are compensated for it.

“It is not a good deal for consumers to get a handful of dollars from companies in exchange for surveillance capitalism remaining unchecked,” the Electronic Frontier Foundation, a digital advocacy group, wrote in February, arguing that privacy protections should be lawmakers’ first priority.