The Facebook like button was first released in 2009. As of September 2013, a total of 1.13 trillion likes had been registered across the earth, according to OkCupid co-founder Christian Rudder in his new book Dataclysm. Much has been written about how “likes” limit our social interaction or increase our engagement with brands. But these likes have another function: they’re becoming a source of data that will eventually tell social scientists more about who we are than what we share.
According to a research group in the UK, what people choose to “like” on Facebook can be used to determine with 95% accuracy whether they are Caucasian or African American, with 88% accuracy whether they are gay or straight, and with 65% accuracy whether they use drugs, among other things. So what you post on Facebook may be a less reliable signal of your genuine self than what you like on Facebook. Rudder writes:
“This stuff was computed from three years of data collected from people who joined Facebook after decades of being on earth without it. What will be possible when someone’s been using these services since she was a child? That’s the darker side of the longitudinal data I’m otherwise so excited about. Tests like Myers-Briggs and Stanford-Binet have long been used by employers, schools, the military. You sit down, do your best, and they sort you. For the most part, you’ve opted in. But it’s increasingly the case that you’re taking these tests just by living your life.”
Is it possible that in the future your SAT score, personality, and employability might simply be predicted by all the data collected from your digital device use? I asked Rudder whether a person’s like pattern on Facebook could be used as a proxy for an intelligence or IQ score. He told me:
“I think we are still far away from saying with any real certainty how smart any one person is based on Facebook likes. In aggregate, finding out that people who like X, Y, Z, have traits A, B, C, D, I think we’re already there. We’re already tackling life history questions based on Facebook likes. For example, did your parents get divorced before they were 21? They can unlock that with 60% certitude. Given that it’s only a few years’ worth of likes, imagine that it’s in five or 10 years and there’s that much more data to go on, and people are revealing their lives through their smartphones and their laptops.”
Rudder also points out that the power of large data is amplified once we are able to collect it over many decades. It might even answer questions about how people change as they get older:
“A person’s views and cultural tastes change as they get older. For example, you usually like the music you like in high school for your whole life. My dad still listens to The Four Seasons and The Beach Boys and graduated in ’64. That’s not exactly true for me, but pretty close. And this question is pretty hard to answer with the current datasets. Having some way to look at how people’s minds change—not biologically but psychologically—over time would be amazing, measuring their level of tastes and preferences. You could plot social views versus economic views, a typical path through life could be that you start off socially and economically liberal and when you get a job there’s some tension there, and then maybe when I’m older and no longer working I’ll become more economically liberal. And it would be cool to be able to trace that beyond anecdotes and beyond polls. There’s so much research on early childhood development and how a child becomes an adult, but there isn’t as much developmental stuff about how adults stay adults or get older especially because there isn’t a parent reigning over every footstep once you’re 21. It would be incredibly interesting because people do keep changing in some ways and they tend to stay the same in others, and it would be amazing to know what ways those are.”
Rudder notes that many foundational ideas in social science “were established on small batches of college kids” and that “the full truth of data is only revealed over a large sample.” This means that these data sources can now be used to weed out incorrect ideas and to solidify correct ones. Not only Facebook likes but large data sources from OkCupid, Facebook, Twitter, Reddit, and Craigslist, along with all the information about our choices and behaviors from our digital devices, may, when tracked over decades, unlock important findings about human development. We may not realize it, but we are providing this data by living our lives. All scientists have to do now is wait.