MIT economists are combating groupthink with algorithms

Most algorithms designed to glean knowledge from collective wisdom weigh individual votes the same and assume the majority’s right. That’s an assumption built into many mathematical models. Scientists from MIT say this is all wrong.

“Once considered provocative, the notion that the wisdom of the crowd is superior to any individual has become itself a piece of crowd wisdom, leading to speculation that online voting may soon put credentialed experts out of business,” writes MIT economist Drazen Prelec. “Algorithms for extracting wisdom from the crowd are typically based on a democratic voting procedure. However, democratic methods have serious limitations.” Most notably, they assume every individual’s answer deserves the same weight.

His team proposes a different approach: instead of trying to figure out what’s right based on the most popular answer, a study led by Prelec and published in Nature on Jan. 25 suggests that a better formula for finding a correct answer in a crowd is to give weight to “surprisingly popular” responses to questions. The idea is that this alternative algorithm would account for the probability that there are some outliers in the crowd who know more than most. This way, mathematical methods start to approach how people seriously solve problems: by seeking out the rare few who are knowledgeable on a topic.

For example, in one test, Prelec’s team asked 51 graduate business students about US state capitals, using a simple yes/no format: “Is Philadelphia the capital of Pennsylvania?” Most said yes, presumably because Philly’s a pretty famous city and is taught as an important place in US history. But they were wrong: Harrisburg is the capital. The majority vote was wrong, and the minority answer, “no,” was correct.

In this simple test, outlier responses clearly indicate knowledge that the majority did not have. A similar effect might occur if you asked about the capital of New York, which is Albany and not New York City. The MIT algorithm was able to take the answer sets from Pennsylvania and New York and determine, based on the fact that there were outliers, that Philly and NYC were not the capitals of their respective states, despite the fact that the majority said they were.
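The article does not spell out the mechanics, so here is a minimal sketch of how a “surprisingly popular” rule could work for the Pennsylvania question. It assumes that each respondent also predicts how the rest of the crowd will answer, and that the winning answer is the one whose actual share most exceeds its predicted share. The function name `surprisingly_popular`, the vote split, and the predicted shares are all hypothetical illustrations, not the researchers’ code or data; only the group size of 51 comes from the text.

```python
from collections import Counter


def surprisingly_popular(votes, predicted_share):
    """Pick the answer whose actual share most exceeds its predicted share.

    votes: list of answers, one per respondent.
    predicted_share: dict mapping each answer to the average share of the
        crowd that respondents expected to give that answer (an assumption
        of this sketch, not something the article describes).
    """
    counts = Counter(votes)
    total = len(votes)
    # Actual share of the crowd that gave each answer.
    actual_share = {a: counts.get(a, 0) / total for a in predicted_share}
    # A positive gap means the answer was more popular than the crowd expected.
    gap = {a: actual_share[a] - predicted_share[a] for a in predicted_share}
    return max(gap, key=gap.get)


# Hypothetical split among 51 respondents for
# "Is Philadelphia the capital of Pennsylvania?"
votes = ["yes"] * 33 + ["no"] * 18
# Hypothetical average prediction: respondents expect 80% of others to say yes.
predicted = {"yes": 0.80, "no": 0.20}

majority = Counter(votes).most_common(1)[0][0]
winner = surprisingly_popular(votes, predicted)
print("majority answer:", majority)
print("surprisingly popular answer:", winner)
```

With these made-up numbers, “yes” wins the plain majority vote, but “no” is chosen because it gets a larger share of votes than the crowd predicted it would.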

For common knowledge—like Boston vs. Quincy as the capital of Massachusetts—outlier answers obviously wouldn’t be better. But the goal of the project was to come up with an algorithm that solves problems without simple answers.

The researchers did four kinds of surveys in all. With each test, the questions became more complex, and the algorithmic analyses were adjusted to be more nuanced. The result is a model that recognizes that unusual answers become more valuable the less common knowledge there is in a given subject area.

In a later test, for example, a group of 25 dermatologists was asked to diagnose 80 skin lesion images as benign or malignant on a scale of 1 to 6. Here, there were numerous considerations and all the survey participants were educated about the topic. Nonetheless, outlier answers could contain critical wisdom that the crowd missed. For example, say four of the doctors give what appear to be totally outside-the-norm answers to images #26 and #71. We could say they are just wrong and that we should trust the crowd. But what if those four docs have tons of clinical experience with a very particular type of lesion and, as a result, are actually the only ones getting #26 and #71 right? In the MIT model, those “surprisingly popular” answers, which form their own pattern, are weighted differently (although the algorithm can’t recognize clinical experience, just that there are statistical outliers).

Across the surveys, Prelec’s team found that most of the time, their algorithm was 21–36% more effective at identifying the correct answer than other formulas, like relying on the most popular answer or ranking answers by probabilities.

Still, there are limits to what it can do. Prelec told Quartz, “When we claim that the surprisingly popular answer is ‘best’ we are thinking of problems where everyone agrees about objectives, but disagrees only about underlying facts or evidence.” In other words, the model can’t find “best” responses to political or philosophical questions.
