Science can now spot trolls after just five horrible, malicious comments

Let the troll hunting commence.
Image: Reuters/Morris Mac Matzen

People aren’t very nice to each other online. Everyone has read a comment thread and been annoyed at the vicious remarks, or witnessed a flame war under a YouTube video. Many online communities now have moderators, and they aim to ban trolls, the people who just can’t stay civil.

But can you identify trolls before they ruin a community? Researchers from Stanford and Cornell think they can, after analyzing 18 months’ worth of Disqus threads from the news site CNN, the right-wing political site Breitbart, and the gaming site IGN. That amounted to 1.7 million users, almost 40 million comments, and 100 million up- or down-votes on those comments.

They compared users who were later banned from a community with users who were never banned. Of the trolls, they observed:

We find that such users tend to concentrate their efforts in a small number of threads, are more likely to post irrelevantly, and are more successful at garnering responses from other users. Studying the evolution of these users from the moment they join a community up to when they get banned, we find that not only do they write worse than other users over time, but they also become increasingly less tolerated by the community. Further, we discover that antisocial behavior is exacerbated when community feedback is overly harsh.

Using those findings, they developed an algorithm that looks at these factors and determines with 80% accuracy whether a user will be banned in the future, based on that user’s first five posts. The most powerful predictor is the reaction to (and deletion of) posts by moderators, but even without manual labeling, the algorithm still works with 79% accuracy.

Interestingly, the researchers said it becomes “increasingly difficult” to identify a troll and determine whether they’ll be banned the longer they have been posting. “This suggests that changes in both user or community behavior do occur leading up to a ban,” the scientists said.

The researchers’ work was supported by a Google faculty research award, so perhaps we’ll see this algorithm being used to identify and banish online trolls sooner rather than later. But, as the researchers note, a success rate of 80% means that roughly one in five users is still misclassified, and some of those are flagged as future trolls when they are merely antisocial. What to do about this? The researchers are hopeful:

Whereas trading off overall performance for higher precision and have a human moderator approve any bans is one way to avoid incorrectly blocking innocent users, a better response may instead involve giving antisocial users a chance to redeem themselves.
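
To make the trade-off the researchers mention concrete, here is a minimal Python sketch. It is not the researchers’ algorithm: the ten scores and ban labels are invented for illustration, and the function names are hypothetical. It only shows how raising the bar for flagging a user can push precision up (fewer innocent users flagged) while overall accuracy goes down (more real trolls missed).

```python
def confusion_counts(scores, labels, threshold):
    """Count (tp, fp, tn, fn) when users scoring >= threshold are flagged."""
    tp = fp = tn = fn = 0
    for score, banned in zip(scores, labels):
        flagged = score >= threshold
        if flagged and banned:
            tp += 1  # flagged, and later banned
        elif flagged and not banned:
            fp += 1  # flagged, but never banned (an innocent user)
        elif banned:
            fn += 1  # not flagged, but later banned (a missed troll)
        else:
            tn += 1  # not flagged, and never banned
    return tp, fp, tn, fn


def accuracy(tp, fp, tn, fn):
    """Share of all users that were classified correctly."""
    return (tp + tn) / (tp + fp + tn + fn)


def precision(tp, fp, tn, fn):
    """Share of flagged users that really were banned later."""
    return tp / (tp + fp) if (tp + fp) else 0.0


# Invented data: a "troll score" per user and whether that user was later banned.
SCORES = [0.95, 0.90, 0.80, 0.70, 0.60, 0.40, 0.30, 0.20, 0.10, 0.05]
LABELS = [True, True, True, False, True, True, False, False, False, False]

for threshold in (0.5, 0.85):
    counts = confusion_counts(SCORES, LABELS, threshold)
    print(threshold, counts, accuracy(*counts), precision(*counts))
```

With this made-up data, a threshold of 0.5 flags five users, one of them wrongly. A threshold of 0.85 flags only two users, both correctly, but three future trolls slip through, so overall accuracy falls.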