UNDER THE HOOD

A glimpse into Facebook’s notoriously opaque—and potentially vulnerable—Trending algorithm

With its human editors out the door, Facebook is letting the inmates run the asylum.

This week, a story about a man violently engaging a McChicken in intercourse lived as a topic on Facebook’s Trending bar for more than a day—despite the fact that an extraordinarily small percentage of Facebook’s US user base was talking about it, according to Trending’s own metrics. A false news story about Fox News anchor Megyn Kelly also found its way to the Trending section of Facebook’s homepage and enjoyed a similar lifespan to the McChicken story on the site.

Facebook, a service with 1.7 billion users per month, insists at every opportunity that it’s not in the business of news. Despite that assertion, it built a module that interprets the relative popularity and impact of stories, much like a modern news editor, and displays trending stories prominently on its homepage. With 63% of Facebook users getting news from the social network, Trending topics has become a powerful way to surface what the Facebook community deems important. But without an active curation team to weed out hate speech and general noise, the section is susceptible to the internet’s most basic problems and attacks. The inmates, Facebook users, have unprecedented control over the news that others in their region see.

Facebook hasn’t been forthcoming about how its algorithm works; the company declined to answer any questions for this article. It’s exceedingly vague in blog posts about the algorithm’s methodology, and the software is absent in every engineering blog. But company patent filings, along with general information Facebook has shared publicly and with Quartz in the past week, and interviews with previous Facebook curators, give us a glimpse into how the Trending algorithm works.

Save for a team of engineers whose job is to spike mundane stories like #lunch, Trending is now run entirely by an algorithm. The humans who are still working on Trending are bound by a rulebook that requires them to let every algorithmically surfaced topic reach the public eye, so long as it corresponds to a real event and isn’t a duplicate topic, according to Facebook’s newly updated rules. “The review team is responsible for accepting all algorithmically detected topics that reflect real-world events,” says the guide.

The team is also obliged by the same rules to believe that every topic it sees is true, unless proven otherwise by reviewers corroborating with other sources from Facebook’s internal review tool—rather than the traditional journalistic approach of verifying through external sources. The guide has no mention of hate speech, though overtly sexual content has its own specific internal “topic” in Trending, like Business or Politics:

“Topics related to sex, pornography, nudity, graphic violence, etc. Stories that can be perceived as R-rated or worse should be tagged risqué (a.k.a. Chris Hemsworth’s daughter discussing his penis).”

An excerpt from the Facebook Trending Reviewer’s guide. (Facebook)

But before a story passes through this sieve, it must first be surfaced by the Trending algorithm. Amid the controversy in May that Facebook was burying conservative news, first reported by Gizmodo, vice president of global operations Justin Osofsky said the algorithm “identifies topics that have recently spiked in popularity on Facebook,” which he explained as “a high volume of mentions and a sharp increase in mentions over a short period of time.” Back then, Trending also looked at a list of 1,000 news sites to verify these stories, but that practice was discontinued after the backlash from Gizmodo’s story.

However, “mentions” isn’t a defined metric on Facebook. A patent for trend detection awarded to the company on July 5, 2016, uses the term to describe plain or hashtagged text that could be considered trending. Additional information supplied to Quartz by Facebook this week suggests shares and likes are also weighed within the algorithm, which means those probably also fall under the blanket definition of a mention. But the patent notes that not every mention is created equal:

“Some users are voracious in interactivity, and because of the large volume of content that they generate, their interactivity may elicit less interaction from other users, and the online enterprise may put on less weight on instances of their interactivity”

Based on that description, Facebook seems to devalue content from people who post and share frequently on the social network: the algorithm gives their shares less weight.
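To make the idea concrete, here is a minimal sketch of how down-weighting prolific users could work. The patent excerpt gives no formula, so the logarithmic curve, the function names, and the posts-per-week input are all assumptions made for illustration, not Facebook’s actual method.

```python
import math
from typing import Dict, List


def mention_weight(posts_last_week: int) -> float:
    """Hypothetical weight for one mention: the more a user posts, the less it counts.

    The logarithmic damping is an assumption; the patent only says such
    users' activity may be given "less weight".
    """
    return 1.0 / (1.0 + math.log1p(posts_last_week))


def weighted_mentions(posts_by_user: Dict[str, int], mentioning_users: List[str]) -> float:
    """Sum the per-user weights of everyone who mentioned a topic."""
    return sum(mention_weight(posts_by_user.get(user, 0)) for user in mentioning_users)


# Hypothetical activity levels: one occasional poster, one very frequent poster.
activity = {"occasional_user": 2, "voracious_user": 400}
total = weighted_mentions(activity, ["occasional_user", "voracious_user"])
```

Under these assumptions, a mention from the occasional poster contributes more to the total than one from the user who posts hundreds of times a week.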

As the algorithm processes mentions and finds matching keywords, like NBA Finals or Super Bowl, it begins to sort them into a topic, according to the patent. The topic is then given a Trending score based on the geographic location of users posting about it, the timeframe during which those posts were made, and the weighted number of mentions. The most important part of this score seems to be timing: the more mentions over a shorter time period, the higher the topic’s Trending score. After the Trending score is assigned internally and the topic is ostensibly approved by a reviewer, the Trending topics that are displayed are personalized for each user. Geographic location is a big factor, but the algorithm described in the patent also weighs “gender, race, national origin, religion, age, marital status, disability, sexual orientation, education level, and socioeconomic status,” as well as previous interactions with other content.
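The scoring step described in the patent can be sketched roughly as follows. This is an illustration only: the keyword table, the one-hour window, and the scoring formula (recent weighted volume multiplied by its growth over earlier volume) are assumptions, since neither the patent excerpts quoted here nor Facebook’s public statements spell out the actual calculation. Personalization is left out.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass
class Mention:
    text: str
    region: str
    minutes_ago: float
    weight: float = 1.0  # e.g. lower for very frequent posters


# Hypothetical keyword-to-topic table, using the article's two examples.
TOPIC_KEYWORDS = {"nba finals": "NBA Finals", "super bowl": "Super Bowl"}


def topic_of(mention: Mention) -> Optional[str]:
    """Return the topic whose keyword appears in the mention text, if any."""
    lowered = mention.text.lower()
    for keyword, topic in TOPIC_KEYWORDS.items():
        if keyword in lowered:
            return topic
    return None


def trending_scores(
    mentions: List[Mention], window_minutes: float = 60.0
) -> Dict[Tuple[str, str], float]:
    """Score each (topic, region) pair; a sharp recent spike scores highest."""
    recent: Dict[Tuple[str, str], float] = defaultdict(float)
    earlier: Dict[Tuple[str, str], float] = defaultdict(float)
    for m in mentions:
        topic = topic_of(m)
        if topic is None:
            continue
        bucket = recent if m.minutes_ago <= window_minutes else earlier
        bucket[(topic, m.region)] += m.weight
    # Assumed formula: recent volume times its ratio to the earlier baseline.
    return {
        key: volume * (volume / (1.0 + earlier.get(key, 0.0)))
        for key, volume in recent.items()
    }


sample = (
    [Mention("Super Bowl ad was wild", "US", 5.0) for _ in range(3)]
    + [Mention("NBA Finals tonight", "US", 10.0) for _ in range(3)]
    + [Mention("NBA Finals preview", "US", 300.0) for _ in range(8)]
)
scores = trending_scores(sample)
```

In this sample both topics have the same number of recent mentions, but the Super Bowl topic scores higher because its mentions arrived as a sudden spike rather than on top of steady earlier chatter.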

Of course, this is only a patent and doesn’t necessarily correspond to what Facebook is currently using. However, the description in the patent matches extremely well with the information Facebook has publicly released.

One thing the patent does not discuss at length is filtering mechanisms. According to the patent, spam—defined as “unsolicited bulk messages, especially advertisements, sent indiscriminately and/or repeatedly”—is the only thing the system specifically filters out. While this could protect against single spammers, it seems a large group of coordinated Facebook users could force a particular topic of any kind—including hate speech—to trend, with no recourse available from the review team. Facebook could have a mechanism to prevent this, but it makes no mention of such in its patent (and declined to respond to questions from Quartz on the subject). Other social media sites, like Twitter, say they filter trending topics for quality, but are equally vague in their explanation of how that process works.

According to a person who worked on the Trending team, the editorial options on Facebook’s internal tool changed after Gizmodo reported on its biases. “We used to be able to inject topics,” said the former contractor, who agreed to speak to Quartz on the condition of anonymity. When the algorithm failed to pick up a major news story—what Facebook defined as “false negatives”—its curators would be able to manually insert it, this person said. “After Gizmodo, we couldn’t do that anymore.” Facebook Trending editors were also told to use Facebook Live video as the main video source on a curated topic wherever possible, even if other sources were available.

For those who cared to look, the writing was on the wall. More of the editors’ and curators’ jobs had become automated in recent months, other former Trending team members told Quartz and other outlets. But the problem was that the system wasn’t necessarily up to snuff. “The algorithm wasn’t very good,” a former editor told Quartz. “It was not on par with what a human could curate.” Another source confirmed to Quartz that in an eight-hour shift each curator would remove hundreds of stories, approving 20-25 in the same time period.

With human control gone and an untested algorithm at the helm, Facebook’s Trending module is likely to face additional bumps in the road, and possibly even disasters, in the near future. Facebook has promised faster takedowns of false news stories, but its solution is reactive. Meanwhile, the alt-right movement is growing in popularity, and legions of internet trolls with too much time on their hands and not enough conscience have shown themselves to be motivated mostly by who can get the largest rise out of the collective internet. Imagine what they could do with access to millions of users’ timelines. Facebook’s algorithmic reliance makes it easy to shrug off claims of bias, especially in the wake of Gizmodo’s investigation. Of course, Facebook has the right to make that decision for its platform. Regardless, it looks like users will be able to have their say, too.
