Today, Facebook announced Deep Text, an AI engine it’s building to understand the meaning and sentiment behind all of the text posted by users to Facebook. In a blog post, Facebook said that it was building the system to help it surface content that people may be interested in, and weed out spam.
This might sound like a minor improvement, but it actually has the potential—in theory—to transform the social network most of us use every day into something else we use daily: a powerful search engine.
“We want Deep Text to be used in categorizing content within Facebook to facilitate searching for it and also surfacing the right content to users,” Hussein Mehanna, an engineering director on Facebook’s machine learning team, told Quartz.
The universe of that search may not be the whole worldwide web that Google crawls, but it’s still massive. There are over a billion people who check Facebook every single day, and the network has trillions of status updates, event invitations, photo albums, and videos on its servers. Facebook is sitting on an ever-growing mountain of information that it can use more effectively to connect people with similar interests, sell more ads, and help people find things they’re looking for.
Facebook already uses demographic information shared by users (whether directly or through their interactions with brands on the site), but right now, the majority of the text-based information Facebook has on its servers is unstructured, meaning Facebook doesn’t know users’ intent in posting, or even what users meant. Deep Text will help categorize and provide meaning for all that text, and could turn all that unstructured data into information it can use—and users can search. “If we can understand text, we can help people connect and share in a lot of different ways,” Mehanna said.
With this new project, Facebook is essentially building the capacity to track all the information put into the network, just as Google crawls the entire web for information and indexes it. What that means for users is that useful information among those trillions of posts might be a little easier to find—just as Google has started to use artificial intelligence to understand the questions we ask of it, and surface the right sort of information for us in its search results.
Facebook upgraded its search function last year to include a range of results, so that searching for “taco,” for example, will land you results that include your friends’ pictures of tacos, local taco joints based on your geography, and international breaking news stories involving tacos. (Note: I wrote this while hungry.)
But Deep Text could, in theory, take that search a step further, by figuring out what those friends and brands are actually trying to say when they post about tacos, and then serving you the most useful results. If all you want to know is where to get a good taco, Deep Text could perhaps analyze the sentiment of your friends’ taco-related posts and use that to offer you a restaurant recommendation. If you’re looking for information about the many health benefits of tacos, it could serve you news articles on the latest in taco science.
Based on neural networks, Deep Text is unlike other systems designed to understand written language. Facebook says it can understand the meaning of thousands of posts per second, in 20 languages, “with near-human accuracy.” The system tries to understand the semantic relationships and similarities between words, meaning it realizes that “brother” and “bro” are often used in similar situations. The way that Deep Text has been trained on data means that it can also understand similarities across languages, so that it sees little difference between “happy birthday” and “feliz cumpleaños.”
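To make the idea of "similar words" concrete, here is a minimal sketch of how a system can score word similarity once each word is represented as a vector of numbers. This is not Facebook's code: the three-dimensional vectors below are invented for illustration, and the names `TOY_VECTORS`, `cosine_similarity`, and `similarity` are hypothetical. A real system would learn vectors with hundreds of dimensions from large amounts of text.

```python
import math

# Hypothetical word vectors, made up for this example only.
# In a trained model, words used in similar contexts end up with similar vectors.
TOY_VECTORS = {
    "brother": [0.9, 0.1, 0.0],
    "bro": [0.8, 0.2, 0.1],
    "taxi": [0.0, 0.1, 0.9],
}


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def similarity(word_a, word_b):
    """Look up both words in the toy table and compare their vectors."""
    return cosine_similarity(TOY_VECTORS[word_a], TOY_VECTORS[word_b])


if __name__ == "__main__":
    print(similarity("brother", "bro"))
    print(similarity("brother", "taxi"))
```

With these made-up numbers, "brother" and "bro" score much closer to 1 than "brother" and "taxi" do, which is the kind of signal a trained model could use to treat the first pair as near-synonyms.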
Deep Text is already powering some aspects of Facebook, the company says. For example, some chat bots on Facebook Messenger can now understand if someone might need a taxi based on what they say. If someone texts, “I need a ride” to someone else, for example, a bot could interject to ask whether it should call them a taxi. Whereas if the message says, “I just took a taxi there,” the bot would know that no taxi is needed. Mehanna wouldn’t confirm whether Facebook was using Deep Text for its AI-based virtual assistant, M, but did say that Deep Text “is being rolled out slowly and more broadly across Messenger.”
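The taxi example shows why keyword matching alone falls short: both messages mention a ride, but only one is a request. The sketch below is a deliberately crude, rule-based stand-in, not the neural approach Facebook describes. The word lists and the `needs_ride` function are assumptions chosen to fit the two example messages, and they would fail on many real ones.

```python
import re

# Hypothetical cue lists, chosen by hand for this illustration.
RIDE_WORDS = {"ride", "taxi", "cab"}
REQUEST_CUES = {"need", "want", "call", "get"}
PAST_CUES = {"took", "had", "already"}


def needs_ride(message):
    """Guess whether a message is asking for a ride rather than describing one."""
    words = set(re.findall(r"[a-z']+", message.lower()))
    mentions_ride = bool(words & RIDE_WORDS)
    is_request = bool(words & REQUEST_CUES)
    is_past = bool(words & PAST_CUES)
    # Only flag messages that mention a ride, sound like a request,
    # and do not describe something that already happened.
    return mentions_ride and is_request and not is_past


if __name__ == "__main__":
    print(needs_ride("I need a ride"))
    print(needs_ride("I just took a taxi there"))
```

Hand-written rules like these break as soon as people phrase things differently, which is the gap a system trained on real conversations is meant to close.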
Deep Text will also appear in other parts of Facebook. For example, if someone posts a message saying they’re selling something, Facebook’s commerce team will have all the information—the item, the price, the location, and so on—pulled from the post automatically, allowing the team to advertise services to help sell the item more easily.
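As a rough picture of what "pulling the item, the price, and the location from a post" involves, here is a small sketch that handles one rigid phrasing with a regular expression. It is an assumption-laden toy, not Facebook's method: the pattern, the `extract_listing` function, and the sample post are all hypothetical, and a learned model would be needed to cope with the variety of real posts.

```python
import re

# Hypothetical pattern for posts shaped like
# "Selling my <item> for $<price> in <location>".
LISTING_PATTERN = re.compile(
    r"selling\s+(?:my\s+|a\s+|an\s+)?(?P<item>.+?)\s+for\s+\$(?P<price>\d+(?:\.\d{2})?)"
    r"(?:\s+in\s+(?P<location>[A-Za-z ]+))?",
    re.IGNORECASE,
)


def extract_listing(post):
    """Return the item, price, and location found in a post, or None if no match."""
    match = LISTING_PATTERN.search(post)
    if match is None:
        return None
    location = match.group("location")
    return {
        "item": match.group("item"),
        "price": float(match.group("price")),
        "location": location.strip() if location else None,
    }


if __name__ == "__main__":
    print(extract_listing("Selling my bike for $150 in Pittsburgh"))
    print(extract_listing("I just took a taxi there"))
```

The structured result is what makes the post searchable by item, price, or place; the hard part, and the part the article attributes to Deep Text, is producing it reliably from free-form writing.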
In its blog post, Facebook said that it plans to use the millions of Facebook pages users have created to build up more training data for Deep Text. For example, the team will be using the Pittsburgh Steelers’ page to learn more about how people talk about American football and the Steelers. All of this data will help the team build up an AI system that understands the way that humans talk online, and the connection between words and sentences.
With all this information, Facebook will be able to tag and categorize everything posted on the site to make it easier to find things. Why leave Facebook to search for something on Craigslist if Facebook can tell you your friend, or a friend of a friend, is selling what you’re after? Why take a cab to a concert if Facebook can tell you someone you know is already planning to drive there?
After a recent string of strong earnings results, and given CEO Mark Zuckerberg’s appetite for moonshots like connecting the entire world via internet-beaming drones and curing every disease, many have been saying that Facebook is the new Google.
Facebook’s improved ability to parse all the information fed into its data centers every day can only help keep users on Facebook for their search needs, rather than losing them to Google. But it’s also worth considering: As Facebook gets better at offering us personalized search results from our networks, as useful as those might be, it also keeps us in a more insular version of the web, shaped by our own geography, demographics, affinities, and beliefs. Google also does this, to an extent, but at least it searches the entire web first, not just each of our own echo chambers.