For more on new technology that can read human emotions, check out the third episode of Should This Exist?, the podcast that debates how emerging technologies will impact humanity.
If we were sitting across a table from each other at a cafe and I asked about your day, you might answer politely, with something like, “Fine.” But if you were lying, I’d know from your expression, tone, twitches, and tics, because most of our communication isn’t linguistic.
We read subtext, the unspoken clues, to get at the truth, cutting through what people say to understand what they mean. And now that so many of our exchanges take place in text online, much of the meaning traditionally delivered via subtext is lost, and our messages tell us less than ever before.
Rana el Kaliouby, the co-founder of Affectiva, a company that teaches machines sentiment analysis, wants to improve the tools we use and make exchanges great again. “A lot of today’s communication—93%—is essentially lost in cyberspace,” she tells Quartz. “We’re emotion-blind and that’s why we’re seeing less compassion in the world.” The solution, in her view, isn’t to stop using technology that strips us of our humanity, but instead to design tools that truly understand humans.
El Kaliouby’s company creates tools to navigate the space between language and meaning.
Technically speaking, she and her colleagues are compiling a database of the world’s facial expressions to get the big picture of human communication. So far, they’ve collected 7.7 million faces in 87 countries, and 5 billion facial frames total. The idea is that if machines can read subtext, then in certain contexts, they will better serve our needs.
Take, for example, online learning. Imagine that you’re taking a class and getting lost. In theory, your frown, wandering gaze, and frustration would be conveyed to the computer through the camera, which would alert the system so the course could respond accordingly. Maybe it would offer you more examples, or easier problems. Maybe it would even change topics to prevent frustration, just as a live instructor in a classroom can switch activities or tactics depending on how students respond to the material.
El Kaliouby’s work has already been put to use in interesting ways. Automated sentiment analysis can help people with autism, who can struggle to interpret the emotional subtext in exchanges, by interpreting data from an interlocutor’s expressions and offering insight into what they convey. A device that is worn like glasses and resembles the now-defunct Google Glass can signal to users when they are overlooking important unspoken clues, so that they don’t rely solely on language to judge a situation.
El Kaliouby has used her own tool to gauge listener reception when doing webinars, too. Normally, when speaking to a group online, a presenter can’t tell whether anyone is paying attention. But with technological help, even an online lecturer can get a sense of audience engagement and deliver a message more effectively as a result, she says. With engagement levels, inferred from audience members’ expressions, displayed on her screen, she’s been able to give better presentations.
Advertisers have also used the tool to test audience responses to a potential campaign. Viewers watch an ad as Affectiva’s tech judges their expressions. By quantifying the viewers’ unspoken real-time responses, marketers get a better sense of their ad’s potential success.
Or, if a car were equipped with technology that follows a driver’s gaze and expressions, according to el Kaliouby, it could tell the driver when they’re not paying attention to the road. Cars could start to prevent accidents before they happen simply by being aware of the driver’s state of mind and alerting them when they are distracted or drowsy.
From el Kaliouby’s perspective, the possibilities for the technology are endless. The longer she works on it, and the more she reflects on all the unspoken information that’s contained in our exchanges, the more she also wonders about important conversations in her own life. How many times, she muses, did she take people at their word when—if she’d been more aware of just how little language conveys—she could have understood that what they said and what they meant were two different things?
Among the endless possibilities for affective tech are dangerous ones, too, of course. In the wrong hands, a tool that reads and interprets human emotions could be used to discriminate, manipulate, and profit from data about our sentiments.
El Kaliouby and her colleagues at Affectiva have vowed not to allow their tool to be used for security and surveillance purposes, for example. And their intentions have been tested: they’ve declined lucrative licensing deals on the basis of their principles, and she says that almost weekly Affectiva turns down investors interested in developing the technology for policing. She sees their argument that by providing the security industry with tools to better understand humanity, she could help make the world safer. But el Kaliouby worries that there’s too much potential for abuse, especially if the tech isn’t sufficiently nuanced, and it’s not yet.
The technologist isn’t keeping secrets. She wants us all to be aware of the dangers of her work. She believes we need to think about how these tools are being developed and used, and what that means for the future, because she’s confident that this is just the beginning of affective tech and that it will inevitably affect us all once systems like the one she’s working on are integrated into the many devices we use. Tech ethics are not just a conversation for people in development but for everyone who ends up using products without always understanding the implications.
A treasure trove of data
First and foremost, el Kaliouby argues, users have to understand and consent to giving their facial data if such tools are to be used. Companies must be transparent about whether they are collecting the information and for what purposes; information that’s now offered in fine print could be made much more explicit. In cars using Affectiva technology now, for example, the facial data isn’t recorded. But if it were, insurers could start to subpoena records of expressions to determine accident liability, and police could use them for investigations.
A lot can be done with data. Not all of it is good. Tech meant to improve communication could be used for sinister ends, just as Facebook, a company with a mission to connect the world, was used to manipulate elections.
As companies increasingly gather information not just about what we buy or read or talk about, but how we wrinkle our noses, what makes us smile, when we furrow our brows, we are increasingly vulnerable. Businesses may end up knowing us better than we know ourselves, and that is problematic.
Collecting the whole world’s faces
Another potential pitfall of sentiment analysis technology is that it can be reductive. To provide meaningful insight, a nuanced tool must “get to know” the whole range of faces that countless individuals make around the world. Algorithms trained on a limited data set are biased: they recognize only the kinds of faces they have been exposed to repeatedly, which can mean that machines generate inaccurate or unjust information. Training a machine to read all faces requires collecting a lot of data from many people across many cultures, and it means understanding the range of expressions in various places.
The faces we make are, to a degree, determined by culture. El Kaliouby and her colleagues have found that there are universal expressions, such as smiles and frowns, but that cultural influences amplify or mute certain tendencies. Their data shows, for example, that Latin Americans are more expressive than East Asians, and that around the world women generally smile more than men, el Kaliouby says.
But they still need a lot more information. She explains, “Progress is a function of how much data we can use and how diverse the data is. We want algorithms to be able to identify more expressions, emotions, genders, everything.” Until they can capture the whole range of human expression, there will always be limits to the tool’s powers of interpretation.
The holy grail
Then there’s what el Kaliouby calls “the holy grail” in her field: an algorithm that detects sarcasm.
Although it’s been accused of being the lowest form of wit, sarcasm, which uses tone to deliberately convey a contradictory message, is a very sophisticated type of messaging. Sarcasm is a tonal wink. And when a tool understands this layered mode of communication, along with an actual wink, it will be considered a triumph of machine learning. But how it will know or show its understanding isn’t clear to humans yet.
Affectiva has been integrating speech tonality for the last two years, and el Kaliouby is hesitant to guess how long it might take to reach the holy grail. But she says a tool that good, one that interprets both tone and expressions accurately across all cultures and all personality types, is still a long way off.
What el Kaliouby is certain of, however, is that we should be wary of this work. She does it with love and good intentions, but that doesn’t necessarily mean we should just trust her.
“I think you should be a little scared,” she advises. “Any technology has potential for good and for abuse.”
Correction: A previous version of this post stated that Kia cars are equipped with Affectiva tech. They are not. The tool was on display in a concept car designed by Kia, however.
Should This Exist? is a podcast, hosted by Caterina Fake, that debates how emerging technologies will impact humanity. For more in-depth conversation on evaluating the human side of technology, subscribe to Should This Exist? on Apple Podcasts.