Alexa is very confused by little kids

Kids love to play with Alexa. Voice-activated assistants are a bottomless source of entertainment, offering knock-knock jokes, bedtime stories, and animal facts on demand. There’s just one problem: Alexa and her smart-speaker ilk don’t always know how to come out to play.

By Zach Wener-Fligner and Annabelle Timsit

Up to around age five, children are still learning how to talk, a complicated process that involves developing the skills they need for clear pronunciation, learning the linguistic patterns of their first language, and expanding their vocabulary. That’s why, when young kids try to talk to adults, they often have what are known as “communication breakdowns”: The adult isn’t sure what the kid is trying to say, or mistakes one word for another. But because kids are persistent, they usually try again and again to get their point across, in what’s known as “communication repair.”

Smart speakers may falter in the face of such communication breakdowns, according to a new study by a team of researchers from the University of Washington. The study was published in the proceedings of the 17th ACM Conference on Interaction Design and Children, which took place in June 2018 in Norway. Its findings fit a larger pattern with voice assistants, including Amazon’s Alexa, Google Home, and Apple’s Siri: While all kinds of people are using them, smart speakers sometimes run into trouble with interpretation.

How do smart speakers respond when young kids talk to them?

Smart speakers tend to work best for a certain profile of consumer, as the Washington Post reported in a wide-ranging investigation of smart speakers’ bias problem. That consumer, broadly speaking, is white, well-educated, and a native English speaker from the West Coast—in other words, someone quite similar to the majority-white engineers and designers who create the voice assistants. That leaves many people behind, including people with regional accents and Americans whose first language is not English.

It may also exclude young children—sometimes with disastrous results.

Consider the admittedly small and homogeneous study, which focused on 14 mostly white, middle-class families in the US with children between the ages of three and five. The study was conducted as part of a larger evaluation of a tablet game called Cookie Monster’s Challenge, which was created by Sesame Workshop and PBS Kids to support the development of self-regulation and executive function in young children.

The researchers gave the game to the families to play with over the course of two weeks. But what the team didn’t predict was a bug in their study design. Instead of working as it was supposed to, one voice-driven miniature game embedded within the larger app broke. When children tried to get the duck in the game to quack, the virtual assistant would respond, “I’m sorry, I didn’t quite get that.” The speakerphone recorded 107 audio samples of children attempting to engage with the voice-driven interface, yielding what the study calls “a data set of children’s attempts to repair an irreparable conversation.”

“This is the first study I’ve ever done by accident,” study co-author Alexis Hiniker tells Quartz. Her team decided to turn their study into an investigation of what happens when kids fail to get their point across to devices like Alexa. This allowed them to study parents’ and children’s reactions to encountering a broken user interface and their strategies for conversation recovery. The most popular strategy was repetition, which a majority of kids (79%) used. The second-most common strategy, which 63% of kids used, was adjusting the volume of their voice by speaking louder into the microphone. And the third-most common strategy was choosing different words or sentence structures in the hopes that the machine would better understand them.

All of the kids were persistent, rarely giving up on the interaction, asking for help, or showing frustration with the broken game. In fact, it was usually the parents—not the children—who decided to give up on the game.

The mishap exposed how voice-driven interfaces like Alexa could do more to support children through their communication failures. “There has to be more than ‘I’m sorry, I didn’t quite get that,’” Hiniker says in a press release.

The problem doesn’t just apply to kids; adults have trouble communicating all the time. Think of how many times you misspeak or stumble in a given conversation. Adults constantly correct these types of disfluencies in their speech, Hiniker says, in ways that are extremely sophisticated. For example, they might engage in what’s known as “other-initiated communication repair,” which basically means picking up on the verbal and nonverbal cues of the person listening to you, and updating what you’re saying in a way that gets your point across better. That’s a purely human social skill, “and that is not something that happens with any commercially-available voice-interface right now,” according to Hiniker.

When reached for comment, a spokesperson from Google notes, “We are continually working on ways in which the Assistant can better understand and communicate.” She points specifically to the company’s efforts to improve its algorithms for all users, including children, and to collect audio from kids through its third-party vendor and use the data to train speech models to better understand them. Amazon has not yet responded to a request for comment. But it did release an Echo Dot specifically aimed at kids earlier this year, which according to Mat Honan of BuzzFeed News is “more forgiving of the ways kids may speak to it — a less clearly pronounced ‘Awexa,’ for example, should still wake it up.”

How else can smart speakers be better at helping support kids’ early learning? The study says that “an interface that is backed by artificial intelligence, has a user-specific understanding of the child, engages in conversation, or provides access to a variety of functions and information sources” could be a good start. Writing about Alexa’s trouble understanding young kids for the MIT Technology Review, Rachel Metz says that researchers also suggest that “Alexa and similar ‘agents’ could be designed to tell you why they don’t understand what you’re asking or commanding, so you can better determine how to get what you want.”

When asked whether smart speakers could one day become active participants in children’s language development, Hiniker says she wouldn’t rule it out, but we’re far from that reality. “It’s hard for me to imagine that these kinds of devices are ever as sophisticated as the linguistic sophistication of what parents and other people in children’s lives provide,” she says. “But who knows, maybe that’s a failure of imagination.”

The risks of smart speakers

Some parents may not be so wild about the idea of smart speakers helping their kids learn to communicate. A common fear is that kids will use Alexa and other smart speakers as a substitute for real, human connection. Some parents argue that voice assistants are ushering in an age of casual rudeness. According to experts, they could pose a risk to children’s normal cognitive and emotional development because they change the ways in which kids consume information and build knowledge. The speakers fail to challenge kids in the type of back-and-forth, “serve-and-return” interactions that help them learn; instead, they just give them all the information they need, and sometimes more.

There are privacy concerns involved with some of these devices, too. That’s why, in 2017, Mattel announced that it was canceling plans to produce a child-specific smart speaker called Aristotle, which would have had near-constant access, both by microphone and video feed, to a child’s life. Observers also balked at the idea of a device virtually designed to replace parents’ roles in their kids’ development. In a letter addressed to Mattel in September 2017, senator Ed Markey of Massachusetts and House representative Joe Barton of Texas wrote, “It appears that never before has a device had the capability to so intimately look into the life of a child.”

Hiniker thinks focusing on those potential pitfalls is missing the point. Whether we want it to or not, the smart speaker revolution has arrived: According to a study conducted by NPR and Edison Research (pdf), one in six Americans 18 years old and up now owns a smart speaker, a figure that’s up 128% from January 2017. They’re even forecast to outnumber humans one day. So it makes sense to find ways to make Alexa and her peers benefit the millions of kids who interact with them every day. “I definitely understand some of those concerns,” she explains. “At the same time, kids are already using them, they’re just using a version that wasn’t designed with them in mind.”

Read more from our series on Rewiring Childhood. This reporting is part of a series supported by a grant from the Bernard van Leer Foundation. The author’s views are not necessarily those of the Bernard van Leer Foundation.
