At Google’s annual developer conference this past week, CEO Sundar Pichai played a clip of what seemed to be a mundane human interaction: Someone calling to make an appointment with their hairdresser. The two voices negotiated the date and time, with the “assistant” providing the client’s name for the reservation.
But it wasn’t two humans conversing. Instead, Pichai said, it was Google’s artificial intelligence making the call to the unwitting hair salon receptionist. The technology is called Google Duplex, a system combining natural language processing and speech generation that allegedly allows Google to accomplish customer service tasks in a limited set of situations. Right now, the company is only testing the technology internally, focusing on booking reservations at restaurants and hair salons, as well as inquiring about holiday hours for businesses.
The feature raises serious ethical and societal questions, like whether the search giant has the obligation to disclose that the human-like voice on the phone is actually a bot, or how to avoid barraging popular businesses with spammy calls. But Duplex also brings Google’s core mission of organizing the world’s information into the physical world in a whole new way. The basic principle is the same as an internet search: The algorithm requests some information and brings it back to a user. The only difference is whom Google is asking: a website or a person.
And in typical tech fashion, no data is wasted—the answers about public information like holiday hours can then be added to a business’s Google result.
Pursuing human-machine interaction like this solves a serious problem for Google. If the company’s mission is to catalog all possible information and make it easy to find, it’s inefficient to wait for humans to upload that knowledge to the internet. Google’s continuous effort to map the world with its 360-degree cameras is similar—if you want to know something about the physical world, go out there and figure it out yourself.
The technology shouldn’t come as a surprise to those following Google’s AI efforts closely. The company has regularly published research on new ways for machines to understand the messy and confusing ways humans speak, as well as work on how to generate convincing human voices.
Many observers noted the human-like “ums” Duplex inserted into its speech to cover for the time it needed to generate the appropriate responses. Alex Rudnicky, a professor at Carnegie Mellon and director of the university’s Speech Consortium, says that technique has been proposed before, but what Duplex did really well was imitate the little conversational cues that humans unconsciously use to signal that information is new or important. Gone was the stilted monotone we typically associate with virtual assistants.
“Just before a new piece of information was introduced, like a name or a time, there would be a pause or a ‘uh huh’ or point in time where the listener’s attention could be focused on what was happening next,” Rudnicky said.
This success comes with a new burden, which manifested in the backlash Google faced for not disclosing that the caller wasn’t human. Academics and industry leaders, such as Thomas Dietterich, the former president of the Association for the Advancement of Artificial Intelligence, responded to the demo by saying that it should be a person’s right to know whether they’re speaking to a human or a machine.
Rudnicky agrees that Google should warn people if they’re talking to a machine, even suggesting that it could be better for the search company in the long run. He cites the phenomenon that we talk more slowly and more clearly when we know we’re talking to a machine, which means better data for Google.
While the Duplex caller is still in the early stages of testing—and Google has said the final product will inform those it calls that they’re speaking with a bot—Pichai’s on-stage demonstration was a show of force.
In Duplex, Google has revealed the killer app for the voice assistant, something that finally goes beyond the main uses of playing music, setting a timer, and getting the weather without having to look at your phone. Saying “OK Google” carries much more weight when you’re directing it to act on your behalf. While problems like booking a restaurant are rather trivial, it’s not hard to imagine scheduling meetings with the assistant or waiting on the phone with a cable company to upgrade services. Instead of voicemail, maybe we’ll automatically reach the person’s assistant.
This was Google announcing that it was nearly ready to act not only as your digital assistant, but as your proxy in the real world. Now the question is even more stark: How much more of our lives are we willing to hand over to Google?