When Amazon debuted the Amazon Echo in 2014, there were decidedly mixed reactions to the black, cylindrical Bluetooth speaker that could pick up voice commands. Few understood why the e-commerce giant had suddenly released a $199 speaker that could talk to you.
Today we know that the Echo and other devices Amazon has since released are mere vessels for the real star of the show: Alexa. The voice assistant is available in 15 languages and 80 countries and boasts more than 100,000 “skills,” compared to about a dozen five years ago. It can wake up your cat, serve as an interpreter, deter a burglar, help you work out, and streamline your workflow.
The company said in January that more than 100 million Alexa-enabled devices had been sold, among them 150-plus products Amazon itself offers and some 85,000 third-party products ranging from TVs to microwaves to washing machines.
In the early days of Alexa, the voice assistant was prone to errors. Some critics argue this still hasn’t changed. The United Nations found that voice assistants like Alexa (and Apple’s Siri) were prone to gender bias. One young Dallas girl accidentally ordered a bunch of dollhouses. Similar “Alexa” fails began cropping up on social media. A scene in the Jordan Peele horror movie Us, in which an Alexa-like voice assistant mistakes a dying character’s plea to call the police for a request to play music from the band the Police, further immortalized the ineptitude of the nascent technology.
But Amazon claims that Alexa has improved and is on its way to becoming better. This week Rohit Prasad, the head scientist for Alexa’s artificial intelligence, identified four areas of future growth: wake word detection, or how quickly the system wakes up following a voice prompt; automatic speech recognition (ASR), or how quickly it can convert streamed audio into words; natural language understanding (NLU), or how quickly it can extract meaning from language; and text-to-speech synthesis.
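The four stages form a pipeline: audio only moves to the next step once the previous one succeeds. A minimal Python sketch of that flow is below; every function name and body here is a hypothetical stand-in (Amazon's actual Alexa internals are not public), meant only to show how the stages hand off to one another.

```python
# Hypothetical sketch of the four-stage voice-assistant pipeline Prasad
# describes. None of these names or implementations reflect Amazon's
# real systems; each function is a toy stand-in for a trained model.

def detect_wake_word(audio_frame: bytes) -> bool:
    """Stage 1: decide whether this audio frame contains the wake word."""
    return b"alexa" in audio_frame.lower()  # stand-in for an acoustic model

def recognize_speech(audio_frame: bytes) -> str:
    """Stage 2 (ASR): convert streamed audio into text."""
    # A real recognizer decodes the waveform; here we just pretend.
    return "play music by the police"

def understand(text: str) -> dict:
    """Stage 3 (NLU): extract an intent and its arguments from the text."""
    if text.startswith("play music by "):
        return {"intent": "PlayMusic",
                "artist": text.removeprefix("play music by ")}
    return {"intent": "Unknown"}

def synthesize_speech(reply: str) -> bytes:
    """Stage 4 (TTS): render the reply text back into audio."""
    return reply.encode()  # stand-in for a speech synthesizer

# One utterance through the whole pipeline:
frame = b"Alexa, play music by The Police"
if detect_wake_word(frame):
    text = recognize_speech(frame)
    result = understand(text)
    audio_reply = synthesize_speech(f"Playing {result.get('artist', 'music')}")
```

The Us movie scene in the paragraph above is, in these terms, a stage-3 failure: the ASR text was close enough, but the NLU step mapped "call the police" to the wrong intent.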
According to Prasad, efforts by the AI team to improve Alexa’s voice recognition have paid off, and error rates have declined:
In both wake word and ASR, we’ve seen fourfold reductions in recognition errors. In NLU, the error reduction has been threefold—even though the range of utterances that NLU processes, and the range of actions Alexa can take, have both increased dramatically. And in listener studies…we’ve seen an 80% reduction in the naturalness gap between Alexa’s speech and human speech.
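To make those multiples concrete, here is the arithmetic under assumed baseline numbers; Amazon has not published absolute error rates, so the starting figures below are purely illustrative.

```python
# Illustrative arithmetic only -- the 8% baseline is an assumption,
# not a published Amazon figure.
asr_error_before = 0.08              # assumed 8% recognition error rate

# A "fourfold reduction" divides the error rate by four:
asr_error_after = asr_error_before / 4        # 2%

# An "80% reduction in the naturalness gap" leaves one fifth of the
# original gap between Alexa's speech and human speech:
gap_before = 1.0                     # normalize the original gap to 1
gap_after = gap_before * (1 - 0.80)  # 0.2 of the original gap remains
```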
For Amazon, perfecting Alexa’s voice recognition is necessary as people begin to use it for an ever-increasing number of tasks. Currently, a quarter of the US adult population owns a smart speaker, according to a March report by Voicebot.ai and Voicify. Alexa appears to dominate the US market, beating out Google Home, at least for now. (Globally, Google Home beats Alexa, with about 32% market share.)
Amazon and third-party developers have managed to come up with an Alexa-enabled answer to nearly every aspect of the average person’s lifestyle. That makes the numerous privacy concerns over the technology all the more pressing. The scientific arm of Germany’s parliament released a report this summer concluding there’s evidence Alexa is listening when it shouldn’t be. Amazon faced backlash after Bloomberg reported that it had hired thousands of human reviewers to listen to Alexa recordings in order to perfect its voice-recognition technology.
None of this seems to be slowing Amazon down. At its hardware event in September, the company announced a bevy of new and upcoming Alexa-enabled products, including smart glasses, earbuds, an alarm clock, and a “smart ring” (the Echo Loop) that will let you call on Alexa whenever and wherever you desire. And just in time for the holidays, Amazon today released an Alexa-enabled Christmas tree that’s already sold out.
Given that few saw such products coming five years ago, it’s fair to say we won’t know what to expect with Alexa five years from now.