Despite speech increasingly becoming one of the main ways people interact with devices, voice technology remains largely closed off to Africa’s languages, accents, and speech patterns. Case in point: The world’s most popular voice assistants, Siri, Alexa and Google Assistant, still don’t support any African languages. The continent has more than 1,000 languages.
Common Voice, a crowdsourcing project started by the Mozilla Foundation in 2017, has been addressing this by inviting speakers of African languages to donate their voices to a free and publicly available dataset that researchers and developers can use to train voice-enabled apps, products, and services.
“The idea was to diversify voice technology and democratize the space through an open source initiative,” Chenai Chair, special advisor for Africa innovation at the Mozilla Foundation, tells Quartz.
Common Voice has so far recorded more than 9,000 hours of voice in 90 languages from around the world, including three from Africa: Luganda (Uganda), Kabyle (Algeria) and Kinyarwanda (Rwanda). This week it announced an expansion of the project to Swahili, an East African language spoken by an estimated 100 million people, with the help of a $3.4 million investment from four organizations.
By making it easy to donate voice data in Swahili, Common Voice will support East Africans who are playing a direct role in creating technology that helps their communities, Chair says.
Mozilla says one of the main goals of the project is to evaluate the possibility of developing voice recognition for the languages of underserved communities. The open-source nature of the data could allow local innovators to develop products and services for marginalized communities, the organization adds.
Already, the Kinyarwanda dataset, which has 1,800 hours of voice, is being used by a startup called Digital Umuganda to develop an AI chatbot, Mbaza, with speech-to-text and text-to-speech functionality that provides Covid-19 information in the language. The Kinyarwanda project by Common Voice is significant now because there’s a drive to digitize public services in the country, says Remy Muhire, community lead at the project.
Africa is not a major sales market for the tech companies behind popular voice assistants, such as Apple, Amazon and Google. Another challenge is that much of the voice data used to train machine-learning algorithms is held by a few big companies, making it difficult for others to develop high-quality speech recognition technologies. Research has shown that African languages are being left behind in voice recognition innovation as a result.
“Internationalization of all of our products and services is incredibly important,” a spokesperson for Amazon, the developer of Alexa, which supports at least eight languages, told Quartz. “It’s our vision for Alexa to be everywhere our customers are.” The spokesperson declined to comment on whether the company is doing any work to include African languages.
Apple, the developer of Siri, which also supports at least eight languages, didn’t respond to a request for comment. Google Assistant supports nearly 30 languages, but none that are African, and the company also declined to comment.
Chair doesn’t think it’s right for giant international companies to dictate the use of language in technology. With Common Voice, she says, technologists can get data and build models and technologies that work for their communities.
“The guardianship should be with actual people who speak these languages,” she says.