When internet connectivity in Africa is discussed, the conversation usually revolves around who has access and who doesn’t: the digital divide. But a new study says there is also a language divide, a gap that has been widening, particularly in voice recognition technology.
As voice-based interfaces like Amazon’s Alexa, Apple’s Siri, and Google Assistant become ubiquitous, more people are using smart speakers to shop, set reminders, and get answers to simple but essential questions, such as the weather forecast. Research suggests half of all searches will be voice-based by 2020, and this massive pivot toward voice commands is set to create an entire ecosystem of applications and interactions.
Yet research by the early-stage accelerator Digital Financial Services Lab and the research consultancy Caribou Digital shows developers are concentrating their efforts on improving English-language capabilities rather than on languages spoken in developing nations in Africa and Asia. African languages are already disadvantaged online: major platforms, including Twitter and Google AdSense, support none of them.
And even though companies have made enormous advances in natural language processing (NLP), the ability of computers to comprehend human speech and language, an emerging divide relegates low-income populations and less widely spoken languages to the background. This disparity will not only separate those who can use this form of artificial intelligence to communicate from those who can’t, but also hinder users’ ability to benefit from applications built for development interventions such as healthcare and finance.
One factor behind this skewed attention is profit: NLP providers concentrate on the languages most likely to make them money. By multiplying the number of speakers of each language by GDP per capita, the study’s authors found the top 100 languages cover approximately 96% of global GDP. Yet those 100 languages are spoken by less than 60% of the world’s population, highlighting “a fundamental tension between the commercial and social value of languages.”
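The study’s speakers-times-GDP calculation can be sketched in a few lines of Python. The figures below are hypothetical placeholders for illustration only, not the study’s actual data; the point is the shape of the result, where a few high-value languages dominate GDP share while covering far fewer speakers.

```python
# Illustrative sketch of the methodology: rank languages by a
# "commercial value" proxy (speakers x GDP per capita) and compare
# cumulative value share against cumulative speaker share.
# All numbers here are invented for illustration.

# language: (millions of speakers, average GDP per capita in USD)
languages = {
    "Language A": (400, 45_000),
    "Language B": (900, 9_000),
    "Language C": (100, 1_500),
    "Language D": (60, 800),
}

# Commercial value proxy: speakers * GDP per capita
value = {lang: s * 1e6 * gdp for lang, (s, gdp) in languages.items()}
ranked = sorted(value, key=value.get, reverse=True)

total_value = sum(value.values())
total_speakers = sum(s for s, _ in languages.values())

cum_value = cum_speakers = 0.0
for lang in ranked:
    speakers, _ = languages[lang]
    cum_value += value[lang]
    cum_speakers += speakers
    print(f"{lang}: {cum_value / total_value:.0%} of value, "
          f"{cum_speakers / total_speakers:.0%} of speakers")
```

Even with these toy numbers, the two highest-value languages capture most of the value proxy while leaving a large share of speakers uncovered, which is the tension the study describes.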
African languages also lack the data needed to train machine-learning systems, leaving researchers to start from scratch, a strong disincentive. Even though a language like Swahili is spoken by almost 100 million people, there is no well-known repository of recordings to feed speech recognition software. The study notes that building an accurate speech recognition service requires approximately 100,000 hours of recorded speech.
And while the rapid growth of messaging apps has increased the amount of text available to researchers, much of it is generated in “dark social” apps like WhatsApp and Facebook Messenger, where shared content cannot be measured or collected as a dataset for building more sophisticated systems.
The multilingual nature of modern Africans also poses a challenge: most developers’ language models cannot process mixed-language input. Code-switching, speaking or writing in two or three languages within a single post or conversation, is routine in many countries. Examples include Kenya’s Sheng, which blends English and Swahili; the mixing of French with Arabic or Berber in Algeria; and the romanization of Amharic.
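A toy example shows why code-switched text defeats monolingual tools. The sketch below tags each token of a mixed English-Swahili sentence against tiny, hypothetical word lists (real systems need far richer models and data); a purely English model would treat the Swahili tokens as out-of-vocabulary noise.

```python
# Naive word-level language tagging on a code-switched sentence.
# The word lists are tiny illustrative samples, not real lexicons.

english = {"i", "will", "call", "you", "tomorrow"}
swahili = {"sawa", "kesho", "nitakupigia", "simu"}

def tag(token):
    """Tag one token as English, Swahili, or unknown."""
    t = token.lower().strip(".,!?")
    if t in english:
        return "en"
    if t in swahili:
        return "sw"
    return "unknown"

sentence = "Sawa, I will call you kesho"
tags = [(word, tag(word)) for word in sentence.split()]
print(tags)
```

A single sentence here requires two lexicons at once, which is exactly the mixed-language processing most commercial NLP pipelines do not yet do.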
These shortcomings, however, have not stopped start-ups from innovating around voice services for African customers. Farm.ink has developed chatbots that allow farmers in Kenya to receive information based on user-generated posts. The fintech company Teller is also integrating financial services into the messaging experience for customers in Madagascar and West Africa.
These examples are still nascent and remain biased toward English and French, respectively. But they illustrate how voice technology could fundamentally alter the way Africans access the internet.