Do machines understand Africa? This is the question Meta started answering this month when it published a research paper (pdf) that details plans to improve the accuracy with which AI algorithms decode African languages.
The plan, which is expected to cover 55 of Africa’s marginalized languages and improve how AI systems translate them on Facebook, Instagram, and Wikipedia, could boost technological inclusion in the creation and adoption of tech solutions for Africa.
Chief executive Mark Zuckerberg wrote on his Facebook profile that his company will use a supercomputer to power the translations through advanced natural language processing (NLP) capabilities.
“We also work with professional translators to do human evaluation too, meaning people who speak the languages natively evaluate what the AI produced. The reality is that a handful of languages dominate the web, so only a fraction of the world can access content and contribute to the web in their own language. We want to change this,” he explains.
Meta: No language left behind
Currently, about 4 billion people are locked out of internet services because their languages are marginalized and they do not speak one of the few languages in which content is available. Sub-Saharan Africa accounts for 13.5% of the global population but less than 1% of global research output, largely due to language barriers.
“AI models require lots and lots of data to help them learn, and there’s not a lot of human translated training data for these languages. For example, there’s more than 20 million people who speak and write in Luganda but examples of this written language are extremely difficult to find on the internet,” Meta says in the paper.
The company hopes human translators will help it develop a reliable benchmark that can automatically assess translation quality for many low-resource and marginalized languages.
The open-sourced AI model will also translate 145 more languages across the world which aren’t supported by current translation systems.
“We call this project No Language Left Behind. To give a sense of the scale, the 200-language model has over 50 billion parameters, and we trained it using our new Research SuperCluster, which is one of the world’s fastest AI supercomputers,” Zuckerberg says.
The model will enable more than 25 billion translations every day across Meta’s apps, and will help Meta’s AI systems show the most interesting content on social media in local languages and recommend more relevant ads.
Breakthrough for social media platforms
This could also be a breakthrough in areas where social media platforms such as TikTok, YouTube, Facebook, and WhatsApp have struggled in Africa, such as taming political hate speech, bullying, body shaming, disinformation, human trafficking, sexual exploitation, and fake news, because videos, text, and audio are published in vernacular languages.
The lack of high-quality translation tools for hundreds of languages in Africa means millions of people today can’t access digital content or participate fully in online conversations or digital marketplaces in their preferred native languages.
“Africa is a continent with very high linguistic diversity, and language barriers exist day to day. In the future, imagine visiting your favorite Facebook group, coming across a post in Igbo or Luganda, and being able to understand it in your own language with just a click of a button,” says Balkissa Ide Siddo, public policy director for Africa at Meta.
In May this year, Google added 10 new African languages to its Google Translate tool, even as Africans keep on fixing Wikipedia’s language problem.