Google is all set to mine India’s vernacular languages.
On Aug. 28, the Mountain View, California-based internet giant launched Navlekha, a platform to help Indian language publishers get on the web. It will use artificial intelligence to render any PDF containing Indian language content into editable text. This will make it easy for “print publishers to create mobile-friendly web content,” the company said.
India has 22 official languages, but content on the internet is predominantly in English. According to Google’s estimates, 90% of the country’s 135,000 registered publications don’t even have a website. For technology companies constantly looking to onboard more users from the hinterlands, making regional content available is crucial.
“The majority of the Internet users today are Indian language users, a number expected to reach 500 million plus in the next two years. 95% of video consumption is in vernacular languages,” Rajan Anandan, Google’s vice-president of India and Southeast Asia, said. He was launching Navlekha in New Delhi at the company’s annual flagship conference, Google for India, where it outlines its plans for the year.
Navlekha has already begun onboarding Hindi publishers from Delhi.
“Google has already penetrated urban India and has great adoption among India’s English-speaking population,” said Kartik Hosanagar, a professor of technology and digital business at the University of Pennsylvania’s Wharton School. “The recent announcement is a natural move to increase its user base in India. At the same time, it’s only one of a series Google will need if it hopes to penetrate the hundreds of millions of Indians.”
Google has been pushing vernacular languages in India for some time now. In 2016, for instance, it launched Tap to Translate, a feature that lets users translate an image or text into their native language.
On Aug. 28, it also added a feature to Google Go, its search app, that lets users listen to web pages in 28 Indian languages.