The linguistic divide is bigger than the digital divide, yet digital infrastructure is what can bridge it at scale. India is a country with around 425 different languages and dialects. It also has a huge base of internet subscribers, and the lack of content in local languages significantly degrades their experience. Against this backdrop, Google recently announced the addition of eight Indian languages, including Sanskrit, to Google Translate, with a view to increasing the number of regional languages supported by its online multilingual translation service.
Apart from Sanskrit, the other Indian languages in the latest iteration of Google Translate are Assamese, Bhojpuri, Dogri, Konkani, Maithili, Meiteilon (Manipuri) and Mizo.
The announcement came during Google I/O, the company's annual conference, which began late on Wednesday night (11th May 2022).
The latest update does not cover all 22 scheduled languages of India, but the company says it is working on significantly closing the gap. For now, the newly added languages are supported only in the text translation feature; the company says it will work on rolling out voice to text, camera mode, and other features as well.
The eight Indian languages are part of a larger update in which Google is adding 24 languages to Google Translate, which now supports a total of 133 languages used around the globe.
In terms of technology, these are the first languages to be added using zero-shot machine translation, where a machine learning model only sees monolingual text, meaning it learns to translate into another language without ever seeing an example translation. Google says that this technology is not perfect, but the company is working towards improving it.
List of the languages added
Did you know that more than 300 million people use the newly added languages? For example, Mizo is spoken by about 800,000 people in Northeast India, and Lingala is spoken by more than 45 million people across Central Africa.
This is a complete list of the new languages now available in Google Translate:
Assamese: Used by about 25 million people in Northeast India
Aymara: Used by about two million people in Bolivia, Chile and Peru
Bambara: Used by about 14 million people in Mali
Bhojpuri: Used by about 50 million people in northern India, Nepal and Fiji
Dhivehi: Used by about 300,000 people in the Maldives
Dogri: Used by about three million people in northern India
Ewe: Used by about seven million people in Ghana and Togo
Guarani: Used by about seven million people in Paraguay, Bolivia, Argentina and Brazil
Ilocano: Used by about 10 million people in the northern Philippines
Konkani: Used by about two million people in Central India
Krio: Used by about four million people in Sierra Leone
Kurdish (Sorani): Used by about eight million people, mostly in Iraq
Lingala: Used by about 45 million people in the Democratic Republic of the Congo, Republic of the Congo, Central African Republic, Angola and the Republic of South Sudan
Luganda: Used by about 20 million people in Uganda and Rwanda
Maithili: Used by about 34 million people in northern India
Meiteilon (Manipuri): Used by about two million people in Northeast India
Mizo: Used by about 830,000 people in Northeast India
Oromo: Used by about 37 million people in Ethiopia and Kenya
Quechua: Used by about 10 million people in Peru, Bolivia, Ecuador and surrounding countries
Sanskrit: Used by about 20,000 people in India
Sepedi: Used by about 14 million people in South Africa
Tigrinya: Used by about eight million people in Eritrea and Ethiopia
Tsonga: Used by about seven million people in Eswatini, Mozambique, South Africa and Zimbabwe
Twi: Used by about 11 million people in Ghana
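As a quick sanity check on the "more than 300 million" figure, the speaker counts quoted in the list above can simply be added up. The short Python sketch below does that. The dictionary name SPEAKERS and the function total_speakers are illustrative names, not taken from Google's announcement. The numbers are the approximate figures from the list, and the sum assumes no overlap between speaker groups, so it is only a rough estimate.

```python
# Approximate speaker counts, copied from the list above.
# Assumption: the groups do not overlap, so a plain sum is a rough estimate.
SPEAKERS = {
    "Assamese": 25_000_000,
    "Aymara": 2_000_000,
    "Bambara": 14_000_000,
    "Bhojpuri": 50_000_000,
    "Dhivehi": 300_000,
    "Dogri": 3_000_000,
    "Ewe": 7_000_000,
    "Guarani": 7_000_000,
    "Ilocano": 10_000_000,
    "Konkani": 2_000_000,
    "Krio": 4_000_000,
    "Kurdish (Sorani)": 8_000_000,
    "Lingala": 45_000_000,
    "Luganda": 20_000_000,
    "Maithili": 34_000_000,
    "Meiteilon (Manipuri)": 2_000_000,
    "Mizo": 830_000,
    "Oromo": 37_000_000,
    "Quechua": 10_000_000,
    "Sanskrit": 20_000,
    "Sepedi": 14_000_000,
    "Tigrinya": 8_000_000,
    "Tsonga": 7_000_000,
    "Twi": 11_000_000,
}


def total_speakers(counts: dict[str, int]) -> int:
    """Add up the per-language speaker estimates."""
    return sum(counts.values())


total = total_speakers(SPEAKERS)
print(f"{len(SPEAKERS)} languages, about {total / 1_000_000:.0f} million speakers")
```

With the figures as listed, the 24 languages add up to roughly 321 million speakers, which is consistent with the "more than 300 million" claim.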