The Indian arm of Google DeepMind is developing AI technology that can understand and respond to 125 Indian languages and dialects. Here is a report on the project.
DeepMind, an artificial intelligence research company, was founded in 2010 and acquired by Google in 2014. It became widely known for AlphaGo, its AI program that defeated top professional players at the board game Go.
Google has been working steadily on AI technology that understands different languages. To that end, Google DeepMind has launched a new project in collaboration with the Indian Institute of Science and the Artificial Intelligence & Robotics Technology Park (ARTPARK).
The Vani project was launched in December 2022 with the goal of collecting and transcribing 1,54,000 hours of speech data from 773 districts in India.
The aim of the project is to develop an AI model that understands and represents 125 Indian languages and dialects.
India officially recognizes 22 languages. However, many other languages that lack official recognition are spoken across the country. Sixty of these languages are spoken by over 100 crore (1 billion) people in India.
Many of the languages spoken in India are poorly represented online. For example, Hindi, spoken by nearly 10 percent of the world’s population, accounts for only 0.1 percent of Internet content.
Manish Gupta, director of Google DeepMind, says that 73 of the 125 languages spoken in India do not have any digital data. The purpose of the Vani project, he said, is to collect speech data from across the country for languages that lack adequate linguistic data.
In the first phase, the project collected 14,000 hours of speech data in 58 languages from 80,000 people in 80 districts.
The Vani project is now expanding its speech data collection to about 160 districts across all states of India.
According to Google, this massive data collection effort is key to developing an AI that truly reflects India’s linguistic diversity.
The Vani project aims to ensure that every language spoken by the people of India has a place in the digital age. It also helps preserve the country’s rich linguistic heritage.