Although many languages are spoken in India, people often find it difficult to understand languages other than their own, whether English or regional languages such as Gujarati, Kannada, Tamil, Telugu, Punjabi, and Malayalam. Speech recognition and speech synthesis are prominent emerging technologies in the natural language processing and communication domains. This paper aims to leverage open-source implementations of these technologies, namely machine translation, text-to-speech (TTS), and speech-to-text (STT) systems, to convert available online resources into Indian languages. The application takes an English-language video as input and separates the audio from the video. It then divides the audio file into several smaller chunks based on timestamps. Each chunk is individually transcribed into text using the WhisperAI speech recognition model, and the resulting text is translated using Facebook's M2M100 model. After translation, a TTS system is required to convert the text into the desired audio output; however, few open-source TTS systems are available for Indian regional languages. The application benefits visually impaired people, as well as individuals who cannot read text, by letting them acquire knowledge in their native language. In the future, the application aims to enable ubiquitous communication, allowing people from different regions to communicate with each other across language barriers.

Index Terms- Translation of English to regional Indian languages, translation and transcription using WhisperAI and Facebook M2M100, Jupyter Notebook
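The processing steps described in the abstract (chunking by timestamps, transcription, translation, synthesis) can be illustrated with a minimal Python sketch. It is not the authors' implementation: the names `Segment`, `split_by_timestamps`, and `dub` are hypothetical, the fixed 30-second chunk length is an assumption, and the three model stages are passed in as plain callables rather than tied to any particular library API.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class Segment:
    """One processed audio chunk (hypothetical container, not from the paper)."""
    start: float       # chunk start time in seconds
    end: float         # chunk end time in seconds
    source_text: str   # English transcript of the chunk
    target_text: str   # translated text
    audio: bytes       # synthesized speech for the translated text


def split_by_timestamps(duration: float,
                        chunk_seconds: float = 30.0) -> List[Tuple[float, float]]:
    """Return (start, end) pairs covering [0, duration] in fixed-size windows.

    Fixed-length windows are an assumption; the abstract only says that the
    audio is divided "based on the timestamps".
    """
    if duration < 0 or chunk_seconds <= 0:
        raise ValueError("duration must be >= 0 and chunk_seconds must be > 0")
    bounds = []
    start = 0.0
    while start < duration:
        end = min(start + chunk_seconds, duration)
        bounds.append((start, end))
        start = end
    return bounds


def dub(duration: float,
        transcribe: Callable[[float, float], str],
        translate: Callable[[str], str],
        synthesize: Callable[[str], bytes],
        chunk_seconds: float = 30.0) -> List[Segment]:
    """Run each chunk through the STT -> translation -> TTS stages in order."""
    segments = []
    for start, end in split_by_timestamps(duration, chunk_seconds):
        source_text = transcribe(start, end)   # speech-to-text stage
        target_text = translate(source_text)   # machine translation stage
        audio = synthesize(target_text)        # text-to-speech stage
        segments.append(Segment(start, end, source_text, target_text, audio))
    return segments
```

In a real system, `transcribe` would presumably wrap the speech recognition model, `translate` the M2M100 model, and `synthesize` whichever TTS system is available for the target language. Keeping the stages as injected callables means the TTS stage, which the abstract identifies as the scarce component, can be swapped without touching the rest of the pipeline.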